|
Having been among the very
first players to adopt the Unit Selection technique,
Loquendo is now the first company to bring
expressive synthetic speech to the market.
Loquendo's voices have been enriched with a
repertoire of "expressive cues", which
enable TTS users to enliven their voice prompts.
This new development marks an important milestone
in the pursuit of expressive synthetic speech.
Developing both stylistically
and emotionally expressive synthetic speech
has been one of Loquendo's objectives for several
years; research has covered both language modelling
and techniques to make the TTS software more
manageable, without compromising on the natural
qualities achieved through Unit Selection. In
the near future, research will make it possible
to synthesize any kind of text in a specific
style (e.g. emphatic, formal, informal, etc.)
and in the emotional tone required (e.g. happy,
sad, angry, etc.).
The speech market does not currently offer solutions
that are capable of changing the style and tone
of a voice, upon command or according to the
nature of the text itself, without compromising
on acoustic quality. Loquendo, however, is now
offering its customers the chance to make their
vocal messages truly lifelike and expressive.
Just as in human conversation, expressive intention
is conveyed through conventional phrases and
exclamations, which are pronounced with a natural
and colourful intonation. As a result, the entire
message becomes far more expressive.
Example 1:
Welcome to Loquendo's TTS system. You can
write anything you like, and I'll say it.
Amazing! Just imagine it! It's
not so hard, is it? I can say: fantastic.
But it sounds much better like this: Fantastic!
Congratulations. Oh, sorry! Congratulations!
I must go now. Thanks to you all!
Loquendo's repertoire of "expressive cues"
contains conventional figures of speech, such
as greetings and exclamations ("hello!",
"oh no!", "thank you!"),
interjections ("Oh!", "Well!",
"Hmm!") and paralinguistic events
(e.g. breathing, coughing, laughter, etc.),
suggesting additional layers of expressive intent
(confirmation, doubt, gratitude, etc.).
Example 2:
Here's an example. Let's imagine we're arguing.
If I say: no, nonsense. That's unbelievable.
No, you wouldn't believe me. But if I say
it like this: No! Nonsense! That's unbelievable!
Doh! It sounds much better. What
do you think? Mmm, it's more believable. Don't
you think so? Cheerio!
In order to achieve the greatest
degree of naturalness, the same phrase can be
pronounced using different styles and intonations,
from neutral to emphatic, from sad to amazed.
Example 3:
\_Throat Perfect! I can speak
almost like a human being. Fantastic. Or would
you prefer to hear me say: Fantastic!
\_Laugh Now you can choose. Hello everybody,
or: Hello everybody! Thank you
so much, see you tomorrow, or: Thank you
so much, see you tomorrow!
Example 4: Using emphatic expressions
Hello and welcome! My name is Simon. I'm the
English Loquendo TTS voice. Now I can sound
much more natural. For example, I can say:
it's a pleasure. But I can also say it like
this: It's a pleasure!
The "expressive cues" can be
typed directly into the text with the appropriate
punctuation, or selected from an easy-to-use
drop-down menu that lists the full repertoire of
"expressive cues" available for each voice. The list
is structured according to intuitive linguistic
categories, so that appropriate formulas can
be found quickly and easily. In this way, messages,
dialogues, and narratives can be easily constructed
and read by Loquendo's synthetic voices in a
truly expressive and natural-sounding way.
|