This file describes the Multispeech interface
as a multilingual speech server for Emacspeak.
To use Multispeech with Emacspeak, create a symlink named
"multispeech" pointing to its executable in the directory where
Emacspeak's speech servers normally live.
Once started, Multispeech accepts commands on its standard input. All
these commands have the form:
command [arguments]
Each command starts on a new line, but its arguments may span
several lines. In that case the arguments must be enclosed in curly
braces. When a command with all its arguments occupies just one line,
these braces are optional. Some commands take no arguments at all.
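As an illustration of the line format described above, the following
Python sketch builds a command string and adds the curly braces only
when the argument spans several lines. The helper name format_command
is hypothetical and not part of Multispeech. The sketch also assumes
that the argument text itself contains no curly braces, since this
file does not say how they would be escaped.

```python
def format_command(name: str, argument: str = "") -> str:
    """Return one command, terminated by a newline, ready to be written to stdin."""
    if not argument:
        # Commands such as "s" or "d" take no arguments.
        return f"{name}\n"
    if "\n" in argument:
        # Arguments spread over several lines must be enclosed in curly braces.
        return f"{name} {{{argument}}}\n"
    # On a single line the braces are optional, so they are left out here.
    return f"{name} {argument}\n"


print(format_command("q", "first line\nsecond line"), end="")
```

Always wrapping the argument in braces would presumably work as well,
since the braces are only optional on a single line.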
The following commands are recognized and served by Multispeech:
q text
Place the specified text into the server queue.
a file
Place the specified file into the server queue.
t frequency duration
Place a tone signal into the server queue. The tone signal is
specified by its frequency in Hz and duration in milliseconds. If
the duration is omitted, 50 milliseconds is used by default. If no
arguments are specified at all, a beep of 440 Hz and 50
milliseconds is produced.
sh duration
Place silence of the specified duration into the server queue. The
duration is specified in milliseconds. If no duration is specified,
50 milliseconds is used by default.
d
Process the server queue. Do nothing if it is already being
processed.
tts_pause
Immediately stop all current sound activities and save the server
queue.
tts_resume
Immediately stop all current sound activities, restore the server
queue and resume processing it.
tts_say text
Immediately stop all current sound activities and say the specified
text, attempting to treat it as a character name but applying general
speech parameters.
l text
Say the specified text, treating it as a character name. Current
speech (if any) is stopped immediately. If sounds and tones are not
asynchronous, they are stopped too.
p file
Play the specified sound file.
s
Immediately stop all current sound activities.
r rate
tts_set_speech_rate rate
Set the specified speech rate. The value is treated according to
Emacspeak conventions. This command does not affect currently
playing or queued speech. The new value is applied to subsequent
items.
set_lang language say_it
Change the speaking language. Allowed values are "en" for English, "ru"
for Russian, "de" for German, "fr" for French, "es" for Spanish,
"pt" for Portuguese, "it" for Italian and "autodetect" for automatic
language detection based on the nature of the text. The second argument
is optional. If it is specified and its value is not "nil", then the
name of the newly selected language will be spoken.
set_next_lang say_it
Switch the language forward to the next available one. The argument is
optional. If it is specified and is not "nil", then the new language
name will be spoken.
set_previous_lang say_it
Switch the language backward to the previous available one. The
argument is optional. If it is specified and is not "nil", then the new
language name will be spoken.
tts_set_punctuations mode
Change the punctuation speaking mode. Allowed values are "all", "some"
and "none". Only the first character of the mode specification is
checked.
tts_set_character_scale value
Change the voice pitch used when pronouncing single letters.
tts_capitalize flag
Enable/disable "capitalize" mode. The value 1 or 0 is
expected. This command currently does nothing: the functionality is
not implemented.
tts_split_caps flag
Enable/disable "split caps" mode. The value 1 or 0 is expected. This
command currently does nothing: the functionality is not implemented.
tts_sync_state punctuation_mode capitalize_mode split_caps_mode speech_rate
Set the specified speech parameters.
c voice_spec
Set the voice for subsequent speech in the queue. Reset it to the
default when voice_spec is empty or invalid.
tts_reset
Stop all current sound activities and reset all general speech
parameters to the default state.
version
Say the Multispeech version.
exit
Exit Multispeech.
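To show how the commands above fit together, here is a minimal Python
sketch that queues a tone, a pause and some text, and then starts the
queue. The function names build_session and send are hypothetical, and
the sketch writes to an in-memory stream rather than to a real server.

```python
import io


def build_session(text: str) -> list[str]:
    """Return command lines that queue a tone, a pause and some text, then speak them."""
    return [
        "s",          # stop whatever is currently playing
        "t 880 100",  # queue an 880 Hz tone lasting 100 milliseconds
        "sh",         # queue silence; no duration is given, so the default applies
        f"q {text}",  # queue single-line text (braces are optional here)
        "d",          # start processing the queue
    ]


def send(stream, lines: list[str]) -> None:
    """Write each command on its own line to a writable text stream."""
    for line in lines:
        stream.write(line + "\n")
    stream.flush()


buffer = io.StringIO()
send(buffer, build_session("Hello from Emacspeak"))
print(buffer.getvalue(), end="")
```

With a running server, the same send function could be given the
standard input of a process started with
subprocess.Popen(["multispeech"], stdin=subprocess.PIPE, text=True),
assuming the executable is found on the PATH.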
Embedded speech parameters.
The text passed as an argument to the "q" and "tts_say" commands may
include a special sequence that alters some speech parameters
for that text chunk locally, leaving the global settings untouched. This
sequence should be placed at the very beginning of the text. The
same syntax is used in the "c" command to specify the voice for the
subsequent speech chunks in the queue.
When native voices are enabled in the Multispeech configuration,
embedded sequences of the form "[_: key:value ...]" are recognized.
Each "key:value" pair gives a local value for the corresponding
parameter. The recognized keys are as follows:
vo -- speech volume;
pi -- voice pitch;
ra -- speech rate;
fr -- sampling frequency;
pu -- punctuations verbosity mode.
Volume, pitch and rate are specified relative to the corresponding
parameters obtained from the Multispeech configuration, but rate is
scaled according to the Emacspeak conventions. Sampling frequency is
specified on the assumption that it is under TTS engine control and
that the normal value is 16000 Hz; Multispeech recalculates it and
applies it so as to achieve an equivalent deviation in all
circumstances. Thus, the embedded speech parameters
"[_: vo:1.0 pi:1.0 ra:200 fr:16000]" result in normal speech, just as
it was configured.
For the punctuation verbosity mode, three values are recognized: all,
some and none.
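The native-voice sequence described above can be assembled
mechanically. The following Python sketch is one possible way to do
it; the function name embedded_params and the constant
PUNCTUATION_MODES are hypothetical, and only the keys and the bracket
syntax are taken from this file.

```python
PUNCTUATION_MODES = ("all", "some", "none")


def embedded_params(vo=None, pi=None, ra=None, fr=None, pu=None) -> str:
    """Build a native-voice embedded parameter sequence, or "" if nothing is set."""
    if pu is not None and pu not in PUNCTUATION_MODES:
        raise ValueError(f"unknown punctuation mode: {pu!r}")
    # Keep the keys in the order in which this file lists them.
    pairs = [("vo", vo), ("pi", pi), ("ra", ra), ("fr", fr), ("pu", pu)]
    body = " ".join(f"{key}:{value}" for key, value in pairs if value is not None)
    return f"[_: {body}]" if body else ""


# Prints: [_: vo:1.0 pi:1.0 ra:200 fr:16000]
print(embedded_params(vo=1.0, pi=1.0, ra=200, fr=16000))
```

The resulting string would go at the very beginning of the text given
to "q" or "tts_say", or be passed to "c" as the voice specification.
Whether any separator is required between the sequence and the text
that follows is not stated here, so that part is left to the caller.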
When DECtalk voices are enabled, Multispeech recognizes embedded
parameters in the DECtalk notation and tries to emulate a subset of the
DECtalk voice control capabilities. See the Fonix Corporation web site
at http://www.fonix.com for detailed information about the DECtalk
speech synthesizer and its API. Multispeech supports the following
DECtalk commands:
:np
:nh
:nf
:nd
:nb
:nu
:nw
:nr
:nk
:nv
:dv
:ra[te]
:volu[me] set
:pu[nc[t]]
Here optional parts are enclosed in brackets as usual.
When designing a voice with the :dv command, the following options are
supported:
ap
hs
pr
save
For the punctuation mode, three values are recognized: all, some and none.
Of course, this is a fairly small and simple subset, but it provides
several distinguishable voices and some basic means of control.
Variations in the DECtalk voice parameters are reflected by
corresponding variations in the Multispeech voice parameters. Thus,
pitch range affects the sampling frequency deviation, while average
pitch and head size influence voice pitch. If both average pitch and
head size are specified, average pitch takes precedence.