This file describes Multispeech interface
as a multilingual speech server for Emacspeak.

To use Multispeech with Emacspeak you should make symlink named
"multispeech" pointing to it's executable in the directory where
Emacspeak's speech servers normally live.

Being started Multispeech accepts commands on it's standard input. All
these commands are in the form:

command [arguments]

Each command starts from a new line, but arguments may spread on
several lines. In that cases arguments must be enclosed in curly
braces. When a command with all it's arguments occupies just one line
these braces are optional. Some commands have no arguments at all.


The following commands are recognized and served by Multispeech:

q text
  Place specified text into the server queue.

a file
  Place specified file into the server queue.

t frequency duration
  Place tone signal into the server queue. The tone signal is
  specified by it's frequency in Hz and duration in milliseconds. If
  duration is omitted then 50 milliseconds is used by default. If no
  arguments are specified at all then beep of 440 Hz and 50
  milliseconds will be produced.

sh duration
  Place silence of specified duration into the server queue. Duration
  is specified in milliseconds. If no duration is specified then 50
  milliseconds is used by default.

d
  Proceed the server queue. Do nothing if it is proceeding at the
  moment.

tts_pause
  Immediately stop all current sound activities and save the server
  queue.

tts_resume
  Immediately stop all current sound activities, restore the server
  queue and proceed it.

tts_say text
  Immediately stop all current sound activities and say specified text
  attempting to treat it as a character name, but applying general
  speech parameters.

l text
  Say specified text treating it as a character name. Current speech
  (if any) will be immediately stopped. If sounds and tones are not
  asynchronous they will be stopped too.

p file
  Play specified sound file.

s
  Immediately stop all current sound activities.

r rate
tts_set_speech_rate rate
  Set specified speech rate. The value is treated according to
  Emacspeak conventions. This command does not affect currently
  playing or queued speech. New value is applied to the subsequent
  items.

set_lang language say_it
  Change speaking language. Allowed values are "en" for English, "ru"
  for Russian, "de" for German, "fr" for French, "es" for Spanish,
  "pt" for Portuguese, "it" for Italian and "autodetect" for automatic
  language detection by the text nature. The second argument is
  optional. If it is specified and it's value is not "nil" then newly
  selected language name will be spoken.

set_next_lang say_it
  Switch language forward to the next available one. Argument is
  optional. If it is specified and is not "nil" then new language name
  will be spoken.

set_previous_lang say_it
  Switch language backward to the next available one. Argument is
  optional. If it is specified and is not "nil" then new language name
  will be spoken.

tts_set_punctuations mode
  Change punctuations speaking mode. Allowed values are "all", "some"
  and "none". Only the first character is checked in the mode
  specification.

tts_set_character_scale value
  Change voice pitch for single letters pronunciation.

tts_capitalize flag
  Enable/disable "capitalize" mode. The value 1 or 0 is
  expected. Actually do nothing. This functionality is not implemented
  in fact.

tts_split_caps flag
  Enable/disable "split caps" mode. Value 1 or 0 is expected. Actually
  do nothing. This functionality is not implemented in fact.

tts_sync_state punctuation_mode capitalize_mode split_caps_mode speech_rate
  Set specified speech parameters.

c voice_spec
  Set voice for subsequent speech in queue. Reset it to default
  when voice_spec is empty or invalid.

tts_reset
  Stop all current sound activities and reset all general speech
  parameters to the default state.

version
  Say the Multispeech version.

exit
  Exit Multispeech.


Embedded speech parameters.

The text passed as an argument for the "q" and "tts_say" commands may
include a special sequence That allows to alter some speech parameters
for that text chunk locally leaving global settings untouched. This
sequence should be placed at the very beginning of the text. Just the
same syntax is used in command "c" to specify voice for the subsequent
speech chunks in queue.

When native voices are enabled in the Multispeech configuration, the
embedded sequences of the form "[_: key:value ...]" are recognized.

Each "key:value" pair represents local
value for the corresponding parameter. The recognized keys are as
follows:

vo -- speech volume;
pi -- voice pitch;
ra -- speech rate;
fr -- sampling frequency;
pu -- punctuations verbosity mode.

Volume, pitch and rate are specified relatively to the corresponding
parameters obtained from the Multispeech configuration, but rate is
scaled as it is assumed according to the Emacspeak
conventions. Sampling frequency is specified in assumption that it is
under TTS engine control and the normal value is 16000 Hz, but
Multispeech tries to recalculate and apply it properly to achieve
equivalent deviation in all circumstances. Thus, the embedded speech
parameters "[_: vo:1.0 pi:1.0 ra:200 fr:16000]" actually result in
normal speech just as it was configured.

For punctuations verbosity mode three values are recognized: all, some
and none.

When DECtalk voices are enabled, Multispeech recognizes embedded
parameters in the DECtalk notation and tries to emulate some subset of
DECtalk voice control capabilities. see Fonix Corporation web site at
http://www.fonix.com for detailed information about DECtalk speech
synthesizer and it's API. Multispeech supports following DECtalk
commands:

:np
:nh
:nf
:nd
:nb
:nu
:nw
:nr
:nk
:nv
:dv
:ra[te]
:volu[me] set
:pu[nc[t]]

Here optional parts are enclosed in brackets as usual.

When designing voice with :dv command, following options are
supported:

ap
hs
pr
save

For punctuations mode three values are recognized: all, some and none.

Of course, it is pretty small and simple subset, but it gives us
several distinguishable voices and some basic control means. In fact,
the DECtalk voice parameters variations are reflected by the
corresponding variations of the Multispeech voice parameters. Thus,
pitch range affects sampling frequency deviation, while average pitch
and head size influence voice pitch. If both average pitch and head
size are specified, then average pitch takes precedence.