EEEEEE    EEEEEEE    1
E         E          1
E         E          1
E         E          1
EEEE      EEEE       1
E         E          1
E         E          1
E         E          1
EEEEEEE   EEEEEEE    1     release 07 April 2002
--------------------------------------------------------------
Table of Contents
--------------------------------------------------------------
1.0 A brief description of the EE1 database
2.0 Distribution
3.0 Installation and Tests
4.0 Author
------------------------------------------------------------
1.0 A brief description of EE1
------------------------------------------------------------
The EE1 release is a Standard Estonian diphone database provided in the
context of the MBROLA project: http://tcts.fpms.ac.be/synthesis
It provides an Estonian male voice to be used with the MBROLA program.
Input files use the Estonian orthographic notation. Estonian
orthography is not phonetic: some essential phonological oppositions,
as well as facts that are phonologically irrelevant but important for
orthoepy and for the naturalness of speech, are not shown in the
written form of Estonian.
The sound system of Standard Estonian comprises 9 vowel and 17
consonant phonemes, all of which can occur as short (single) or long
(double vowels and geminate consonants). In Estonian there are two
segmental lengths (short/long) and three phonologically distinctive
foot patterns - quantity degrees: Q1, Q2 and Q3.
Below is a list of the Standard Estonian phonemic units in the SAMPA
transcription with examples in the Estonian orthography.
SAMPA   SAMPA          Orthography
symbol  transcription
i       kilu           kilu (Q1) 'sprat, nom.sg.'
ii      kiilu          kiilu (Q2) 'wedge, gen.sg.'
        kii:lu         kiilu (Q3) 'wedge, part.sg.'
e       ketA           keda (Q1) 'whom'
ee      keetA          keeda (Q2) 'boil, 2.sg.imperat.'
        kee:tA         keeda (Q3) 'boil, da-inf.'
{       k{ru           käru (Q1) 'barrow, nom.sg.'
{{      k{{ru          kääru (Q2) 'crook, gen.sg.'
        k{{:ru         kääru (Q3) 'crook, part.sg.'
y       myrin          mürin (Q1) 'rumble, nom.sg.'
yy      myyri          müüri (Q2) 'wall, gen.sg.'
        myy:ri         müüri (Q3) 'wall, part.sg.'
2       l2mA           löma (Q1) 'squash, nom.sg.'
22      l22mA          lööma (Q2) 'beating, gen.sg.'
        l22:mA         lööma (Q3) 'to beat, ma-inf.'
u       kuri           kuri (Q1) 'evil, nom.sg.'
uu      kuuri          kuuri (Q2) 'shed, gen.sg.'
        kuu:ri         kuuri (Q3) 'shed, part.sg.'
o       pori           pori (Q1) 'mud, nom.sg.'
oo      poori          poori (Q2) 'pore, gen.sg.'
        poo:ri         poori (Q3) 'pore, part.sg.'
7       k7mA           kõma (Q1) 'boom, nom.sg.'
77      k77mA          kõõma (Q2) 'dandruff, gen.sg.'
        k77:mA         kõõma (Q3) 'dandruff, part.sg.'
A       sAtA           sada (Q1) 'hundred, nom.sg.'
AA      sAAtA          saada (Q2) 'send, 2.sg.imperat.'
        sAA:tA         saada (Q3) 'get, da-inf.'
The vowel /7/ represents non-low back unrounded vowels.
p       tApA           taba (Q1) 'padlock, nom.sg.'
pp      tAppA          tapa (Q2) 'kill, 2.sg.imperat.'
        tAp:pA         tappa (Q3) 'kill, da-inf.'
t       pAtu           padu (Q1) 'a low wet place, nom.sg.'
tt      pAttu          patu (Q2) 'sin, gen.sg.'
        pAt:tu         pattu (Q3) 'sin, part.sg.'
t'      pAt'i          padi (Q1) 'pillow, nom.sg.'
t't     pAt'ti         pati (Q2) 'stalemate, gen.sg.'
        pAt':ti        patti (Q3) 'stalemate, part.sg.'
k       kAku           kagu (Q1) 'southeast, nom.sg.'
kk      kAkku          kaku (Q2) 'loaf, owl, gen.sg.'
        kAk:ku         kakku (Q3) 'loaf, owl, part.sg.'
f       foori          foori (Q2) 'traffic lights, gen.sg.'
ff      tuffi          tufi (Q2) 'tufa, gen.sg.'
        tuf:fi         tuffi (Q3) 'tufa, part.sg.'
v       kAvA           kava (Q1) 'plan, nom.sg.'
vv      sAvvA          Savva (Q2) 'Savva, name, nom.sg.'
        kAv:vA         kavva (Q3) 'plan, illat.sg.'
s       m{su           mäsu (Q1) 'tumult, nom.sg.'
ss      m{ssu          mässu (Q2) 'revolt, gen.sg.'
        m{s:su         mässu (Q3) 'revolt, part.sg.'
s'      kAs'i          kasi (Q1) 'clean, 2.sg.imperat.'
s's     kAs'si         kassi (Q2) 'cat, gen.sg.'
        kAs':si        kassi (Q3) 'cat, part.sg.'
S       Seffi          šefi (Q2) 'chief, gen.sg.'
        looSi          looži (Q2) 'loge, gen.sg.'
SS      tuSSi          tuši (Q2) 'Indian ink, gen.sg.'
        tuS:Si         tušši (Q3) 'Indian ink, part.sg.'
h       sAhin          sahin (Q1) 'rustle, nom.sg.'
hh      SAhhi          šahhi (Q2) 'shah, gen.sg.'
        SAh:hi         šahhi (Q3) 'shah, part.sg.'
m       sAmu           samu (Q1) 'same, part.pl.'
mm      sAmmu          sammu (Q2) 'step, gen.sg.'
        sAm:mu         sammu (Q3) 'step, part.sg.'
n       kAnu           kanu (Q1) 'hen, part.pl.'
nn      kAnnu          kannu (Q2) 'jug, gen.sg.'
        kAn:nu         kannu (Q3) 'jug, part.sg.'
n'      pAn'i          pani (Q1) 'put, 3.sg.past'
n'n     pAn'ni         panni (Q2) 'pan, gen.sg.'
        pAn':ni        panni (Q3) 'pan, part.sg.'
l       kAlAs          kalas (Q1) 'fish, iness.sg.'
ll      kAllAs         kallas (Q2) 'shore, nom.sg.'
        kAl:lAs        kallas (Q3) 'pour, 3.sg.past'
l'      pAl'i          pali (Q1) 'tub, nom.sg.'
l'l     pAl'li         palli (Q2) 'ball, gen.sg.'
        pAl':li        palli (Q3) 'ball, part.sg.'
r       nAri           nari (Q1) 'plank bed, nom.sg.'
rr      nArri          narri (Q2) 'fool, gen.sg.'
        nAr:ri         narri (Q3) 'fool, part.sg.'
j       mAjA           maja (Q1) 'house, nom.sg.'
jj      mAjjA          majja (Q3) 'house, illat.sg.'
Consonants with /'/ are palatalized.
Q1 and Q2 need no special marking. Natural-sounding Q2 is generated
from Q1 by doubling the duration of the respective short phonemes in
Q1 and changing the duration of the stressed and unstressed syllables
to the needed duration ratio, and by determining the location of the
F0 peak in the stressed syllable.
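The duration-doubling step can be sketched in Python as follows. This
is only an illustration: the millisecond values are invented
placeholders, the helper names lengthen and to_pho are hypothetical,
and it is an assumption that a long phoneme is written in the .pho
file as the short symbol with a longer duration. The adjustment of the
syllable duration ratio and of the F0 peak is not covered.

```python
def lengthen(segments, index, factor=2.0):
    """Return a copy of segments with the duration at index scaled by factor."""
    out = list(segments)
    symbol, duration_ms = out[index]
    out[index] = (symbol, round(duration_ms * factor))
    return out


def to_pho(segments):
    """Format (symbol, duration_ms) pairs as 'symbol duration' lines."""
    return "\n".join(f"{symbol} {duration_ms}" for symbol, duration_ms in segments)


# Q1 'kilu' with placeholder durations (not measured values).
kilu_q1 = [("k", 80), ("i", 90), ("l", 70), ("u", 100)]

# Q2 'kiilu': double the duration of the short vowel /i/.
kiilu_q2 = lengthen(kilu_q1, 1)
print(to_pho(kiilu_q2))
```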
We mark Q3 by a colon placed after the peak of the Q3 foot. All Q3
feet are either vowel-peaked or consonant-peaked. The colon does not
denote a Q3 phoneme but indicates that the whole foot is in Q3. At
the same time the colon signals the duration increment of the
preceding syllable-final phoneme.
In order to distinguish the qualitatively strongly reduced vowels of
an unstressed syllable of a Q3 foot from the corresponding vowels of
Q1 and Q2 feet, we mark the vowels of an unstressed syllable of a
Q3 foot with the digit 3 (i3, e3, etc.) in the diphone database.
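The two marking conventions can be illustrated with a small Python
sketch. It assumes that the input is the transcription of a single
foot and that every vowel after the colon belongs to an unstressed
syllable, which holds for the examples in the table above but is not
guaranteed in general. The function names tokenize and mark_q3_foot
are hypothetical and are not part of MBROLA or of the database tools.

```python
# Short vowel symbols from the SAMPA table of this document.
VOWELS = set("ie{y2uo7A")


def tokenize(transcription):
    """Split a transcription into short segments; "'" joins the previous consonant."""
    tokens = []
    for ch in transcription:
        if ch == "'" and tokens:
            tokens[-1] += "'"
        else:
            tokens.append(ch)
    return tokens


def mark_q3_foot(transcription):
    """Append '3' to the vowels that follow the Q3 colon; leave Q1/Q2 feet unchanged."""
    tokens = tokenize(transcription)
    if ":" not in tokens:
        return tokens
    peak = tokens.index(":")
    return [
        tok + "3" if i > peak and tok in VOWELS else tok
        for i, tok in enumerate(tokens)
    ]


print(mark_q3_foot("kii:lu"))
print(mark_q3_foot("pAt':ti"))
print(mark_q3_foot("kilu"))
```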
For a detailed description of the Estonian diphone database, see:
M. Mihkla, A. Eek, E. Meister, Creation of the Estonian diphone database
for text-to-speech synthesis. - Proceedings of the Finnic Phonetics
Symposium, August 11-14, 1998, Pärnu, Estonia. Linguistica Uralica
XXXIV, 1998, 3: 334-340.
Limitations:
-----------
The EE1 diphone database is not yet complete. The full
diphone database will be available after the test period currently
under way at the authors' laboratories.
Full text-to-speech synthesis requires additional programs
(morphological analysis of the input text, determination of Q3 feet,
addition of stress marks and syllable boundaries, etc.) developed by
researchers from the Institute of Estonian Language (Tallinn) and from
Filosoft Ltd. (Tartu).
--------------------------------------------------------------
2.0 Distribution
--------------------------------------------------------------
This distribution of mbrola contains the following files :
ee1 : the database itself
ee1.txt : This file
license.txt: must be read before using the database
and example .PHO files:
tere.pho
example.pho
Additional languages and voices, as well as other example command
files, are or will be available in the context of the MBROLA
project. Please consult the MBROLA project homepage :
http://tcts.fpms.ac.be/synthesis
Registered users will automatically be notified of the availability of
new databases. To freely register, simply send an email to
mbrola-interest-request@tcts.fpms.ac.be with the word 'subscribe' in
the message title.
--------------------------------------------------------------
3.0 Installation and Tests
--------------------------------------------------------------
If you have not copied the MBROLA software yet, please consult the
MBROLA project homepage and get it.
Copy ee1.zip into the mbrola directory and unzip it :
unzip ee1.zip (or pkunzip on PC/DOS)
(don't forget to delete the .zip file when this is done)
On PC-Windows register ee1 with the Wizard in the control panel.
On unix platforms, try:
mbrola ee1 TEST/example.pho example.wav
to create a sound file. In this example the audio file follows the
RIFF Wav format, but other formats are selected by the extension of
the output file: example.au, example.aif, or example.raw. Listen
to it with your favorite sound editor, and try the other command files
(*.pho) to get a better idea of the quality of speech you can
synthesize with the MBROLA technique.
On Unix systems you can pipe the audio output to a sound player, for
example on an HP: mbrola ee1 TEST/example.pho - | splayer -srate 16000 -l16
Also refer to the readme.txt file provided with the mbrola software
for usage instructions.
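For scripted use, the command line shown above can be wrapped in a
short Python helper. This is a sketch under stated assumptions: the
mbrola executable is on the PATH, the ee1 database is in the current
directory, and only the four output extensions mentioned above are
accepted (piping to "-" is not handled). The function names
build_command and synthesize are hypothetical.

```python
import subprocess
from pathlib import Path

# Output extensions named in this document.
SUPPORTED_EXTENSIONS = {".wav", ".au", ".aif", ".raw"}


def build_command(pho_file, out_file, voice="ee1"):
    """Build the argument list 'mbrola <voice> <pho_file> <out_file>'."""
    ext = Path(out_file).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported output extension: {ext!r}")
    return ["mbrola", voice, str(pho_file), str(out_file)]


def synthesize(pho_file, out_file, voice="ee1"):
    """Run mbrola and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_command(pho_file, out_file, voice), check=True)


print(build_command("TEST/example.pho", "example.wav"))
```

Calling synthesize("TEST/example.pho", "example.wav") would then run
the same command as the Unix example above.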
--------------------------------------------------------------
4.0 Author
--------------------------------------------------------------
This database was recorded by:
Einar Meister, M.Sc. and Arvo Eek, Dr.Phil. from the
Laboratory of Phonetics and Speech Technology
Institute of Cybernetics
Akadeemia tee 21
Tallinn EE0026
ESTONIA
phone: +372 6204200 fax: +372 6397039 email: einar@ioc.ee
and
Meelis Mihkla
Institute of Estonian Language
Roosikrantsi 6
Tallinn 10119
ESTONIA
phone: +372 2 443564 fax: +372 6411443 email: meelis@eki.ee
For general information, questions on the installation of software and
databases, contact mbrola@tcts.fpms.ac.be