<?xml version='1.0' encoding='utf-8' ?>
<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!ENTITY % BOOK_ENTITIES SYSTEM "Users_Guide.ent">
%BOOK_ENTITIES;
]>
<section id="sect-Users_Guide-Entities_and_translation">
<title>Entities and translation</title>
<warning>
<title>Use entities with extreme caution</title>
<para>
Entities offer convenience but they should be used with extreme caution in documents that will be translated. Writing (for example) <sgmltag>&amp;FDS;</sgmltag> instead of <application>Fedora Directory Server</application> saves the writer time, but the text that an entity stands for does not appear in the <firstterm>portable object</firstterm> (PO) files that translators use. Complete translations of documents containing entities are, as a consequence, impossible.
</para>
</warning>
<para>
Entities present special obstacles to translators and can preclude the production of high-quality translations. The very nature of an entity is that the word or phrase it represents is rendered exactly the same way every time it occurs in the document, in every language. This inflexibility means that the word or phrase might be illegible or incomprehensible in the target language, and that it cannot change when the grammatical rules of the target language require it to change. Furthermore, because entities are not transformed when XML is converted to PO, translators cannot select the correct words to surround the entity, as required by the grammatical rules of the target language.
</para>
<para>
If you define an entity, for example <sgmltag>&lt;!ENTITY LIFT "Liberty Installation and Formatting Tome"&gt;</sgmltag>, you can enter <literal>&amp;LIFT;</literal> in your XML and it will appear as <literal>Liberty Installation and Formatting Tome</literal> every time the book is built as HTML, PDF or text.
</para>
<para>
Entities are not transformed when XML is converted to PO, however. Consequently, translators never see <literal>Liberty Installation and Formatting Tome</literal>. Instead they see <literal>&amp;LIFT;</literal>, which they cannot translate.
</para>
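<para>
As a minimal sketch of the mechanism just described, the following shows the declaration as it might sit in the book's <filename><replaceable>Doc_Name</replaceable>.ent</filename> file, followed by a paragraph that uses it. The location of the declaration and the paragraph text are assumptions made for illustration.
</para>
<programlisting>
&lt;!-- In Doc_Name.ent (assumed location of the declaration) --&gt;
&lt;!ENTITY LIFT "Liberty Installation and Formatting Tome"&gt;

&lt;!-- In a chapter file: built output shows the full title, but the PO file keeps &amp;LIFT; --&gt;
&lt;para&gt;
As noted in the &lt;citetitle&gt;&amp;LIFT;&lt;/citetitle&gt;, Chapter 3…
&lt;/para&gt;
</programlisting>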
<para>
Consider how even a simple English sentence fragment such as the following is translated into a related language, German.
</para>
<blockquote>
<para>
As noted in the <citetitle>Liberty Installation and Formatting Tome</citetitle>, Chapter 3…
</para>
</blockquote>
<para>
A translation of this might be as follows:
</para>
<blockquote>
<para>
<foreignphrase>Wie in dem <citetitle>Wälzer für die Installation und Formatierung von Liberty</citetitle>, Kapitel 3, erwähnt…</foreignphrase>
</para>
</blockquote>
<para>
Because there is no text missing, the title can be translated into elegant German. Also, note the use of <foreignphrase>‘dem’</foreignphrase>, the correct form of the definite article ('the') when referring to a <foreignphrase>Wälzer</foreignphrase> ('tome') in this grammatical context.
</para>
<para>
By contrast, if entities are used, the entry in the PO file says:
</para>
<programlisting>
#. Tag: para
#, no-c-format
msgid "As noted in the &lt;citetitle&gt;&amp;LIFT;&lt;/citetitle&gt;, Chapter 3…"
msgstr ""
</programlisting>
<para>
The translation of this would probably run thus:
</para>
<programlisting>
#. Tag: para
#, no-c-format
msgid "As noted in the &lt;citetitle&gt;&amp;LIFT;&lt;/citetitle&gt;, Chapter 3…"
msgstr "Wie in &lt;citetitle&gt;&amp;LIFT;&lt;/citetitle&gt;, Kapitel 3, erwähnt…"
</programlisting>
<para>
And the presentation would be thus:
</para>
<blockquote>
<para>
<foreignphrase>Wie in <citetitle>Liberty Installation and Formatting Tome</citetitle>, Kapitel 3, erwähnt…</foreignphrase>
</para>
</blockquote>
<para>
This, of course, leaves the title in English, including words like 'Tome' and 'Formatting' that readers are unlikely to understand. Also, the translator is forced to omit the definite article ‘dem’, which produces a more general construction that comes close to the hybrid of English and German that German speakers call <foreignphrase>Denglisch</foreignphrase> or <foreignphrase>Angleutsch</foreignphrase>. Many German speakers consider this approach incorrect and almost all consider it inelegant.
</para>
<para>
Equivalent problems emerge with a fragment such as this:
</para>
<blockquote>
<para>
However, a careful reading of the <citetitle>Liberty Installation and Formatting Tome</citetitle> afterword shows that…
</para>
</blockquote>
<para>
With no text hidden behind an entity, a German translation of this might be:
</para>
<blockquote>
<para>
<foreignphrase>Jedoch ergibt ein sorgfältiges Lesen des Nachworts für den <citetitle>Wälzer für die Installation und Formatierung von Liberty</citetitle>, dass…</foreignphrase>
</para>
</blockquote>
<para>
If an entity was used to save the writer time, the translator has to deal with this:
</para>
<programlisting>
#. Tag: para
#, no-c-format
msgid "However, a careful reading of the &lt;citetitle&gt;&amp;LIFT;&lt;/citetitle&gt; afterword shows that…"
msgstr ""
</programlisting>
<para>
And the translation would be subtly but importantly different:
</para>
<programlisting>
#. Tag: para
#, no-c-format
msgid "However, a careful reading of the &lt;citetitle&gt;&amp;LIFT;&lt;/citetitle&gt; afterword shows that…"
msgstr "Jedoch ergibt ein sorgfältiges Lesen des Nachworts für &lt;citetitle&gt;&amp;LIFT;&lt;/citetitle&gt;, dass…"
</programlisting>
<para>
When presented to a reader, this would appear as follows:
</para>
<blockquote>
<para>
<foreignphrase>Jedoch ergibt ein sorgfältiges Lesen des Nachworts für <citetitle>Liberty Installation and Formatting Tome</citetitle>, dass…</foreignphrase>
</para>
</blockquote>
<para>
Again, note the missing definite article (<foreignphrase>den</foreignphrase> in this grammatical context). This is inelegant but necessary since the translator can otherwise only guess which form of the definite article (<foreignphrase>den</foreignphrase>, <foreignphrase>die</foreignphrase> or <foreignphrase>das</foreignphrase>) to use, which would inevitably lead to error.
</para>
<para>
Finally, consider that although a particular word never changes its form in English, this is not necessarily true of other languages, even when the word is a <firstterm>proper noun</firstterm> such as the name of a product. In many languages, nouns change (<firstterm>inflect</firstterm>) their form according to their role in a sentence (their grammatical <firstterm>case</firstterm>). An XML entity set to represent an English noun or noun phrase therefore makes correct translation impossible in such languages.
</para>
<para>
For example, if you write a document that could apply to more than one product, you might be tempted to set an entity such as <sgmltag>&amp;PRODUCT;</sgmltag>. The advantage of this approach is that by simply changing this value in the <filename><replaceable>Doc_Name</replaceable>.ent</filename> file, you could easily adjust the book to document (for example) Red Hat Enterprise Linux, Fedora, or CentOS. However, while the proper noun <literal>Fedora</literal> never varies in English, it has multiple forms in other languages. For example, in Czech the name <literal>Fedora</literal> has six different forms, depending on which of seven grammatical cases it takes in a sentence:
</para>
<table frame="all" id="tabl-Users_Guide-Entities_and_translation-Fedora_in_Czech">
<title>'Fedora' in Czech</title>
<tgroup align="left" cols="3" colsep="1" rowsep="1">
<colspec colname="c1"></colspec>
<colspec colname="c2"></colspec>
<colspec colname="c3"></colspec>
<thead>
<row>
<entry>
Case
</entry>
<entry>
Usage
</entry>
<entry>
Form
</entry>
</row>
</thead>
<tbody>
<row>
<entry>
Nominative
</entry>
<entry>
the subject of a sentence
</entry>
<entry>
<foreignphrase>Fedora</foreignphrase>
</entry>
</row>
<row>
<entry>
Genitive
</entry>
<entry>
indicates possession
</entry>
<entry>
<foreignphrase>Fedory</foreignphrase>
</entry>
</row>
<row>
<entry>
Accusative
</entry>
<entry>
the direct object of a sentence
</entry>
<entry>
<foreignphrase>Fedoru</foreignphrase>
</entry>
</row>
<row>
<entry>
Dative
</entry>
<entry>
the indirect object of a sentence
</entry>
<entry>
<foreignphrase>Fedoře</foreignphrase>
</entry>
</row>
<row>
<entry>
Vocative
</entry>
<entry>
the subject of direct address
</entry>
<entry>
<foreignphrase>Fedoro</foreignphrase>
</entry>
</row>
<row>
<entry>
Locative
</entry>
<entry>
relates to a location
</entry>
<entry>
<foreignphrase>Fedoře</foreignphrase>
</entry>
</row>
<row>
<entry>
Instrumental
</entry>
<entry>
relates to a method
</entry>
<entry>
<foreignphrase>Fedorou</foreignphrase>
</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
For example:
</para>
<itemizedlist>
<listitem>
<para>
<foreignphrase>Fedora je linuxová distribuce.</foreignphrase> — Fedora is a Linux distribution.
</para>
</listitem>
<listitem>
<para>
<foreignphrase>Instalace Fedory</foreignphrase> — Installation of Fedora
</para>
</listitem>
<listitem>
<para>
<foreignphrase>Stáhnout Fedoru</foreignphrase> — Get Fedora
</para>
</listitem>
<listitem>
<para>
<foreignphrase>Přispějte Fedoře</foreignphrase> — Contribute to Fedora
</para>
</listitem>
<listitem>
<para>
<foreignphrase>Ahoj, Fedoro!</foreignphrase> — Hello Fedora!
</para>
</listitem>
<listitem>
<para>
<foreignphrase>Ve Fedoře 10…</foreignphrase> — In Fedora 10…
</para>
</listitem>
<listitem>
<para>
<foreignphrase>S Fedorou získáváte nejnovější…</foreignphrase> — With Fedora, you get the latest…
</para>
</listitem>
</itemizedlist>
<para>
A sentence that begins <foreignphrase>S Fedora získáváte nejnovější…</foreignphrase> remains comprehensible to Czech readers, but the result is not grammatically correct. The same effect can be simulated in English, because although English nouns lost their case endings during the Middle Ages, English pronouns are still inflected. The sentence 'Me see she' is completely comprehensible to English speakers, but is not what they expect to read, because the forms of the pronouns <literal>me</literal> and <literal>she</literal> are not correct. <literal>Me</literal> is the accusative form of the pronoun, but because it is the subject of the sentence, the pronoun should take the nominative form, <literal>I</literal>. Similarly, <literal>she</literal> is in the nominative case, but as the direct object of the sentence the pronoun should take its accusative form, <literal>her</literal>.
</para>
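<para>
To see how this situation reaches a translator, suppose the English source used a <sgmltag>&amp;PRODUCT;</sgmltag> entity in the last example above. The PO entry might then look like the following sketch; the <literal>msgid</literal> and <literal>msgstr</literal> strings are hypothetical, modeled on the examples in this section.
</para>
<programlisting>
#. Tag: para
#, no-c-format
msgid "With &amp;PRODUCT;, you get the latest…"
msgstr "S &amp;PRODUCT; získáváte nejnovější…"
</programlisting>
<para>
Because the entity expands to the nominative form <foreignphrase>Fedora</foreignphrase>, the built Czech text reads <foreignphrase>S Fedora získáváte nejnovější…</foreignphrase>, and the translator has no way to supply the instrumental ending.
</para>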
<para>
Nouns in many Slavic languages, such as Ukrainian, Czech, Polish, Serbian, and Croatian, have seven different cases; Russian has six. Nouns in Finno-Ugric languages such as Finnish, Hungarian, and Estonian have fourteen or more cases. Other languages alter nouns for other reasons. For example, Scandinavian languages inflect nouns to indicate <firstterm>definiteness</firstterm> (whether the noun refers to '<emphasis>a</emphasis> thing' or '<emphasis>the</emphasis> thing'), and some dialects of those languages inflect nouns both for definiteness <emphasis>and</emphasis> for grammatical case.
</para>
<para>
Now multiply such problems by the more than 40 languages that <application>Publican</application> currently supports. Apart from the few non-translated strings that <application>Publican</application> specifies by default in the <filename><replaceable>Doc_Name</replaceable>.ent</filename> file, entities might prove useful for version numbers of products. Beyond that, the use of entities is tantamount to a conscious effort to inhibit and reduce the quality of translations. Furthermore, readers of your document in a language that inflects nouns (whether for case, definiteness, or other reasons) will not know that the bad grammar is the result of XML entities that you set; they will probably assume that the translator is incompetent.
</para>
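<para>
As a hedged illustration of the one use suggested above, a version number can be held in an entity while the product name is written out in full, so that translators can still inflect the name. The entity name <literal>PRODVER</literal> and both strings below are hypothetical, chosen only for this sketch.
</para>
<programlisting>
&lt;!-- In Doc_Name.ent; PRODVER is a hypothetical entity name --&gt;
&lt;!ENTITY PRODVER "10"&gt;
</programlisting>
<para>
A matching PO entry might then let the Czech translator use the locative form of the name while leaving the version entity untouched:
</para>
<programlisting>
#. Tag: para
#, no-c-format
msgid "In Fedora &amp;PRODVER;…"
msgstr "Ve Fedoře &amp;PRODVER;…"
</programlisting>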
</section>