<?xml version='1.0' encoding='utf-8' ?>
<!DOCTYPE section [
<!ENTITY % BOOK_ENTITIES SYSTEM "Users_Guide.ent">
%BOOK_ENTITIES;
<!ENTITY % sgml.features "IGNORE">
<!ENTITY % xml.features "INCLUDE">
<!ENTITY % DOCBOOK_ENTS PUBLIC "-//OASIS//ENTITIES DocBook Character Entities V4.5//EN" "/usr/share/xml/docbook/schema/dtd/4.5/dbcentx.mod">
%DOCBOOK_ENTS;
]>
<section conformance="154" version="5.0" xml:id="sect-Publican-Users_Guide-Entities_and_translation" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink">
	<info xml:id="info-Publican-Users_Guide-Entities_and_translation">
		<title>Entities and translation</title>

	</info>
	 <warning>
		<info xml:id="info-Publican-Users_Guide-Use_entities_with_extreme_caution">
			<title>Use entities with extreme caution</title>

		</info>
		 <para>
			Entities offer convenience, but they should be used with extreme caution in documents that will be translated. Writing (for example) <tag>&amp;FDS;</tag> instead of <application>Fedora Directory Server</application> saves the writer time, but the text that an entity stands for does not appear in the <firstterm>portable object</firstterm> (PO) files that translators use. Complete translations of documents containing entities are, as a consequence, impossible.
		</para>

	</warning>
	 <para>
		Entities present special obstacles to translators and can preclude the production of high-quality translations. The very nature of an entity is that the word or phrase it represents is rendered exactly the same way every time that it occurs in the document, in every language. This inflexibility means that the word or word group represented by the entity might be illegible or incomprehensible in the target language, and that it cannot change form when the grammatical rules of the target language require it to. Furthermore, because entities are not transformed when XML is converted to PO, translators cannot see the text behind the entity and therefore cannot select the correct words to surround it, as required by the grammatical rules of the target language.
	</para>
	 <para>
		If you define an entity — <tag>&lt;!ENTITY LIFT "Liberty Installation and Formatting Tome"&gt;</tag> — you can enter <literal>&amp;LIFT;</literal> in your XML and it will appear as <literal>Liberty Installation and Formatting Tome</literal> every time the book is built as HTML, PDF or text.
	</para>
	 <para>
		Entities are not transformed when XML is converted to PO, however. Consequently, translators never see <literal>Liberty Installation and Formatting Tome</literal>. Instead they see <literal>&amp;LIFT;</literal>, which they cannot translate.
	</para>
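	 <para>
		As a minimal sketch of the definition and usage described above, the two pieces might look as follows. The sentence is the fragment discussed in the rest of this section, and the comments are illustrative only:
	</para>
	 
<programlisting>&lt;!-- In the .ent file: the entity declaration --&gt;
&lt;!ENTITY LIFT "Liberty Installation and Formatting Tome"&gt;

&lt;!-- In the XML source: the entity reference --&gt;
&lt;para&gt;
	As noted in the &lt;citetitle&gt;&amp;LIFT;&lt;/citetitle&gt;, Chapter 3…
&lt;/para&gt;</programlisting>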
	 <para>
		Consider something as simple as the following English sentence fragment being translated into a related language: German.
	</para>
	 <blockquote>
		<para>
			As noted in the <citetitle>Liberty Installation and Formatting Tome</citetitle>, Chapter 3…
		</para>

	</blockquote>
	 <para>
		A translation of this might be as follows:
	</para>
	 <blockquote>
		<para>
			<foreignphrase>Wie in dem <citetitle>Wälzer für die Installation und Formatierung von Liberty</citetitle>, Kapitel 3, erwähnt…</foreignphrase>
		</para>

	</blockquote>
	 <para>
		Because there is no text missing, the title can be translated into elegant German. Also, note the use of <foreignphrase>‘dem’</foreignphrase>, the correct form of the definite article ('the') when referring to a <foreignphrase>Wälzer</foreignphrase> ('tome') in this grammatical context.
	</para>
	 <para>
		By contrast, if entities are used, the entry in the PO file says:
	</para>
	 
<programlisting>#. Tag: para
#, no-c-format
msgid "As noted in the &lt;citetitle&gt;&amp;LIFT;&lt;/citetitle&gt;, Chapter 3…"
msgstr ""</programlisting>
	 <para>
		The translation of this would probably run thus:
	</para>
	 
<programlisting>#. Tag: para
#, no-c-format
msgid "As noted in the &lt;citetitle&gt;&amp;LIFT;&lt;/citetitle&gt;, Chapter 3…"
msgstr "Wie in &lt;citetitle&gt;&amp;LIFT;&lt;/citetitle&gt;, Kapitel 3, erwähnt…"</programlisting>
	 <para>
		And the presentation would be thus:
	</para>
	 <blockquote>
		<para>
			<foreignphrase>Wie in <citetitle>Liberty Installation and Formatting Tome</citetitle>, Kapitel 3, erwähnt…</foreignphrase>
		</para>

	</blockquote>
	 <para>
		This, of course, leaves the title in English, including words like 'Tome' and 'Formatting' that readers are unlikely to understand. Also, the translator is forced to omit the definite article <foreignphrase>‘dem’</foreignphrase>, which produces a more general construction that comes close to the hybrid of English and German that German speakers call <foreignphrase>Denglisch</foreignphrase> or <foreignphrase>Angleutsch</foreignphrase>. Many German speakers consider this approach incorrect and almost all consider it inelegant.
	</para>
	 <para>
		Equivalent problems emerge with a fragment such as this:
	</para>
	 <blockquote>
		<para>
			However, a careful reading of the <citetitle>Liberty Installation and Formatting Tome</citetitle> afterword shows that…
		</para>

	</blockquote>
	 <para>
		With no text hidden behind an entity, a German translation of this might be:
	</para>
	 <blockquote>
		<para>
			<foreignphrase>Jedoch ergibt ein sorgfältiges Lesen des Nachworts für den <citetitle>Wälzer für die Installation und Formatierung von Liberty</citetitle>, dass…</foreignphrase>
		</para>

	</blockquote>
	 <para>
		If an entity was used to save the writer time, the translator has to deal with this:
	</para>
	 
<programlisting>#. Tag: para
#, no-c-format
msgid "However, a careful reading of the &lt;citetitle&gt;&amp;LIFT;&lt;/citetitle&gt; afterword shows that…"
msgstr ""</programlisting>
	 <para>
		And the translation would be subtly but importantly different:
	</para>
	 
<programlisting>#. Tag: para
#, no-c-format
msgid "However, a careful reading of the &lt;citetitle&gt;&amp;LIFT;&lt;/citetitle&gt; afterword shows that…"
msgstr "Jedoch ergibt ein sorgfältiges Lesen des Nachworts für &lt;citetitle&gt;&amp;LIFT;&lt;/citetitle&gt;, dass…"</programlisting>
	 <para>
		When presented to a reader, this would appear as follows:
	</para>
	 <blockquote>
		<para>
			<foreignphrase>Jedoch ergibt ein sorgfältiges Lesen des Nachworts für <citetitle>Liberty Installation and Formatting Tome</citetitle>, dass…</foreignphrase>
		</para>

	</blockquote>
	 <para>
		Again, note the missing definite article (<foreignphrase>den</foreignphrase> in this grammatical context). This is inelegant but necessary since the translator can otherwise only guess which form of the definite article (<foreignphrase>den</foreignphrase>, <foreignphrase>die</foreignphrase> or <foreignphrase>das</foreignphrase>) to use, which would inevitably lead to error.
	</para>
	 <para>
		Finally, consider that although a particular word never changes its form in English, this is not necessarily true of other languages, even when the word is a <firstterm>proper noun</firstterm> such as the name of a product. In many languages, nouns change (<firstterm>inflect</firstterm>) their form according to their role in a sentence (their grammatical <firstterm>case</firstterm>). An XML entity set to represent an English noun or noun phrase therefore makes correct translation impossible in such languages.
	</para>
	 <para>
		For example, if you write a document that could apply to more than one product, you might be tempted to set an entity such as <tag>&amp;PRODUCT;</tag>. The advantage of this approach is that by simply changing this value in the <filename><replaceable>Doc_Name</replaceable>.ent</filename> file, you could easily adjust the book to document (for example) Red Hat Enterprise Linux, Fedora, or CentOS. However, while the proper noun <literal>Fedora</literal> never varies in English, it has multiple forms in other languages. For example, in Czech the name <literal>Fedora</literal> has six different forms, depending on one of seven ways in which you can use it in a sentence:
	</para>
	 <table conformance="155" frame="all" xml:id="tabl-Publican-Users_Guide-Entities_and_translation-Fedora_in_Czech">
		<info xml:id="info-Publican-Users_Guide-Fedora_in_Czech">
			<title>'Fedora' in Czech</title>

		</info>
		 <tgroup align="left" cols="3" colsep="1" rowsep="1">
			<colspec colname="c1"></colspec>
			 <colspec colname="c2"></colspec>
			 <colspec colname="c3"></colspec>
			 <thead>
				<row>
					<entry> Case </entry>
					 <entry> Usage </entry>
					 <entry> Form </entry>

				</row>

			</thead>
			 <tbody>
				<row>
					<entry> Nominative </entry>
					 <entry> the subject of a sentence </entry>
					 <entry> <foreignphrase>Fedora</foreignphrase> </entry>

				</row>
				 <row>
					<entry> Genitive </entry>
					 <entry> indicates possession </entry>
					 <entry> <foreignphrase>Fedory</foreignphrase> </entry>

				</row>
				 <row>
					<entry> Accusative </entry>
					 <entry> the direct object of a sentence </entry>
					 <entry> <foreignphrase>Fedoru</foreignphrase> </entry>

				</row>
				 <row>
					<entry> Dative </entry>
					 <entry> the indirect object of a sentence </entry>
					 <entry> <foreignphrase>Fedoře</foreignphrase> </entry>

				</row>
				 <row>
					<entry> Vocative </entry>
					 <entry> the subject of direct address </entry>
					 <entry> <foreignphrase>Fedoro</foreignphrase> </entry>

				</row>
				 <row>
					<entry> Locative </entry>
					 <entry> relates to a location </entry>
					 <entry> <foreignphrase>Fedoře</foreignphrase> </entry>

				</row>
				 <row>
					<entry> Instrumental </entry>
					 <entry> relates to a method </entry>
					 <entry> <foreignphrase>Fedorou</foreignphrase> </entry>

				</row>

			</tbody>

		</tgroup>

	</table>
	 <para>
		For example:
	</para>
	 <itemizedlist>
		<listitem>
			<para>
				<foreignphrase>Fedora je linuxová distribuce.</foreignphrase> — Fedora is a Linux distribution.
			</para>

		</listitem>
		 <listitem>
			<para>
				<foreignphrase>Instalace Fedory</foreignphrase> — Installation of Fedora
			</para>

		</listitem>
		 <listitem>
			<para>
				<foreignphrase>Stáhnout Fedoru</foreignphrase> — Get Fedora
			</para>

		</listitem>
		 <listitem>
			<para>
				<foreignphrase>Přispějte Fedoře</foreignphrase> — Contribute to Fedora
			</para>

		</listitem>
		 <listitem>
			<para>
				<foreignphrase>Ahoj, Fedoro!</foreignphrase> — Hello Fedora!
			</para>

		</listitem>
		 <listitem>
			<para>
				<foreignphrase>Ve Fedoře 10…</foreignphrase> — In Fedora 10…
			</para>

		</listitem>
		 <listitem>
			<para>
				<foreignphrase>S Fedorou získáváte nejnovější…</foreignphrase> — With Fedora, you get the latest…
			</para>

		</listitem>

	</itemizedlist>
	 <para>
		A sentence that begins <foreignphrase>S Fedora získáváte nejnovější…</foreignphrase> remains comprehensible to Czech readers, but the result is not grammatically correct. The same effect can be simulated in English, because although English nouns lost their case endings during the Middle Ages, English pronouns are still inflected. The sentence 'Me see she' is completely comprehensible to English speakers, but is not what they expect to read, because the forms of the pronouns <literal>me</literal> and <literal>she</literal> are not correct. <literal>Me</literal> is the accusative form of the pronoun, but because it is the subject of the sentence, the pronoun should take the nominative form, <literal>I</literal>. Similarly, <literal>she</literal> is the nominative form, but as the direct object of the sentence the pronoun should take its accusative form, <literal>her</literal>.
	</para>
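	 <para>
		In PO terms, the situation might look like the following sketch. It assumes a hypothetical <tag>&amp;PRODUCT;</tag> entity set to <literal>Fedora</literal> and reuses the instrumental example above; the <literal>msgid</literal> wording is illustrative only:
	</para>
	 
<programlisting>#. Tag: para
#, no-c-format
msgid "With &amp;PRODUCT;, you get the latest…"
msgstr "S &amp;PRODUCT; získáváte nejnovější…"</programlisting>
	 <para>
		Whatever the translator writes around the entity, the built book can only show the uninflected <foreignphrase>S Fedora získáváte nejnovější…</foreignphrase>, because the instrumental form <foreignphrase>Fedorou</foreignphrase> cannot be produced through the entity.
	</para>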
	 <para>
		Nouns in many Slavic languages, such as Ukrainian, Czech, Polish, Serbian, and Croatian, have seven different cases; Russian has six. Nouns in Finno-Ugric languages such as Finnish, Hungarian, and Estonian have roughly fourteen to eighteen cases, depending on how they are counted. Other languages alter nouns for other reasons. For example, Scandinavian languages inflect nouns to indicate <firstterm>definiteness</firstterm> (whether the noun refers to '<emphasis>a</emphasis> thing' or '<emphasis>the</emphasis> thing'), and some dialects of those languages inflect nouns both for definiteness <emphasis>and</emphasis> for grammatical case.
	</para>
	 <para>
		Now multiply such problems by the more than 40 languages that <application>Publican</application> currently supports. Other than the few non-translated strings that <application>Publican</application> specifies by default in the <filename><replaceable>Doc_Name</replaceable>.ent</filename> file, entities might prove useful for version numbers of products. Beyond that, the use of entities is tantamount to a conscious effort to inhibit and reduce the quality of translations. Furthermore, readers of your document in a language that inflects nouns (whether for case, definiteness, or other reasons) will not know that the bad grammar is the result of XML entities that you set — they will probably assume that the translator is incompetent.
	</para>
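	 <para>
		As a hedged sketch of the version-number use suggested above, a version could be kept in an entity because it does not inflect, while the product name is written out in full. The entity name <literal>PRODVER</literal> and its value are hypothetical, chosen only for illustration:
	</para>
	 
<programlisting>&lt;!-- In Doc_Name.ent: hypothetical version-number entity --&gt;
&lt;!ENTITY PRODVER "10"&gt;

&lt;!-- In the XML source: the product name is spelled out so translators can inflect it --&gt;
&lt;para&gt;
	In Fedora &amp;PRODVER;…
&lt;/para&gt;</programlisting>
	 <para>
		A Czech translator can then write <foreignphrase>Ve Fedoře &amp;PRODVER;…</foreignphrase>, inflecting the name correctly while the version number still comes from a single definition.
	</para>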
</section>