<!doctype debiandoc system>
<debiandoc>
<book>
<titlepag>
<title>SGML Entity Management</title>
<author>
<name>Manoj Srivastava</name>
<email>srivasta@debian.org</email>
</author>
<version><date></version>
<abstract>
This document provides guidelines for implementation-dependent
management of SGML entities on Debian systems. It defines the
mapping of <em>External Public Identifiers</em> to <em>System
Identifiers</em>. (In other words, this document covers the use
of <tt>/usr/lib/sgml</tt>, entity naming, and catalog files.)
</abstract>
<copyright>
<copyrightsummary>Copyright ©1998 Manoj
Srivastava
</copyrightsummary>
<p>
This manual is free software; you may redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version
2, or (at your option) any later version.
</p>
</copyright>
</titlepag>
<toc detail="sect">
<chapt id="intro">
<heading>Introduction and Scope</heading>
<p>
This document was written by Manoj Srivastava
<email>srivasta@debian.org</email> with contributions from Mark
W. Eichin <email>eichin@kitten.gen.ma.us</email> and Adam P. Harris
<email>aph@debian.org</email>. This document is part of the
<prgn>sgml-base</prgn> package.
</p>
<p>
This guideline is intended to be interpreted as SGML sub-policy (not
official policy). While this document does not carry the weight of
official policy, it is sufficient basis for the submission of bugs
against a package. This may change at a later date.
</p>
<p>
Entity management is generally left up to the implementation,
and hence there are no extant standards that cover it. However,
the lack of an established convention would prevent different
segments of the SGML subsystem from co-operating with each
other, so it is important that a policy be established so that
SGML package maintainers may depend on other parts of the system
behaving consistently.
</p>
</chapt>
<chapt id="mapping">
<heading>Proposed mapping of public identifiers to system
identifiers</heading>
<p>
SGML can refer to an external file (really an entity) with an
<var>external identifier</var>: this is a <var>public identifier</var>
or a <var>system identifier</var>, or both.</p>
<p>
A typical public identifier looks like
<example>
PUBLIC "ISO 8879-1986//ENTITIES Added Latin 1//EN"
</example>
where <tt>ISO 8879-1986</tt> is the owner, <tt>ENTITIES</tt> is the
text class, <tt>Added Latin 1</tt> is the text description, and
<tt>EN</tt> is the language.
</p>
<p>
A system identifier looks like
<example>
SYSTEM "htmlplus.dtd"
</example>
where <tt>htmlplus.dtd</tt> is a system-specific identifier.
</p>
<p>
To map external identifiers to file names, one should first try the
system identifier as a file name, then search the entity catalog
files, and finally search the list of file names derived from the
public identifier. The catalog format follows SGML/Open's resolution
on entity management. The catalog consists of a series of entries and
comments. A comment is delimited by <tt>--</tt>, as in a markup
declaration.
</p>
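<p>
As an illustration of that format, a minimal catalog fragment might
look like the sketch below. The <tt>PUBLIC</tt> entry keyword and the
<tt>--</tt> comment delimiters come from the SGML/Open catalog format;
the particular entry and its absolute path are only an assumed example,
borrowed from the pairings listed later in this document.
<example>
  -- Illustrative entry only: map a public identifier to a file --
PUBLIC "ISO 8879-1986//ENTITIES Added Latin 1//EN"
       "/usr/lib/sgml/ISO_8879-1986/entities/Added_Latin_1"
</example>
</p>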
<p>
The fallback derivation of the file name is modelled after the
<prgn>sgmls</prgn> environment variable <var>SGML_PATH</var>
and Emacs <prgn>psgml</prgn> mode's <var>sgml-public-map</var>
variable. There do not seem to be any official standards
(this is left to the implementation), so this standard is
simply an abstraction of the real-world practice of SGML
tools. It shall now be the standard for Debian systems, since
this is the convention followed by all applications
currently in Debian.</p>
<p>
Contiguous white space is compacted to a single space and
replaced with an underscore (<tt>_</tt>); the characters
<tt>/</tt> to <tt>%</tt> are also replaced with <tt>_</tt>.
The text class is down-cased. The language specifier (i.e.,
<tt>//EN</tt>) and anything following it should be
removed.
</p>
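<p>
To make these rules concrete, the following sketch walks one public
identifier through them step by step. Treating each <tt>//</tt>
separator as a directory boundary under <tt>/usr/lib/sgml</tt> is an
assumption read off the pairings in the Examples chapter, not a rule
stated above.
<example>
"ISO 8879-1986//ENTITIES Added Latin 1//EN"

owner        ISO 8879-1986    becomes  ISO_8879-1986
text class   ENTITIES         becomes  entities
description  Added Latin 1    becomes  Added_Latin_1
language     //EN             removed

/usr/lib/sgml/ISO_8879-1986/entities/Added_Latin_1
</example>
</p>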
</chapt>
<chapt id="Other">
<heading>Location of miscellaneous files</heading>
<p>
A number of other files, though not entities referenced by
document instances, are still required by the SGML subsystem
to parse or validate a document. These files are also covered
by this document.
<enumlist>
<item>
<p>Declaration files: Any declaration file should be put
in <tt>/usr/lib/sgml/declaration</tt>.</p>
</item>
<item>
<p>
Notations: These files go in
<tt>/usr/lib/sgml/notation</tt>.
</p>
</item>
</enumlist>
</p>
</chapt>
<chapt id="examples"><heading>Examples</heading>
<p>
A few public and system identifier pairings are shown below.
<example>
PUBLIC "ISO 8879-1986//ENTITIES Added Latin 1//EN"
/usr/lib/sgml/ISO_8879-1986/entities/Added_Latin_1
"ISO 8879-1986//ENTITIES Added Math Symbols: Arrow Relations//EN"
/usr/lib/sgml/ISO_8879-1986/entities/Added_Math_Symbols:_Arrow_Relations
"-//IETF//DTD HTML Level 3//EN//3.0"
/usr/lib/sgml/IETF/dtd/HTML_Level_3.0
"-//IETF//DTD HTML Strict Level 3//EN"
/usr/lib/sgml/IETF/dtd/HTML_Strict_Level_3
"-//USA-DOD//DTD Table Model 951010//EN"
/usr/lib/sgml/USA-DOD/dtd/Table_Model_951010
</example>
</p>
<p>The first four actually exist in Debian.</p>
</chapt>
</book>
</debiandoc>