File: xni-xerces2.xml

package info (click to toggle)
libxerces2-java 2.12.0-1
  • links: PTS, VCS
  • area: main
  • in suites: buster
  • size: 12,608 kB
  • sloc: java: 127,779; xml: 13,596; sh: 39; makefile: 10
file content (533 lines) | stat: -rw-r--r-- 20,005 bytes parent folder | download | duplicates (5)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
<?xml version='1.0' encoding='UTF-8'?>
<!--
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 * 
 *      http://www.apache.org/licenses/LICENSE-2.0
 * 
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
-->
<!DOCTYPE s1 SYSTEM 'dtd/document.dtd'>
<s1 title='Xerces2 Components'>
 <s2 title='Re-using Xerces2 Parser Components'>
  <p>
   The Xerces Native Interface (XNI) defines a general way to build
   parser components and configurations. The Xerces2 reference
   implementation is written using this framework so that the parser
   components can be re-used to create new parser configurations. In
   order to re-use the Xerces2 parser components, however, the
   developer must know the dependencies of each standard component.
   This document provides an overview of the Xerces2 parser components
   and lists the relevant dependencies.
  </p>
  <p>
   An overview of the general dependencies and the dependencies for
   each standard component are detailed below:
  </p>
  <ul>
   <li><link anchor='overview'>Overview</link></li>
   <li>Fundamental Dependencies</li>
   <ul>
    <li><link anchor='symbol-table'>Symbol Table</link></li>
    <li><link anchor='error-reporter'>Error Reporter</link></li>
   </ul>
   <li>Individual Component Dependencies</li>
   <ul>
    <li><link anchor='document-scanner'>Document Scanner</link></li>
    <li><link anchor='dtd-scanner'>DTD Scanner</link></li>
    <li><link anchor='entity-manager'>Entity Manager</link></li>
    <li><link anchor='dtd-validator'>DTD Validator</link></li>
    <li><link anchor='namespace-binder'>Namespace Binder</link></li>
    <li><link anchor='schema-validator'>Schema Validator</link></li>
   </ul>
   <!-- NOTE:
     - More components and their dependencies may be added as
     - they are made and integrated into the standard Xerces2
     - configuration.
     -->
  </ul>
 </s2>
 <anchor name='overview'/>
 <s2 title='Overview'>
  <p>
   The standard parser configuration for the Xerces2 reference
   implementation of XNI is defined in the
   <code>org.apache.xerces.parsers.StandardParserConfiguration</code>
   class. This configuration is comprised of a number of 
   components. Some of these components are configurable and some 
   are shared within the configuration but do not implement the
   <link idref='xni-config' anchor='component'>XMLComponent</link>
   interface.
  </p>
  <p>
   The following list details the set of components used in the 
   Xerces2 standard configuration. The components marked with an
   asterisk (*) are configurable.
  </p>
  <ul>
   <li>Symbol Table</li>
   <li>Error Reporter (*)</li>
   <li>Document Scanner (*)</li>
   <li>DTD Scanner (*)</li>
   <li>Entity Manager (*)</li>
   <li>DTD Validator (*)</li>
   <li>Namespace Binder (*)</li>
   <li>Schema Validator (*)</li>
  </ul>
  <note>
   There are additional components other than those in the above
   list, such as the "Grammar Pool" and "Datatype Validator
   Factory". However, the validation engine in the Xerces2
   reference implementation is currently being re-designed and
   re-implemented. Therefore, these components are subject to
   change and should not be used or relied upon.
  </note>
  <p>
   In general, there are levels of dependency among the components
   in the standard configuration. Some components are required by
   all of the configurable components, where as there are certain
   components required by other components. The following diagram
   illustrates these basic levels of dependency.
  </p>
  <p>
   <img alt='Xerces2 Component Dependencies' src='xni-components-dependence.gif'/>
  </p>
  <p>
   The dependencies of each component are detailed in subsequent
   sections of this document but the basic dependencies are 
   listed below:
  </p>
  <ul>
   <li>
    All of the configurable Xerces2 components in the standard 
    configuration depend on the 
    "<link anchor='symbol-table'>Symbol Table</link>" and the
    "<link anchor='error-reporter'>Error Reporter</link>".
   </li>
   <li>
    Both the "<link anchor='document-scanner'>Document Scanner</link>"
    and the "<link anchor='dtd-scanner'>DTD Scanner</link>" depend 
    on the "<link anchor='entity-manager'>Entity Manager</link>"
    component.
   </li>
   <li>
    In addition to the other dependencies, the 
    "<link anchor='document-scanner'>Document Scanner</link>" also
    depends on the 
    "<link anchor='dtd-scanner'>DTD Scanner</link>" for scanning of
    the internal and external subsets of the DTD.
   </li>
  </ul>
  <p>
   Each configurable component queries the components that it
   depends on before each document is parsed. The configuration 
   is required to call the 
   <link idref='xni-config' anchor='component'>XMLComponent</link>'s 
   "reset" method. From the 
   <link idref='xni-config' anchor='component-manager'>XMLComponentManager</link>
   object that is passed to the "reset" method, the component
   can query the other components that it needs. Therefore, each
   component is assigned a unique property identifier used to
   query the components from the component manager.
  </p>
  <p>
   The following example source code shows how one of the
   standard Xerces2 components is queried within a configurable
   component. However, for complete dependency details and the 
   property identifiers defined for each component, refer to the
   appropriate sections of this document.
  </p>
  <source><![CDATA[import org.apache.xerces.xni.parser.XMLComponent;
import org.apache.xerces.xni.parser.XMLComponentManager;
import org.apache.xerces.xni.parser.XMLConfigurationException;

public class MyComponent
    implements XMLComponent {
    
    // Constants
    
    public static final String SYMBOL_TABLE =
        "http://apache.org/xml/properties/internal/symbol-table";

    // XMLComponent methods
    
    public void reset(XMLComponentManager manager)
        throws XMLConfigurationException {
        SymbolTable symbolTable = 
	    (SymbolTable)manager.getProperty(SYMBOL_TABLE);
    }

}]]></source>
 </s2>
 <anchor name='symbol-table'/>
 <s2 title='Symbol Table'>
  <p>Property information:</p>
  <table>
   <tr>
    <th>Property Id</th>
    <td>http://apache.org/xml/properties/internal/symbol-table</td>
   </tr>
   <tr>
    <th>Type</th>
    <td>org.apache.xerces.util.SymbolTable</td>
   </tr>
  </table>
  <p>
   For performance reasons, the Xerces2 reference implementation
   uses a custom symbol table in order to re-use common strings
   that appear in the document. The symbol table is responsible
   for keeping track of these common strings and always return
   the same <code>java.lang.String</code> reference for lexically
   equivalent strings. This not only reduces the amount of unique
   objects created while parsing, it also allows components (e.g.
   the validators, etc) to perform comparisons directly on the
   references for certain string objects without having to call
   the "equals" method.
  </p>
  <p>
   <strong>Note:</strong> 
   Nearly all of the standard components depend on this component.
   Therefore, if you write a parser configuration that re-uses any
   of the standard components, you must have an instance of this
   component registered with the appropriate property identifier.
  </p>
 </s2>
 <anchor name='error-reporter'/>
 <s2 title='Error Reporter'>
  <p>Property information:</p>
  <table>
   <tr>
    <th>Property Id</th>
    <td>http://apache.org/xml/properties/internal/error-reporter</td>
   </tr>
   <tr>
    <th>Type</th>
    <td>org.apache.xerces.impl.XMLErrorReporter</td>
   </tr>
  </table>
  <p>Recognized features:</p>
  <ul>
   <li>http://apache.org/xml/features/continue-after-fatal-error</li>
  </ul>
  <p>Recognized properties:</p>
  <ul>
   <li>http://apache.org/xml/properties/internal/error-handler</li>
  </ul>
  <p>
   In any parser instance, there must be a way for components to
   report errors in a uniform way. The "Error Reporter" component
   serves this purpose and simplifies the process of localizing
   error messages and notifying the registered
   <link idref='xni-parser' anchor='error-handler'>XMLErrorHandler</link>.
  </p>
  <p>
   In general, errors are identified by the domain of the error 
   and a unique key within that domain. The XMLErrorReporter class
   allows message formatters to be set for each domain and then
   delegates the formatting of error messages (with replacement
   text) to the message formatter assigned to that error domain.
   The localized error message is then sent to the registered
   error handler.
  </p>
  <p>
   An error message formatter is any class that implements the
   <code>org.apache.xerces.util.MessageFormatter</code> interface.
   If you write a new parser component for use with the existing
   Xerces2 components, you should implement your own message 
   formatter and register it with the Error Reporter. For
   example:
  </p>
  <source><![CDATA[import java.util.Locale;
import java.util.MissingResourceException;
import org.apache.xerces.util.MessageFormatter;

public class MyFormatter
    implements MessageFormatter {

    // MessageFormatter methods
    
    public String formatMessage(Locale locale, String key, Object[] args)
        throws MissingResourceException {
        // localize and format message based on locale, key, 
	// and replacement text arguments
	return "MY ERROR ("+key+")";
    }

}]]></source>
  <source><![CDATA[import org.apache.xerces.impl.XMLErrorReporter;
import org.apache.xerces.xni.parser.XMLComponent;
import org.apache.xerces.xni.parser.XMLComponentManager;
import org.apache.xerces.xni.parser.XMLConfigurationException;

public class MyComponent
    implements XMLComponent {
    
    // Constants
    
    public static final String ERROR_REPORTER =
        "http://apache.org/xml/properties/internal/error-reporter";

    public static final String DOMAIN = "http://example.com/mydomain";

    // XMLComponent methods
    
    public void reset(XMLComponentManager manager)
        throws XMLConfigurationException {
        XMLErrorReporter reporter = 
	    (XMLErrorReporter)manager.getProperty(ERROR_REPORTER);
	if (reporter.getMessageFormatter(DOMAIN) == null) {
	    reporter.putMesssageFormatter(DOMAIN, new MyFormatter());
	}
    }

}]]></source>
  <p>
   <strong>Note:</strong>
   It is <em>strongly</em> encouraged that any new error domains
   that you create follow the standard URI syntax. While there is
   no requirement that the URI must point to an actual resource on 
   the Internet, it is a common way to separate domains and it
   provides more useful information to the application. 
  </p>
  <p>
   <strong>Note:</strong> 
   Nearly all of the standard components depend on this component.
   Therefore, if you write a parser configuration that re-uses any
   of the standard components, you must have an instance of this
   component registered with the appropriate property identifier.
  </p>
 </s2>
 <anchor name='document-scanner'/>
 <s2 title='Document Scanner'>
  <p>Property information:</p>
  <table>
   <tr>
    <th>Property Id</th>
    <td>http://apache.org/xml/properties/internal/document-scanner</td>
   </tr>
   <tr>
    <th>Type</th>
    <td>org.apache.xerces.xni.parser.XMLDocumentScanner</td>
   </tr>
  </table>
  <p>Required properties:</p>
  <ul>
   <li>http://apache.org/xml/properties/internal/symbol-table</li>
   <li>http://apache.org/xml/properties/internal/error-reporter</li>
   <li>http://apache.org/xml/properties/internal/entity-manager</li>
   <li>http://apache.org/xml/properties/internal/dtd-scanner</li>
  </ul>
  <p>Recognized features:</p>
  <ul>
   <li>http://xml.org/sax/features/namespaces</li>
   <li>http://xml.org/sax/features/validation</li>
   <li>http://apache.org/xml/features/nonvalidating/load-external-dtd</li>
   <li>http://apache.org/xml/features/scanner/notify-char-refs</li>
   <li>http://apache.org/xml/features/scanner/notify-builtin-refs</li>
  </ul>
  <p>
   The <code>org.apache.xerces.impl.XMLDocumentScannerImpl</code>
   class implements the XNI document scanner interface and is
   implemented so that it can also function as a "pull" parser.
   A pull parser allows the application to drive the parsing of 
   the document instead of having all of the document events
   "pushed" to the registered handlers.
  </p>
 </s2>
 <anchor name='dtd-scanner'/>
 <s2 title='DTD Scanner'>
  <p>Property information:</p>
  <table>
   <tr>
    <th>Property Id</th>
    <td>http://apache.org/xml/properties/internal/dtd-scanner</td>
   </tr>
   <tr>
    <th>Type</th>
    <td>org.apache.xerces.xni.parser.XMLDTDScanner</td>
   </tr>
  </table>
  <p>Required properties:</p>
  <ul>
   <li>http://apache.org/xml/properties/internal/symbol-table</li>
   <li>http://apache.org/xml/properties/internal/error-reporter</li>
   <li>http://apache.org/xml/properties/internal/entity-manager</li>
  </ul>
  <p>Recognized features:</p>
  <ul>
   <li>http://xml.org/sax/features/validation</li>
   <li>http://apache.org/xml/features/scanner/notify-char-refs</li>
  </ul>
  <p>
   The <code>org.apache.xerces.impl.XMLDTDScannerImpl</code>
   class implements the XNI DTD scanner interface and is
   implemented so that it can also function as a "pull" parser.
   A pull parser allows the application to drive the parsing of 
   the DTD instead of having all of the DTD events
   "pushed" to the registered handlers.
  </p>
 </s2>
 <anchor name='entity-manager'/>
 <s2 title='Entity Manager'>
  <p>Property information:</p>
  <table>
   <tr>
    <th>Property Id</th>
    <td>http://apache.org/xml/properties/internal/entity-manager</td>
   </tr>
   <tr>
    <th>Type</th>
    <td>org.apache.xerces.impl.XMLEntityManager</td>
   </tr>
  </table>
  <p>Required properties:</p>
  <ul>
   <li>http://apache.org/xml/properties/internal/symbol-table</li>
   <li>http://apache.org/xml/properties/internal/error-reporter</li>
  </ul>
  <p>Recognized features:</p>
  <ul>
   <li>http://xml.org/sax/features/validation</li>
   <li>http://xml.org/sax/features/external-general-entities</li>
   <li>http://xml.org/sax/features/external-parameter-entities</li>
   <li>http://apache.org/xml/features/allow-java-encodings</li>
  </ul>
  <p>Recognized properties:</p>
  <ul>
   <li>http://apache.org/xml/properties/entity-resolver</li>
  </ul>
  <p>
   Both the <link anchor='document-scanner'>Document Scanner</link>
   and the <link anchor='dtd-scanner'>DTD Scanner</link> depend
   on the Entity Manager. This component handles starting and
   stopping entities automatically so that the scanners can continue
   operation transparently even when entities go in and out of
   scope.
  </p>
  <p>
   The Entity Manager implements an Entity Scanner which is a 
   low-level scanner for document and DTD information. Because
   the document and DTD scanners interact only with the Entity
   Scanner to scan the document, the scanners are shielded from
   changes caused by starting and stopping entities. Changes in
   the entities being scanned happens transparently within the 
   Manager/Scanner combination but the scanner components are
   notified of the start and end of the entity by implementing
   the XMLEntityHandler interface that is only part of the
   Xerces2 reference implementation.
  </p>
 </s2>
 <anchor name='dtd-validator'/>
 <s2 title='DTD Validator'>
  <p>Property information:</p>
  <table>
   <tr>
    <th>Property Id</th>
    <td>http://apache.org/xml/properties/internal/validator/dtd</td>
   </tr>
   <tr>
    <th>Type</th>
    <td>org.apache.xerces.impl.dtd.XMLDTDValidator</td>
   </tr>
  </table>
  <p>Required properties:</p>
  <ul>
   <li>http://apache.org/xml/properties/internal/symbol-table</li>
   <li>http://apache.org/xml/properties/internal/error-reporter</li>
   <!--
     - NOTE: The following properties are also required but the
     -       validation engine is being redesigned so I'm not
     -       listing them in the documentation.
   <li>http://apache.org/xml/properties/internal/grammar-pool</li>
   <li>http://apache.org/xml/properties/internal/datatype-validator-factory</li>
   -->
  </ul>
  <p>Recognized features:</p>
  <ul>
   <li>http://xml.org/sax/features/namespaces</li>
   <li>http://xml.org/sax/features/validation</li>
   <li>http://apache.org/xml/features/validation/dynamic</li>
  </ul>
  <p>
   The DTD Validator performs validation of the document events that 
   it receives which may augment the streaming information set with
   default attribute values and normalizing attribute values.
  </p>
 </s2>
 <anchor name='namespace-binder'/>
 <s2 title='Namespace Binder'>
  <p>Property information:</p>
  <table>
   <tr>
    <th>Property Id</th>
    <td>http://apache.org/xml/properties/internal/namespace-binder</td>
   </tr>
   <tr>
    <th>Type</th>
    <td>org.apache.xerces.impl.XMLNamespaceBinder</td>
   </tr>
  </table>
  <p>Required properties:</p>
  <ul>
   <li>http://apache.org/xml/properties/internal/symbol-table</li>
   <li>http://apache.org/xml/properties/internal/error-reporter</li>
  </ul>
  <p>Recognized features:</p>
  <ul>
   <li>http://xml.org/sax/features/namespaces</li>
  </ul>
  <p>
   The Namespace Binder is responsible for detecting namespace bindings
   in the startElement/emptyElement methods and emitting appropriate 
   start and end prefix mapping events. Namespace binding should always
   occur <em>after</em> DTD validation (since namespace bindings may
   have been defaulted from an attribute declaration in the DTD) and
   <em>before</em> Schema validation.
  </p>
 </s2>
 <anchor name='schema-validator'/>
 <s2 title='Schema Validator'>
  <p>Property information:</p>
  <table>
   <tr>
    <th>Property Id</th>
    <td>http://apache.org/xml/properties/internal/validator/schema</td>
   </tr>
   <tr>
    <th>Type</th>
    <td>org.apache.xerces.impl.xs.XMLSchemaValidator</td>
   </tr>
  </table>
  <p>Required properties:</p>
  <ul>
   <li>http://apache.org/xml/properties/internal/symbol-table</li>
   <li>http://apache.org/xml/properties/internal/error-reporter</li>
   <!--
     - NOTE: The following properties are also required but the
     -       validation engine is being redesigned so I'm not
     -       listing them in the documentation.
   <li>http://apache.org/xml/properties/internal/grammar-pool</li>
   <li>http://apache.org/xml/properties/internal/datatype-validator-factory</li>
   -->
  </ul>
  <p>Recognized features:</p>
  <ul>
   <li>http://xml.org/sax/features/namespaces</li>
   <li>http://xml.org/sax/features/validation</li>
   <li>http://apache.org/xml/features/validation/dynamic</li>
  </ul>
  <p>
   The Schema Validator performs validation of the document events that 
   it receives which may augment the streaming information set with
   default simple type values and normalizing simple type values.
  </p>
 </s2>
</s1>