<HTML>
<HEAD>
<TITLE>Managing external tables for SWI-Prolog</TITLE>
</HEAD>
<BODY BGCOLOR="white">
<BLOCKQUOTE>
<BLOCKQUOTE>
<BLOCKQUOTE>
<BLOCKQUOTE>
<CENTER>

<H1>Managing external tables for SWI-Prolog</H1>

</CENTER>
<HR>
<CENTER>
<I>Jan Wielemaker <BR>
SWI, <BR>
University of Amsterdam <BR>
The Netherlands <BR>
E-mail: <A HREF="mailto:jan@swi.psy.uva.nl">jan@swi.psy.uva.nl</A></I>
</CENTER>
<HR>
</BLOCKQUOTE>
</BLOCKQUOTE>
</BLOCKQUOTE>
</BLOCKQUOTE>
<CENTER><H3>Abstract</H3></Center>
<TABLE WIDTH="90%" ALIGN=center BORDER=2 BGCOLOR="#f0f0f0"><TR><TD>
This document describes a foreign language extension to
<A HREF="http://www.swi.psy.uva.nl/projects/SWI-Prolog">SWI-Prolog</A> 
for the manipulation of `external tables'. External tables are files 
using a textual representation of records separated into fields. The 
package allows for a flexible definition of the format of the file in 
terms of records and fields, how the information in the file should be 
mapped onto Prolog data types and what properties the file has to 
improve the performance of lookup.

<P>The table package has been used successfully to deal with large 
static databases such as dictionaries. Compared to loading the tables 
into the Prolog database, this approach requires much less memory and 
loads much faster, while providing reasonable lookup performance on 
sorted tables.

<P>This package uses read-only `mapping' of the database file into 
memory and has been ported to Win32 (Windows 95 and NT) as well as Unix 
systems providing the mmap() system call (Solaris, SunOS, Linux and many 
other modern Unices).
</TABLE>

<H1><A NAME="document-contents">Table of Contents</A></H1>

<UL>
<UL>
<LI><A HREF="#sec:1"><B>1 Introduction</B></A>
<LI><A HREF="#sec:2"><B>2 Managing external tables</B></A>
<UL>
<LI><A HREF="#sec:2.1">2.1 Creating and destroying tables</A>
<LI><A HREF="#sec:2.2">2.2 Accessing a table</A>
<UL>
<LI><A HREF="#sec:2.2.1">2.2.1 Finding record locations in a table</A>
<LI><A HREF="#sec:2.2.2">2.2.2 Reading records</A>
<LI><A HREF="#sec:2.2.3">2.2.3 Searching the table</A>
<LI><A HREF="#sec:2.2.4">2.2.4 Miscellaneous</A>
</UL>
</UL>
<LI><A HREF="#sec:3"><B>3 Flexible ordering and equivalence based on 
character table</B></A>
<LI><A HREF="#sec:4"><B>4 Example: accessing the Unix passwd file</B></A>
</UL>
</UL>

<H2><A NAME="sec:1">1 Introduction</A></H2>

<P>Prolog programs sometimes need access to large sets of background 
data. For example, in the <font size=-1>GRASP</font> project we need 
access to ontologies of art objects, a large lexicon and translation 
dictionaries. Storing such information as Prolog clauses requires too 
much memory.

<P>The table package outlined in this document allows for easy access to 
large structured files. The package uses binary search if possible and 
linear search for queries that cannot use more efficient algorithms 
without building additional index tables. Caching is achieved using the 
file-to-memory maps supported by many modern operating systems.

<P>The following sections define the interface predicates for the 
package.
<A HREF="#sec:example">Section 4</A> provides an example that accesses the 
Unix password file.

<H2><A NAME="sec:2">2 Managing external tables</A></H2>

<H3><A NAME="sec:2.1">2.1 Creating and destroying tables</A></H3>

<P>This section describes the predicates required for creating and 
destroying access to external database tables.

<DL>

<P>
<DT><A NAME="new_table/4"><STRONG>new_table</STRONG>(<VAR>+File, 
+Columns, +Options, -Handle</VAR>)</A><DD>
Create a description of a new table, stored in <VAR>File</VAR>. <VAR>Columns</VAR> 
is a list of descriptions for each column. A column description is of 
the form
<BLOCKQUOTE>
<VAR>ColumnName</VAR><TT>(</TT><VAR>Type [, ColumnOptions]</VAR><TT>)</TT>
</BLOCKQUOTE>

<P><VAR>Type</VAR> denotes the Prolog type to which the field should be 
converted and is one of:

<P>
<CENTER>
<TABLE BORDER=2 FRAME=border RULES=group>
<TR VALIGN=top><TD><TT>integer</TT><TD>Convert to a Prolog integer. The 
input is treated as a decimal number. </TR>
<TR VALIGN=top><TD><TT>float</TT><TD>Convert to a Prolog floating point 
number. The input is handled by the C-library function
<TT>strtod()</TT>. </TR>
<TR VALIGN=top><TD><TT>atom</TT><TD>Convert to a Prolog atom. </TR>
<TR VALIGN=top><TD><TT>string</TT><TD>Convert to a SWI-Prolog string 
object. </TR>
<TR VALIGN=top><TD><TT>code_list</TT><TD>Convert to a list of <font size=-1>ASCII</font> 
codes. </TR>
</TABLE>

</CENTER>

<P><VAR>ColumnOptions</VAR> is a list of additional properties of the 
column. Supported values are:

<P>
<CENTER>
<TABLE BORDER=2 FRAME=border RULES=group>
<TR VALIGN=top><TD><TT>sorted</TT><TD>The field is sorted, but 
may have (adjacent) duplicate entries. If the field is textual, it 
should be sorted alphabetically, otherwise it should be sorted 
numerically. </TR>
<TR VALIGN=top><TD><TT>sorted(+<VAR>Table</VAR>)</TT><TD>The (textual) 
field is sorted using the ordering declared by the named <EM>ordering 
table</EM>. This option may be used to define reverse order, 
`dictionary' order or other irregular alphabetical ordering. See
<A NAME="idx:newordertable2:1"></A><A HREF="#new_order_table/2">new_order_table/2</A>. </TR>
<TR VALIGN=top><TD><TT>unique</TT><TD>This column has distinct values 
for each row in the table. </TR>
<TR VALIGN=top><TD><TT>downcase</TT><TD>Map all uppercase in the field 
to lowercase before converting to a Prolog atom, string or code_list. </TR>
<TR VALIGN=top><TD><TT>map_space_to_underscore</TT><TD>Map spaces to 
underscores before converting to a Prolog atom, string or code_list. </TR>
<TR VALIGN=top><TD><TT>syntax</TT><TD>For numerical fields. If the field 
does not contain a valid number, matching the value fails. Reading the 
value returns the value as an atom. </TR>
<TR VALIGN=top><TD><TT>width(+<VAR>Chars</VAR>)</TT><TD>Field has fixed 
width of the specified number of characters. The column-separator is not 
considered for this column. </TR>
<TR VALIGN=top><TD><TT>arg(+<VAR>Index</VAR>)</TT><TD>For <A NAME="idx:readtablerecord4:2"></A><A HREF="#read_table_record/4">read_table_record/4</A>, 
unify the field with the given argument of the record term. Further 
fields will be assigned index+1, ... . </TR>
<TR VALIGN=top><TD><TT>skip</TT><TD>Don't convert this field to Prolog. 
The field is simply skipped without checking for consistency. </TR>
</TABLE>

</CENTER>

<P>The <VAR>Options</VAR> argument is a list of global options for the 
table. Defined options are:

<P>
<CENTER>
<TABLE BORDER=2 FRAME=border RULES=group>
<TR VALIGN=top><TD><TT>record_separator(+<VAR>Code</VAR>)</TT><TD>Character 
(<font size=-1>ASCII</font>) value of the character separating two 
records. Default is the newline (<font size=-1>ASCII</font> 10). </TR>
<TR VALIGN=top><TD><TT>field_separator(+<VAR>Code</VAR>)</TT><TD>Character 
(<font size=-1>ASCII</font>) value of the character separating two 
fields in a record. Default is the space (<font size=-1>ASCII</font> 
32), which also has a special meaning. Two fields separated by a space 
may be separated by any non-empty sequence of spaces and tab (<font size=-1>ASCII</font> 
9) characters. For all other separators, a single character separates 
the fields. </TR>
<TR VALIGN=top><TD><TT>escape(+<VAR>Code</VAR>, +<VAR>ListOfMap</VAR>)</TT><TD>Sometimes, 
a table defines escape sequences to make it possible to use the 
separator characters in text fields. This option provides a simple way 
to handle some standard cases. <VAR>Code</VAR> is the <font size=-1>ASCII</font> 
code of the character that leads the escape sequence. The default is
<TT>-1</TT>, and thus never matched.
<VAR>ListOfMap</VAR> is a list of
<VAR>From</VAR><TT> = </TT><VAR>To</VAR> character mappings. The default 
map table is the identity map, unless <VAR>Code</VAR> refers to the
<CODE>\</CODE> character, in which case
<CODE>\b</CODE>, <CODE>\e</CODE>, <CODE>\n</CODE>, <CODE>\r</CODE> and <CODE>\t</CODE> 
have their usual meaning. </TR>
<TR VALIGN=top><TD><TT>functor(<VAR>+Head</VAR>)</TT><TD>Functor used by <A NAME="idx:readtablerecord4:3"></A><A HREF="#read_table_record/4">read_table_record/4</A>. 
Default is <TT>record</TT> using the maximal argument index of the 
fields as arity. </TR>
</TABLE>

</CENTER>

<P>If the options are parsed successfully, <VAR>Handle</VAR> is unified 
with a term that may be used as a handle to the table for future 
operations on it. Note that <A NAME="idx:newtable4:4"></A><A HREF="#new_table/4">new_table/4</A> 
does not access the file system, so its success only indicates the 
description could be parsed, not the presence, access or format of the 
file.
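
<P>As an illustration of the column and table options above, the 
following sketch describes a hypothetical tab-separated file
<TT>words.txt</TT>. The file name and the column names are assumptions 
made for this example only. It also assumes the package has been loaded, 
for instance with <TT>:- use_module(library(table))</TT> on installations 
that provide it as a library.

<P>
<TABLE WIDTH="90%" ALIGN=center BORDER=6 BGCOLOR="#e0e0e0"><TR><TD NOWRAP>
<PRE>

    % words.txt (hypothetical): one record per line holding a word,
    % a frequency count and a free-form comment, separated by tabs.
    ?- new_table('words.txt',
                 [ word(atom, [sorted, unique]),  % usable for binary search
                   freq(integer),
                   comment(atom, [skip])          % not converted to Prolog
                 ],
                 [ field_separator(9)             % ASCII 9: tab
                 ],
                 Handle).
    </PRE>
</TABLE>

<P>Because the call does not touch the file system, it should succeed 
even if <TT>words.txt</TT> does not exist. Problems with the file only 
show up when the table is opened.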

<P>
<DT><A NAME="open_table/1"><STRONG>open_table</STRONG>(<VAR>+Handle</VAR>)</A><DD>
Open the table. This predicate normally does not need to be called 
explicitly, as all operations on the table handle automatically 
open the table if required. It fails if the file cannot be 
accessed or some other error with the required operating-system 
resources occurs. The contents of the file are not examined by this 
predicate.

<P>
<DT><A NAME="close_table/1"><STRONG>close_table</STRONG>(<VAR>+Handle</VAR>)</A><DD>
Close the file and other system resources, but do not remove the 
description of the table, so it can be re-opened later.

<P>
<DT><A NAME="free_table/1"><STRONG>free_table</STRONG>(<VAR>+Handle</VAR>)</A><DD>
Close and remove the handle. After this operation, <VAR>Handle</VAR> 
becomes invalid and further references to it cause undefined behaviour.

<P>
</DL>
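
<P>The predicates of this section can be combined into a small wrapper 
that creates a table, runs a goal on it and releases the handle again. 
The sketch below is only one way to do this. <TT>with_table/4</TT> is a 
hypothetical helper, not part of the package. It commits to the first 
solution of the goal and does not release the handle if the goal raises 
an exception.

<P>
<TABLE WIDTH="90%" ALIGN=center BORDER=6 BGCOLOR="#e0e0e0"><TR><TD NOWRAP>
<PRE>

    % with_table(+File, +Columns, +Options, :Goal)
    % Hypothetical helper: call Goal with the table handle as extra
    % argument and free the handle whether Goal succeeds or fails.
    with_table(File, Columns, Options, Goal) :-
            new_table(File, Columns, Options, Handle),
            (   open_table(Handle),      % fails if File cannot be accessed
                call(Goal, Handle)
            ->  free_table(Handle)
            ;   free_table(Handle),
                fail
            ).
    </PRE>
</TABLE>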

<H3><A NAME="sec:2.2">2.2 Accessing a table</A></H3>

<P>This section describes the predicates to read data from a table.

<H4><A NAME="sec:2.2.1">2.2.1 Finding record locations in a table</A></H4>

<P>Records are addressed by their offset in the table (file). As records 
generally do not have a fixed length, searching is often required. The 
predicates below find records in the file.

<DL>

<P>
<DT><A NAME="get_table_attribute/3"><STRONG>get_table_attribute</STRONG>(<VAR>+Handle, 
+Attribute, -Value</VAR>)</A><DD>
Fetch attributes of the table. Defined attributes:

<P>
<TABLE BORDER=0 FRAME=void RULES=group>
<TR VALIGN=top><TD><TT>file</TT><TD>Unify value with the name of the 
file with which the table is associated. </TR>
<TR VALIGN=top><TD><TT>field(<VAR>N</VAR>)</TT><TD>Unify value with 
declaration of n-th (1-based) field. </TR>
<TR VALIGN=top><TD><TT>field_separator</TT><TD>Unify value with the 
field separator character. </TR>
<TR VALIGN=top><TD><TT>record_separator</TT><TD>Unify value with the 
record separator character. </TR>
<TR VALIGN=top><TD><TT>key_field</TT><TD>Unify value with the 1-based 
index of the field that is sorted or fails if the table contains no 
sorted fields. </TR>
<TR VALIGN=top><TD><TT>field_count</TT><TD>Unify value with the total 
number of columns in the table. </TR>
<TR VALIGN=top><TD><TT>size</TT><TD>Unify value with the number of 
characters in the table-file, <B>not</B> the number of records. </TR>
<TR VALIGN=top><TD><TT>window</TT><TD>Unify value with a term <VAR>Start</VAR><TT> 
- </TT><VAR>Size</VAR>, indicating the properties of the current window. </TR>
</TABLE>

<P>
<DT><A NAME="table_window/3"><STRONG>table_window</STRONG>(<VAR>+Handle, 
+Start, +Size</VAR>)</A><DD>
If only part of the file represents the table, this call may be used to 
define a window on the file. <VAR>Start</VAR> defines the start of the 
window relative to the start of the file. <VAR>Size</VAR> is the size in 
characters. Skipping a header is one of the possible purposes for this 
call.

<P>
<DT><A NAME="table_start_of_record/4"><STRONG>table_start_of_record</STRONG>(<VAR>+Handle, 
+From, +To, -Start</VAR>)</A><DD>
Enumerates (on backtracking) the start of records in the table in the 
region [From, To). Together with <A NAME="idx:readtablerecord4:5"></A><A HREF="#read_table_record/4">read_table_record/4</A>, 
this may be used to read the table's data.

<P>
<DT><A NAME="table_previous_record/3"><STRONG>table_previous_record</STRONG>(<VAR>+Handle, 
+Here, -Previous</VAR>)</A><DD>
If <VAR>Here</VAR> is the start of a record, find the start of the 
record before it. If <VAR>Here</VAR> points at an arbitrary location in 
a record, the start of this record will be returned.
</DL>
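
<P>As a sketch of how these predicates fit together, the helper below 
enumerates all records of a table on backtracking. 
<TT>table_record/2</TT> is a hypothetical name. The sketch assumes that 
no window has been set with <A HREF="#table_window/3">table_window/3</A>, 
so that the offsets from 0 up to the file size cover the whole table.

<P>
<TABLE WIDTH="90%" ALIGN=center BORDER=6 BGCOLOR="#e0e0e0"><TR><TD NOWRAP>
<PRE>

    % table_record(+Handle, -Record)
    % Hypothetical helper: enumerate the records of the table.
    table_record(Handle, Record) :-
            get_table_attribute(Handle, size, Size),
            table_start_of_record(Handle, 0, Size, Start),
            read_table_record(Handle, Start, _Next, Record).
    </PRE>
</TABLE>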

<H4><A NAME="sec:2.2.2">2.2.2 Reading records</A></H4>

<P>There are two predicates for reading records. The <A NAME="idx:readtablerecord4:6"></A><A HREF="#read_table_record/4">read_table_record/4</A> 
reads an entire record, while <A NAME="idx:readtablefields4:7"></A><A HREF="#read_table_fields/4">read_table_fields/4</A> 
reads one or more fields from a record.

<DL>

<P>
<DT><A NAME="read_table_record/4"><STRONG>read_table_record</STRONG>(<VAR>+Handle, 
+Start, -Next, -Record</VAR>)</A><DD>
Read a record from the table. <VAR>Handle</VAR> is a handle as returned 
by <A NAME="idx:newtable4:8"></A><A HREF="#new_table/4">new_table/4</A>. <VAR>Start</VAR> 
is the location of a record. If <VAR>Start</VAR> does not point to the 
start of a record, this predicate searches backwards for the starting 
position. <VAR>Record</VAR> is unified with a term constructed from the <VAR>functor</VAR> 
associated with the table (default name <TT>record</TT> and arity the 
number of not-skipped columns), each of the arguments containing the 
converted data. An error is raised if the data could not be converted. <VAR>Next</VAR> 
is unified with the start position for the next record.

<P>
<DT><A NAME="read_table_fields/4"><STRONG>read_table_fields</STRONG>(<VAR>+Handle, 
+Start, -Next, -Fields</VAR>)</A><DD>
As <A NAME="idx:readtablerecord4:9"></A><A HREF="#read_table_record/4">read_table_record/4</A>, 
but <VAR>Fields</VAR> is a list of terms
<VAR>+Name</VAR>(-<VAR>Value</VAR>), and each <VAR>Value</VAR> is 
unified with the value of the named field.

<P>
<DT><A NAME="read_table_record_data/4"><STRONG>read_table_record_data</STRONG>(<VAR>+Handle, 
+Start, -Next, -Record</VAR>)</A><DD>
Similar to <A NAME="idx:readtablerecord4:10"></A><A HREF="#read_table_record/4">read_table_record/4</A>, 
but unifies <VAR>Record</VAR> with a Prolog string containing the 
unparsed data of the record. The returned record does <B>not</B> contain the 
terminating record-separator.

<P>
</DL>
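
<P>The two helpers below sketch the difference between reading selected 
fields and reading the raw record text at a known record position. Both 
helper names are hypothetical. The column names <TT>user</TT> and
<TT>shell</TT> are those of the passwd table defined in
<A HREF="#sec:example">section 4</A>.

<P>
<TABLE WIDTH="90%" ALIGN=center BORDER=6 BGCOLOR="#e0e0e0"><TR><TD NOWRAP>
<PRE>

    % Hypothetical helper: fetch only the user and shell columns of
    % the record at Start.
    user_shell_at(Handle, Start, User, Shell) :-
            read_table_fields(Handle, Start, _Next,
                              [user(User), shell(Shell)]).

    % Hypothetical helper: fetch the unparsed text of the record at
    % Start, without the record separator.
    raw_record_at(Handle, Start, Text) :-
            read_table_record_data(Handle, Start, _Next, Text).
    </PRE>
</TABLE>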

<H4><A NAME="sec:2.2.3">2.2.3 Searching the table</A></H4>

<DL>

<P>
<DT><A NAME="in_table/3"><STRONG>in_table</STRONG>(<VAR>+Handle, 
?Fields, -RecordPos</VAR>)</A><DD>
Searches the table for records matching <VAR>Fields</VAR>. If a match is 
found, the unbound fields in <VAR>Fields</VAR> (see below) are unified 
with the corresponding field value, and <VAR>RecordPos</VAR> is unified 
with the position of the record. This position may be used in a 
subsequent call to <A NAME="idx:readtablerecord4:11"></A><A HREF="#read_table_record/4">read_table_record/4</A> 
or
<A NAME="idx:readtablefields4:12"></A><A HREF="#read_table_fields/4">read_table_fields/4</A>.

<P><VAR>Fields</VAR> is a list of field specifiers. Each specifier is of 
the format:
<BLOCKQUOTE>
<VAR>FieldName</VAR>(<VAR>Value</VAR> [, <VAR>Options</VAR>])
</BLOCKQUOTE>

<P><VAR>Options</VAR> is a list of options to specify the search. By 
default, the package will search for an exact match, possibly using the 
ordering table associated with the field (see the <TT>sorted</TT> option in <A NAME="idx:newtable4:13"></A><A HREF="#new_table/4">new_table/4</A>). 
Options are:

<P>
<CENTER>
<TABLE BORDER=2 FRAME=border RULES=group>
<TR VALIGN=top><TD><TT>prefix</TT><TD>Uses prefix search with the 
default table. </TR>
<TR VALIGN=top><TD><TT>prefix(<VAR>Table</VAR>)</TT><TD>Uses prefix 
search with the specified ordering table. </TR>
<TR VALIGN=top><TD><TT>substring</TT><TD>Searches for a substring in the 
field. This requires linear search of the table. </TR>
<TR VALIGN=top><TD><TT>substring(<VAR>Table</VAR>)</TT><TD>Searches for 
a substring, using the table information for determining the equivalence 
of characters. </TR>
<TR VALIGN=top><TD><TT>=</TT><TD>Default equivalence. </TR>
<TR VALIGN=top><TD><TT>=(<VAR>Table</VAR>)</TT><TD>Equivalence using the 
given table. </TR>
</TABLE>

</CENTER>

<P>If <VAR>Value</VAR> is unbound (i.e. a variable), the field is 
considered not specified and any option list is ignored. If a 
match is found on the remaining fields, the variable is unified with the 
value found in the field.

<P>First, the system checks whether there is an ordered field that is 
specified. In this case, binary search is employed to find the matching 
record(s). Otherwise, linear search is used.

<P>If the match contains a specified field that has the property
<TT>unique</TT> set (see <A NAME="idx:newtable4:14"></A><A HREF="#new_table/4">new_table/4</A>), <A NAME="idx:intable3:15"></A><A HREF="#in_table/3">in_table/3</A> 
succeeds deterministically. Otherwise it will create a backtrack-point 
and backtracking will yield further solutions to the query.
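
<P>As a hedged illustration of the search options, the query below 
follows the specifier format
<VAR>FieldName</VAR>(<VAR>Value</VAR> [, <VAR>Options</VAR>]) given 
above to run a prefix search. It assumes that <VAR>Handle</VAR> 
describes a hypothetical table with columns <TT>word</TT> and
<TT>freq</TT>. Since the search value itself is bound, the matching 
word is fetched with a separate call to
<A HREF="#read_table_fields/4">read_table_fields/4</A>.

<P>
<TABLE WIDTH="90%" ALIGN=center BORDER=6 BGCOLOR="#e0e0e0"><TR><TD NOWRAP>
<PRE>

    % Enumerate, on backtracking, the records whose word column
    % starts with "ab" and read the complete word and its frequency.
    ?- in_table(Handle, [word(ab, [prefix])], Pos),
       read_table_fields(Handle, Pos, _Next, [word(Word), freq(Freq)]).
    </PRE>
</TABLE>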

<P><A NAME="idx:intable3:16"></A><A HREF="#in_table/3">in_table/3</A> 
can conveniently be used to bind the table transparently to a predicate. 
For example, suppose we have a file with lines of the following format.<A NAME=back-to-note-1 HREF="index.html#note-1"> (1)</A>

<P>
<TABLE WIDTH="90%" ALIGN=center BORDER=6 BGCOLOR="#e0e0e0"><TR><TD NOWRAP>
<PRE>

    C1C2,Full Name
    </PRE>
</TABLE>

<P><VAR>C1C2</VAR> is a two-character identifier used in the other 
tables, and <VAR>FullName</VAR> is the description of the identifier. We 
want to have a predicate identifier_name(?Id, ?FullName) to reflect this 
table. The code below does the trick:

<P>
<TABLE WIDTH="90%" ALIGN=center BORDER=6 BGCOLOR="#e0e0e0"><TR><TD NOWRAP>
<PRE>

    :- dynamic stored_idtable_handle/1.


    idtable(Handle) :-
            stored_idtable_handle(Handle), !.   % reuse the cached handle
    idtable(Handle) :-
            new_table('rootdisp.dat',
                      [ id(atom, [downcase, sorted, unique]),
                        name(atom)
                      ],
                      [ field_separator(0',)
                      ], Handle),
            asserta(stored_idtable_handle(Handle)).

    identifier_name(Id, Name) :-
            idtable(Handle),
            in_table(Handle, [id(Id), name(Name)], _).
    </PRE>
</TABLE>

<P>
</DL>

<H4><A NAME="sec:2.2.4">2.2.4 Miscellaneous</A></H4>

<DL>

<P>
<DT><A NAME="table_version/2"><STRONG>table_version</STRONG>(<VAR>-Version, 
-CompileDate</VAR>)</A><DD>
Unify <VAR>Version</VAR> with an atom identifying the version of this 
package, and <VAR>CompileDate</VAR> with the date this package was 
compiled.
</DL>

<H2><A NAME="sec:3">3 Flexible ordering and equivalence based on 
character table</A></H2>

<P>This package was developed as part of the <font size=-1>GRASP</font> 
project, where it is used for browsing lexical and ontology information, 
which is normally stored using `dictionary' order, rather than the more 
conventional alphabetical ordering based on character codes. To achieve 
programmable ordering, the table package defines `order tables'. An 
order table is a table with the cardinality of the size of the character 
set (256 for extended <font size=-1>ASCII</font>), and maps each 
character onto its `order number', and some characters onto special 
codes.

<P>The default (<TT>exact</TT>) table maps all character codes onto 
themselves. The default <TT>case_insensitive</TT> table maps all 
uppercase characters onto their corresponding lowercase character. The 
tables <TT>iso_latin_1</TT> and <TT>iso_latin_1_case_insensitive</TT> 
map the ISO-Latin-1 letters with diacritics onto their plain 
counterparts.

<P>To support dictionary ordering, the following special categories are 
defined:

<P>
<CENTER>
<TABLE BORDER=2 FRAME=border RULES=group>
<TR VALIGN=top><TD>ignore<TD>Characters of the ignore set are simply 
discarded from the input. </TR>
<TR VALIGN=top><TD>break<TD>Characters from the break set are treated as 
word-breaks, and each non-empty sequence of them is considered equal. A 
word break precedes a normal character. </TR>
<TR VALIGN=top><TD>tag<TD>Characters of type tag indicate the start of a 
`tag' that should not be considered in ordering, unless both strings are 
the same up to the tag. </TR>
</TABLE>

</CENTER>

<P>The following predicates are defined to manage and use these tables:

<DL>

<P>
<DT><A NAME="new_order_table/2"><STRONG>new_order_table</STRONG>(<VAR>+Name, 
+Options</VAR>)</A><DD>
Create a new order table with the given name (an atom), or replace the 
existing table of that name. <VAR>Options</VAR> 
is a list of options:

<P>
<CENTER>
<TABLE BORDER=2 FRAME=border RULES=group>
<TR VALIGN=top><TD><TT>case_insensitive</TT><TD>Map all upper- to 
lowercase characters. </TR>
<TR VALIGN=top><TD><TT>iso_latin_1</TT><TD>Start with an ISO-Latin-1 
table </TR>
<TR VALIGN=top><TD><TT>iso_latin_1_case_insensitive</TT><TD>Start with a 
case-insensitive ISO-Latin-1 table </TR>
<TR VALIGN=top><TD><TT>copy(+<VAR>Table</VAR>)</TT><TD>Copy all entries 
from <VAR>Table</VAR>. </TR>
<TR VALIGN=top><TD><TT>tag(+<VAR>ListOfCodes</VAR>)</TT><TD>Add these 
characters to the set of `tag' characters. </TR>
<TR VALIGN=top><TD><TT>ignore(+<VAR>ListOfCodes</VAR>)</TT><TD>Add these 
characters to the set of `ignore' characters. </TR>
<TR VALIGN=top><TD><TT>break(+<VAR>ListOfCodes</VAR>)</TT><TD>Add these 
characters to the set of `break' characters. </TR>
<TR VALIGN=top><TD><TT>+<VAR>Code1</VAR> = +<VAR>Code2</VAR> </TT><TD>Map <VAR>Code1</VAR> 
onto <VAR>Code2</VAR>. </TR>
</TABLE>

</CENTER>

<P>
<DT><A NAME="order_table_mapping/3"><STRONG>order_table_mapping</STRONG>(<VAR>+Table, 
?From, ?To</VAR>)</A><DD>
Read the current mapping. <VAR>To</VAR> is a character code or one of 
the atoms <CODE>break</CODE>, <CODE>ignore</CODE> or <CODE>tag</CODE>.

<P>
<DT><A NAME="compare_strings/4"><STRONG>compare_strings</STRONG>(<VAR>+Table, 
+S1, +S2, -Result</VAR>)</A><DD>
Compare two strings using the named <VAR>Table</VAR>. <VAR>S1</VAR> and
<VAR>S2</VAR> may be atoms, strings or code-lists. <VAR>Result</VAR> is 
one of the atoms <CODE>&lt;</CODE>, <CODE>=</CODE> or <CODE>&gt;</CODE>.

<P>
<DT><A NAME="prefix_string/3"><STRONG>prefix_string</STRONG>(<VAR>+Table, 
+Prefix, +String</VAR>)</A><DD>
Succeeds if <VAR>Prefix</VAR> is a prefix of <VAR>String</VAR> using the 
named
<VAR>Table</VAR>.

<P>
<DT><A NAME="prefix_string/4"><STRONG>prefix_string</STRONG>(<VAR>+Table, 
+Prefix, -Rest, +String</VAR>)</A><DD>
Succeeds if <VAR>Prefix</VAR> is a prefix of <VAR>String</VAR> using the 
named
<VAR>Table</VAR>, and <VAR>Rest</VAR> is unified with the remainder of
<VAR>String</VAR> that is not matched. Please note that, because the 
order table is applied during matching, simple concatenation using <A NAME="idx:concat3:17"></A><B>concat/3</B> 
cannot be used to determine the non-matched part of the string.

<P>
<DT><A NAME="sub_string/3"><STRONG>sub_string</STRONG>(<VAR>+Table, 
+Sub, +String</VAR>)</A><DD>
Succeeds if <VAR>Sub</VAR> is a substring of <VAR>String</VAR> using the 
named <VAR>Table</VAR>.
</DL>
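
<P>The query below sketches how an order table for dictionary-style 
comparison might be defined and used. The table name <TT>dict</TT> and 
the sample strings are assumptions made for this example. No particular 
results are claimed here, as they depend on the mappings in the table.

<P>
<TABLE WIDTH="90%" ALIGN=center BORDER=6 BGCOLOR="#e0e0e0"><TR><TD NOWRAP>
<PRE>

    % dict is a hypothetical order table: case insensitive, hyphens
    % are ignored and spaces act as word breaks.
    ?- new_order_table(dict,
                       [ case_insensitive,
                         ignore([0'-]),
                         break([32])          % ASCII 32: space
                       ]),
       compare_strings(dict, 'Co-op', coop, Order),
       prefix_string(dict, new, Rest, 'New York').
    </PRE>
</TABLE>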

<H2><A NAME="sec:4">4 Example: accessing the Unix passwd file</A></H2>

<A NAME="sec:example"></A>

<P>The Unix passwd file is a file with records spanning a single line 
each. The fields are separated by a single `:' character. Here is an 
example of a line:

<P><font size=-1>
<TABLE WIDTH="90%" ALIGN=center BORDER=6 BGCOLOR="#e0e0e0"><TR><TD NOWRAP>
<PRE>

joe:hgdu3r3bce:53:100:Joe Johnson:/users/joe:/bin/bash
</PRE>
</TABLE>

<P></font>

<P>The following call defines a table for it:

<P>
<TABLE WIDTH="90%" ALIGN=center BORDER=6 BGCOLOR="#e0e0e0"><TR><TD NOWRAP>
<PRE>

?- new_table('/etc/passwd',
             [ user(atom),
               passwd(code_list),
               uid(integer),
               gid(integer),
               gecos(code_list),
               homedir(atom),
               shell(atom)
             ],
             [ field_separator(0':)
             ],
             H).
</PRE>
</TABLE>

<P>To find all users in group <VAR>100</VAR>, use:

<P>
<TABLE WIDTH="90%" ALIGN=center BORDER=6 BGCOLOR="#e0e0e0"><TR><TD NOWRAP>
<PRE>

?- findall(User, in_table(H, [user(User), gid(100)], _), Users).
</PRE>
</TABLE>
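
<P>In a program, the table definition can be wrapped in a predicate that 
caches the handle, so that lookups do not have to pass it around. The 
following is a sketch along those lines. <TT>passwd_table/1</TT>,
<TT>passwd_handle/1</TT> and <TT>user_home/2</TT> are hypothetical names 
that are not part of the package.

<P>
<TABLE WIDTH="90%" ALIGN=center BORDER=6 BGCOLOR="#e0e0e0"><TR><TD NOWRAP>
<PRE>

:- dynamic passwd_handle/1.

% passwd_table(-Handle): create the table description once and
% reuse the stored handle on later calls.
passwd_table(Handle) :-
        passwd_handle(Handle), !.
passwd_table(Handle) :-
        new_table('/etc/passwd',
                  [ user(atom),
                    passwd(code_list),
                    uid(integer),
                    gid(integer),
                    gecos(code_list),
                    homedir(atom),
                    shell(atom)
                  ],
                  [ field_separator(0':)
                  ],
                  Handle),
        asserta(passwd_handle(Handle)).

% user_home(?User, ?Home): relate a user name to its home directory.
user_home(User, Home) :-
        passwd_table(Handle),
        in_table(Handle, [user(User), homedir(Home)], _).
</PRE>
</TABLE>

<P>The cut in the first clause keeps backtracking from falling through 
to the second clause and creating a second table description.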

<H1><A NAME="document-notes">Footnotes</A></H1>

<DL>
<DT><A NAME=note-1 HREF="index.html#back-to-note-1">note-1</A><DD>
This is the <TT>disproot.dat</TT> table from the <font size=-1>AAT</font> 
database used in <font size=-1>GRASP</font>.
</DL>

</BODY></HTML>