1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613
|
<HTML>
<HEAD>
<TITLE>Managing external tables for SWI-Prolog</TITLE>
</HEAD>
<BODY BGCOLOR="white">
<BLOCKQUOTE>
<BLOCKQUOTE>
<BLOCKQUOTE>
<BLOCKQUOTE>
<CENTER>
<H1>Managing external tables for SWI-Prolog</H1>
</CENTER>
<HR>
<CENTER>
<I>Jan Wielemaker <BR>
SWI, <BR>
University of Amsterdam <BR>
The Netherlands <BR>
E-mail: <A HREF="mailto:jan@swi.psy.uva.nl">jan@swi.psy.uva.nl</A></I>
</CENTER>
<HR>
</BLOCKQUOTE>
</BLOCKQUOTE>
</BLOCKQUOTE>
</BLOCKQUOTE>
<CENTER><H3>Abstract</H3></Center>
<TABLE WIDTH="90%" ALIGN=center BORDER=2 BGCOLOR="#f0f0f0"><TR><TD>
This document describes a foreign language extension to
<A HREF="http://www.swi.psy.uva.nl/projects/SWI-Prolog">SWI-Prolog</A>
for the manipulation of `external tables'. External tables are files
using a textual representation of records separated into fields. The
package allows for a flexible definition of the format of the file in
terms of records and fields, how the information in the file should be
mapped onto Prolog data types and what properties the file has to
improve the performance of lookup.
<P>The table package has been used successfully to deal with large
static databases such as dictionaries. Compared to loading the tables
into the Prolog database, this approach required much less memory and
loads much faster while providing reasonable lookup-performance on
sorted tables.
<P>This package uses read-only `mapping' of the database file into
memory and is ported to Win32 (Windows 95 and NT) as well as Unix
systems providing the mmap() system call (Solaris, SunOs, Linux and many
more modern Unices).
</TABLE>
<H1><A NAME="document-contents">Table of Contents</A></H1>
<UL>
<UL>
<LI><A HREF="#sec:1"><B>1 Introduction</B></A>
<LI><A HREF="#sec:2"><B>2 Managing external tables</B></A>
<UL>
<LI><A HREF="#sec:2.1">2.1 Creating and destroying tables</A>
<LI><A HREF="#sec:2.2">2.2 Accessing a table</A>
<UL>
<LI><A HREF="#sec:2.2.1">2.2.1 Finding record locations in a table</A>
<LI><A HREF="#sec:2.2.2">2.2.2 Reading records</A>
<LI><A HREF="#sec:2.2.3">2.2.3 Searching the table</A>
<LI><A HREF="#sec:2.2.4">2.2.4 Miscellaneous</A>
</UL>
</UL>
<LI><A HREF="#sec:3"><B>3 Flexible ordering and equivalence based on
character table</B></A>
<LI><A HREF="#sec:4"><B>4 Example: accessing the Unix passwd file</B></A>
</UL>
</UL>
<H2><A NAME="sec:1">1 Introduction</A></H2>
<P>Prolog programs sometimes need access to large sets of background
data. For example in the <font size=-1>GRASP</font> project we need
access to ontologies of art objects, a large lexicon and translation
dictionaries. Storage of such information as Prolog clauses is not
sufficiently efficient in terms of the memory requirements.
<P>The table package outlined in this document allows for easy access of
large structured files. The package uses binary search if possible and
linear search for queries that cannot use more efficient algorithms
without building additional index tables. Caching is achieved using the
file-to-memory maps supported by many modern operating systems.
<P>The following sections define the interface predicates for the
package.
<A HREF="#sec:example">Section 4</A> provides an example to access the
Unix password file.
<H2><A NAME="sec:2">2 Managing external tables</A></H2>
<H3><A NAME="sec:2.1">2.1 Creating and destroying tables</A></H3>
<P>This section describes the predicates required for creating and
destroying the access to external database tables.
<DL>
<P>
<DT><A NAME="new_table/4"><STRONG>new_table</STRONG>(<VAR>+File,
+Columns, +Options, -Handle</VAR>)</A><DD>
Create a description of a new table, stored in <VAR>File</VAR>. <VAR>Columns</VAR>
is a list of descriptions for each column. A column description is of
the form
<BLOCKQUOTE>
<VAR>ColumnName</VAR><TT>(</TT><VAR>Type [, ColumnOptions]</VAR><TT>)</TT>
</BLOCKQUOTE>
<P><VAR>Type</VAR> denotes the Prolog type to which the field should be
converted and is one of:
<P>
<CENTER>
<TABLE BORDER=2 FRAME=border RULES=group>
<TR VALIGN=top><TD><TT>integer</TT><TD>Convert to a Prolog integer. The
input is treated as a decimal number. </TR>
<TR VALIGN=top><TD><TT>float</TT><TD>Convert to a Prolog floating point
number. The input is handled by the C-library function
<TT>strtod()</TT>. </TR>
<TR VALIGN=top><TD><TT>atom</TT><TD>Convert to a Prolog atom. </TR>
<TR VALIGN=top><TD><TT>string</TT><TD>Convert to a SWI-Prolog string
object. </TR>
<TR VALIGN=top><TD><TT>code_list</TT><TD>Convert to a list of <font size=-1>ASCII</font>
codes. </TR>
</TABLE>
</CENTER>
<P><VAR>ColumnOptions</VAR> is a list of additional properties of the
column. Supported values are:
<P>
<CENTER>
<TABLE BORDER=2 FRAME=border RULES=group>
<TR VALIGN=top><TD><TT>sorted</TT><TD>The field is strictly sorted, but
may have (adjacent) duplicate entries. If the field is textual, it
should be sorted alphabetically, otherwise it should be sorted
numerically. </TR>
<TR VALIGN=top><TD><TT>sorted(+<VAR>Table</VAR>)</TT><TD>The (textual)
field is sorted using the ordering declared by the named <EM>ordering
table</EM>. This option may be used to define reverse order,
`dictionary' order or other irregular alphabetical ordering. See
<A NAME="idx:newordertable2:1"></A><A HREF="#new_order_table/2">new_order_table/2</A>. </TR>
<TR VALIGN=top><TD><TT>unique</TT><TD>This column has distinct values
for each row in the table. </TR>
<TR VALIGN=top><TD><TT>downcase</TT><TD>Map all uppercase in the field
to lowercase before converting to a Prolog atom, string or code_list. </TR>
<TR VALIGN=top><TD><TT>map_space_to_underscore</TT><TD>Map spaces to
underscores before converting to a Prolog atom, string or code_list. </TR>
<TR VALIGN=top><TD><TT>syntax</TT><TD>For numerical fields. If the field
does not contain a valid number, matching the value fails. Reading the
value returns the value as an atom. </TR>
<TR VALIGN=top><TD><TT>width(+<VAR>Chars</VAR>)</TT><TD>Field has fixed
width of the specified number of characters. The column-separator is not
considered for this column. </TR>
<TR VALIGN=top><TD><TT>arg(+<VAR>Index</VAR>)</TT><TD>For <A NAME="idx:readtablerecord4:2"></A><A HREF="#read_table_record/4">read_table_record/4</A>,
unify the field with the given argument of the record term. Further
fields will be assigned index+1, ... . </TR>
<TR VALIGN=top><TD><TT>skip</TT><TD>Don't convert this field to Prolog.
The field is simply skipped without checking for consistency. </TR>
</TABLE>
</CENTER>
<P>The <VAR>Options</VAR> argument is a list of global options for the
table. Defined options are:
<P>
<CENTER>
<TABLE BORDER=2 FRAME=border RULES=group>
<TR VALIGN=top><TD><TT>record_separator(+<VAR>Code</VAR>)</TT><TD>Character
(<font size=-1>ASCII</font>) value of the character separating two
records. Default is the newline (<font size=-1>ASCII</font> 10). </TR>
<TR VALIGN=top><TD><TT>field_separator(+<VAR>Code</VAR>)</TT><TD>Character
(<font size=-1>ASCII</font>) value of the character separating two
fields in a record. Default is the space (<font size=-1>ASCII</font>
32), which also has a special meaning. Two fields separated by a space
may be separated by any non-empty sequence of spaces and tab (<font size=-1>ASCII</font>
9) characters. For all other separators, a single character separates
the fields. </TR>
<TR VALIGN=top><TD><TT>escape(+<VAR>Code</VAR>, +<VAR>ListOfMap</VAR>)</TT><TD>Sometimes,
a table defines escape sequences to make it possible to use the
separator-characters in text-fields. This options provides a simple way
to handle some standard cases. <VAR>Code</VAR> is the <font size=-1>ASCII</font>
code of the character that leads the escape sequence. The default is
<TT>-1</TT>, and thus never matched.
<VAR>ListOfMap</VAR> is a list of
<VAR>From</VAR><TT> = </TT><VAR>To</VAR> character mappings. The default
map table is the identity map, unless <VAR>Code</VAR> refers to the
<CODE>\</CODE> character, in which case
<CODE>\b</CODE>, <CODE>\e</CODE>, <CODE>\n</CODE>, <CODE>\r</CODE> and <CODE>\t</CODE>
have their usual meaning. </TR>
<TR VALIGN=top><TD><TT>functor(<VAR>+Head</VAR>)</TT><TD>Functor used by <A NAME="idx:readtablerecord4:3"></A><A HREF="#read_table_record/4">read_table_record/4</A>.
Default is <TT>record</TT> using the maximal argument index of the
fields as arity. </TR>
</TABLE>
</CENTER>
<P>If the options are parsed successfully, <VAR>Handle</VAR> is unified
with a term that may be used as a handle to the table for future
operations on it. Note that <A NAME="idx:newtable4:4"></A><A HREF="#new_table/4">new_table/4</A>
does not access the file system, so its success only indicates the
description could be parsed, not the presence, access or format of the
file.
<P>
<DT><A NAME="open_table/1"><STRONG>open_table</STRONG>(<VAR>+Handle</VAR>)</A><DD>
Open the table. This predicate normally does not need to be called
explicitely, as all operations on the table handle will automatically
open the table if this is required. It fails if the file cannot be
accessed or some other error with the required operating-system
resources occurs. The contents of the file is not examined by this
predicate.
<P>
<DT><A NAME="close_table/1"><STRONG>close_table</STRONG>(<VAR>+Handle</VAR>)</A><DD>
Close the file and other system resources, but do not remove the
description of the table, so it can be re-opened later.
<P>
<DT><A NAME="free_table/1"><STRONG>free_table</STRONG>(<VAR>+Handle</VAR>)</A><DD>
Close and remove the handle. After this operation, <VAR>Handle</VAR>
becomes invalid and further references to it causes undefined behaviour.
<P>
</DL>
<H3><A NAME="sec:2.2">2.2 Accessing a table</A></H3>
<P>This section describes the predicates to read data from a table.
<H4><A NAME="sec:2.2.1">2.2.1 Finding record locations in a table</A></H4>
<P>Records are addressed by their offset in the table (file). As records
have generally non-fixed length, searching is often required. The
predicates below allow for finding records in the file.
<DL>
<P>
<DT><A NAME="get_table_attribute/3"><STRONG>get_table_attribute</STRONG>(<VAR>+Handle,
+Attribute, -Value</VAR>)</A><DD>
Fetch attributes of the table. Defined attributes:
<P>
<TABLE BORDER=0 FRAME=void RULES=group>
<TR VALIGN=top><TD><TT>file</TT><TD>Unify value with the name of the
file with which the table is associated. </TR>
<TR VALIGN=top><TD><TT>field(<VAR>N</VAR>)</TT><TD>Unify value with
declaration of n-th (1-based) field. </TR>
<TR VALIGN=top><TD><TT>field_separator</TT><TD>Unify value with the
field separator character. </TR>
<TR VALIGN=top><TD><TT>record_separator</TT><TD>Unify value with the
record separator character. </TR>
<TR VALIGN=top><TD><TT>key_field</TT><TD>Unify value with the 1-based
index of the field that is sorted or fails if the table contains no
sorted fields. </TR>
<TR VALIGN=top><TD><TT>field_count</TT><TD>Unify value with the total
number of columns in the table. </TR>
<TR VALIGN=top><TD><TT>size</TT><TD>Unify value with the number of
characters in the table-file, <B>not</B> the number of records. </TR>
<TR VALIGN=top><TD><TT>window</TT><TD>Unify value with a term <VAR>Start</VAR><TT>
- </TT><VAR>Size</VAR>, indicating the properties of the current window. </TR>
</TABLE>
<P>
<DT><A NAME="table_window/3"><STRONG>table_window</STRONG>(<VAR>+Handle,
+Start, +Size</VAR>)</A><DD>
If only part of the file represents the table, this call may be used to
define a window on the file. <VAR>Start</VAR> defines the start of the
window relative to the start of the file. <VAR>Size</VAR> is the size in
characters. Skipping a header is one of the possible purposes for this
call.
<P>
<DT><A NAME="table_start_of_record/4"><STRONG>table_start_of_record</STRONG>(<VAR>+Handle,
+From, +To, -Start</VAR>)</A><DD>
Enumerates (on backtracking) the start of records in the table in the
region [From, To). Together with <A NAME="idx:readtablerecord4:5"></A><A HREF="#read_table_record/4">read_table_record/4</A>,
this may be used to read the table's data.
<P>
<DT><A NAME="table_previous_record/3"><STRONG>table_previous_record</STRONG>(<VAR>+Handle,
+Here, -Previous</VAR>)</A><DD>
If <VAR>Here</VAR> is the start of a record, find the start of the
record before it. If <VAR>Here</VAR> points at an arbitrary location in
a record, the start of this record will be returned.
</DL>
<H4><A NAME="sec:2.2.2">2.2.2 Reading records</A></H4>
<P>There are two predicates for reading records. The <A NAME="idx:readtablerecord4:6"></A><A HREF="#read_table_record/4">read_table_record/4</A>
reads an entire record, while <A NAME="idx:readtablefields4:7"></A><A HREF="#read_table_fields/4">read_table_fields/4</A>
reads one or more fields from a record.
<DL>
<P>
<DT><A NAME="read_table_record/4"><STRONG>read_table_record</STRONG>(<VAR>+Handle,
+Start, -Next, -Record</VAR>)</A><DD>
Read a record from the table. <VAR>Handle</VAR> is a handle as returned
by <A NAME="idx:newtable4:8"></A><A HREF="#new_table/4">new_table/4</A>. <VAR>Start</VAR>
is the location of a record. If <VAR>Start</VAR> does not point to the
start of a record, this predicate searches backwards for the starting
position. <VAR>Record</VAR> is unified with a term constructed from the <VAR>functor</VAR>
associated with the table (default name <TT>record</TT> and arity the
number of not-skipped columns), each of the arguments containing the
converted data. An error is raised if the data could not be converted. <VAR>Next</VAR>
is unified with the start position for the next record.
<P>
<DT><A NAME="read_table_fields/4"><STRONG>read_table_fields</STRONG>(<VAR>+Handle,
+Start, -Next, -Fields</VAR>)</A><DD>
As <A NAME="idx:readtablerecord4:9"></A><A HREF="#read_table_record/4">read_table_record/4</A>,
but <VAR>Fields</VAR> is a list of terms
<VAR>+Name</VAR>(-<VAR>Value</VAR>), and the <VAR>Values</VAR> will be
unified with the values of the specified field.
<P>
<DT><A NAME="read_table_record_data/4"><STRONG>read_table_record_data</STRONG>(<VAR>+Handle,
+Start, -Next, -Record</VAR>)</A><DD>
Similar to <A NAME="idx:readtablerecord4:10"></A><A HREF="#read_table_record/4">read_table_record/4</A>,
but unifies record with a Prolog string containing the data of the
record unparsed. The returned record does <B>not</B> contain the
terminating record-separator.
<P>
</DL>
<H4><A NAME="sec:2.2.3">2.2.3 Searching the table</A></H4>
<DL>
<P>
<DT><A NAME="in_table/3"><STRONG>in_table</STRONG>(<VAR>+Handle,
?Fields, -RecordPos</VAR>)</A><DD>
Searches the table for records matching <VAR>Fields</VAR>. If a match is
found, the variable (see below) fields in <VAR>Fields</VAR> are unified
with the corresponding field value, and <VAR>RecordPos</VAR> is unified
with the position of the record. The latter handle may be used in a
subsequent call to <A NAME="idx:readtablerecord4:11"></A><A HREF="#read_table_record/4">read_table_record/4</A>
or
<A NAME="idx:readtablefields4:12"></A><A HREF="#read_table_fields/4">read_table_fields/4</A>.
<P><VAR>Fields</VAR> is a list of field specifiers. Each specifier is of
the format:
<BLOCKQUOTE>
<VAR>FieldName</VAR>(<VAR>Value</VAR> [, <VAR>Options</VAR>])
</BLOCKQUOTE>
<P><VAR>Options</VAR> is a list of options to specify the search. By
default, the package will search for an exact match, possibly using the
ordering table associated with the field (see <TT>order</TT> option in <A NAME="idx:newtable4:13"></A><A HREF="#new_table/4">new_table/4</A>).
Options are:
<P>
<CENTER>
<TABLE BORDER=2 FRAME=border RULES=group>
<TR VALIGN=top><TD><TT>prefix</TT><TD>Uses prefix search with the
default table. </TR>
<TR VALIGN=top><TD><TT>prefix(<VAR>Table</VAR>)</TT><TD>Uses prefix
search with the specified ordering table. </TR>
<TR VALIGN=top><TD><TT>substring</TT><TD>Searches for a substring in the
field. This requires linear search of the table. </TR>
<TR VALIGN=top><TD><TT>substring(<VAR>Table</VAR>)</TT><TD>Searches for
a substring, using the table information for determining the equivalence
of characters. </TR>
<TR VALIGN=top><TD><TT>=</TT><TD>Default equivalence. </TR>
<TR VALIGN=top><TD><TT>=(<VAR>Table</VAR>)</TT><TD>Equivalence using the
given table. </TR>
</TABLE>
</CENTER>
<P>If <VAR>Value</VAR> is unbound (i.e. a variable), the record is
considered not specified. The possible option list is ignored. If a
match is found on the remaining fields, the variable is unified with the
value found in the field.
<P>First, the system checks whether there is an ordered field that is
specified. In this case, binary search is employed to find the matching
record(s). Otherwise, linear search is used.
<P>If the match contains a specified field that has the property
<TT>unique</TT> set (see <A NAME="idx:newtable4:14"></A><A HREF="#new_table/4">new_table/4</A>), <A NAME="idx:intable3:15"></A><A HREF="#in_table/3">in_table/3</A>
succeeds deterministically. Otherwise it will create a backtrack-point
and backtracking will yield further solutions to the query.
<P><A NAME="idx:intable3:16"></A><A HREF="#in_table/3">in_table/3</A>
may be comfortable used to bind the table transparently to a predicate.
For example, we have a file with lines of the format.<A NAME=back-to-note-1 HREF="index.html#note-1"> (1)</A>
<P>
<TABLE WIDTH="90%" ALIGN=center BORDER=6 BGCOLOR="#e0e0e0"><TR><TD NOWRAP>
<PRE>
C1C2,Full Name
</PRE>
</TABLE>
<P><VAR>C1C2</VAR> is a two-character identifier used in the other
tables, and <VAR>FullName</VAR> is the description of the identifier. We
want to have a predicate identifier_name(?Id, ?FullName) to reflect this
table. The code below does the trick:
<P>
<TABLE WIDTH="90%" ALIGN=center BORDER=6 BGCOLOR="#e0e0e0"><TR><TD NOWRAP>
<PRE>
:- dynamic stored_idtable_handle/1.
idtable(Handle) :-
stored_idtable_handle(Handle).
idtable(Handle) :-
new_table('rootdisp.dat',
[ id(atom, [downcase, sorted, unique]),
name(atom),
],
[ field_separator(0',)
], Handle),
asserta(stored_idtable_handle(Handle)).
identifier_name(Id, Name) :-
idtable(Handle),
in_table(Handle, [id(Id), name(Name)], _).
</PRE>
</TABLE>
<P>
</DL>
<H4><A NAME="sec:2.2.4">2.2.4 Miscellaneous</A></H4>
<DL>
<P>
<DT><A NAME="table_version/2"><STRONG>table_version</STRONG>(<VAR>-Version,
-CompileDate</VAR>)</A><DD>
Unify <VAR>Version</VAR> with an atom identifying the version of this
package, and <VAR>CompileDate</VAR> with the date this package was
compiled.
</DL>
<H2><A NAME="sec:3">3 Flexible ordering and equivalence based on
character table</A></H2>
<P>This package was developed as part of the <font size=-1>GRASP</font>
project, where it is used for browsing lexical and ontology information,
which is normally stored using `dictionary' order, rather than the more
conventional alphabetical ordering based on character codes. To achieve
programmable ordering, the table package defines `order tables'. An
order table is a table with the cardinality of the size of the character
set (256 for extended <font size=-1>ASCII</font>), and maps each
character onto its `order number', and some characters onto special
codes.
<P>The default (<TT>exact</TT>) table matches all character codes onto
themselves. The default <TT>case_insensitive</TT> table matches all
uppercase characters onto their corresponding lowercase character. The
tables <TT>iso_latin_1</TT> and <TT>iso_latin_1_case_insensitive</TT>
map the ISO-latin-1 letters with diacritics into their plain
counterpart.
<P>To support dictionary ordering, the following special categories are
defined:
<P>
<CENTER>
<TABLE BORDER=2 FRAME=border RULES=group>
<TR VALIGN=top><TD>ignore<TD>Characters of the ignore set are simple
discarded from the input. </TR>
<TR VALIGN=top><TD>break<TD>Characters from the break set are treated as
word-breaks, and each non-empty sequence of them is considered equal. A
word break precedes a normal character. </TR>
<TR VALIGN=top><TD>tag<TD>Characters of type tag indicate the start of a
`tag' that should not be considered in ordering, unless both strings are
the same upto the tag. </TR>
</TABLE>
</CENTER>
<P>The following predicates are defined to manage and use these tables:
<DL>
<P>
<DT><A NAME="new_order_table/2"><STRONG>new_order_table</STRONG>(<VAR>+Name,
+Options</VAR>)</A><DD>
Create a new, or replace the order-table with the given name (an atom). <VAR>Options</VAR>
is a list of options:
<P>
<CENTER>
<TABLE BORDER=2 FRAME=border RULES=group>
<TR VALIGN=top><TD><TT>case_insensitive</TT><TD>Map all upper- to
lowercase characters. </TR>
<TR VALIGN=top><TD><TT>iso_latin_1</TT><TD>Start with an ISO-Latin-1
table </TR>
<TR VALIGN=top><TD><TT>iso_latin_1_case_insensitive</TT><TD>Start with a
case-insensitive ISO-Latin-1 table </TR>
<TR VALIGN=top><TD><TT>copy(+<VAR>Table</VAR>)</TT><TD>Copy all entries
from <VAR>Table</VAR>. </TR>
<TR VALIGN=top><TD><TT>tag(+<VAR>ListOfCodes</VAR>)</TT><TD>Add these
characters to the set of `tag' characters. </TR>
<TR VALIGN=top><TD><TT>ignore(+<VAR>ListOfCodes</VAR>)</TT><TD>Add these
characters to the set of `ignore' characters. </TR>
<TR VALIGN=top><TD><TT>break(+<VAR>ListOfCodes</VAR>)</TT><TD>Add these
characters to the set of `break' characters. </TR>
<TR VALIGN=top><TD><TT>+<VAR>Code1</VAR> = +<VAR>Code2</VAR> </TT><TD>Map <VAR>Code1</VAR>
onto <VAR>Code2</VAR>. </TR>
</TABLE>
</CENTER>
<P>
<DT><A NAME="order_table_mapping/3"><STRONG>order_table_mapping</STRONG>(<VAR>+Table,
?From, ?To</VAR>)</A><DD>
Read the current mapping. <VAR>To</VAR> is a character code or one of
the atoms <CODE>break</CODE>, <CODE>ignore</CODE> or <CODE>tag</CODE>.
<P>
<DT><A NAME="compare_strings/4"><STRONG>compare_strings</STRONG>(<VAR>+Table,
+S1, +S2, -Result</VAR>)</A><DD>
Compare two strings using the named <VAR>Table</VAR>. <VAR>S1</VAR> and
<VAR>S2</VAR> may be atoms, strings or code-lists. <VAR>Result</VAR> is
one of the atoms <CODE><</CODE>, <CODE>=</CODE> or <CODE>></CODE>.
<P>
<DT><A NAME="prefix_string/3"><STRONG>prefix_string</STRONG>(<VAR>+Table,
+Prefix, +String</VAR>)</A><DD>
Succeeds if <VAR>Prefix</VAR> is a prefix of <VAR>String</VAR> using the
named
<VAR>Table</VAR>.
<P>
<DT><A NAME="prefix_string/4"><STRONG>prefix_string</STRONG>(<VAR>+Table,
+Prefix, -Rest, +String</VAR>)</A><DD>
Succeeds if <VAR>Prefix</VAR> is a prefix of <VAR>String</VAR> using the
named
<VAR>Table</VAR>, and <VAR>Rest</VAR> is unified with the remainder of
<VAR>String</VAR> that is not matched. Please note that the existence of
an order-table implies simple contatenation using <A NAME="idx:concat3:17"></A><B>concat/3</B>
cannot be used to determine the non-matched part of the string.
<P>
<DT><A NAME="sub_string/3"><STRONG>sub_string</STRONG>(<VAR>+Table,
+Sub, +String</VAR>)</A><DD>
Succeeds if <VAR>Sub</VAR> is a substring of <VAR>String</VAR> using the
named <VAR>Table</VAR>.
</DL>
<H2><A NAME="sec:4">4 Example: accessing the Unix passwd file</A></H2>
<A NAME="sec:example"></A>
<P>The Unix passwd file is a file with records spanning a single line
each. The fields are separated by a single `:' character. Here is an
example of a line:
<P><font size=-1>
<TABLE WIDTH="90%" ALIGN=center BORDER=6 BGCOLOR="#e0e0e0"><TR><TD NOWRAP>
<PRE>
joe:hgdu3r3bce:53:100:Joe Johnson:/users/joe:/bin/bash
</PRE>
</TABLE>
<P></font>
<P>The following call defines a table for it:
<P>
<TABLE WIDTH="90%" ALIGN=center BORDER=6 BGCOLOR="#e0e0e0"><TR><TD NOWRAP>
<PRE>
?- new_table('/etc/passwd',
[ user(atom),
passwd(code_list),
uid(integer),
gid(integer),
gecos(code_list),
homedir(atom),
shell(atom)
],
[ field_separator(0':)
],
H).
</PRE>
</TABLE>
<P>To find all people of group <VAR>100</VAR>, use:
<P>
<TABLE WIDTH="90%" ALIGN=center BORDER=6 BGCOLOR="#e0e0e0"><TR><TD NOWRAP>
<PRE>
?- findall(User, in_table(H, [user(User), gid(100)], _), Users).
</PRE>
</TABLE>
<H1><A NAME="document-notes">Footnotes</A></H1>
<DL>
<DT><A NAME=note-1 HREF="index.html#back-to-note-1">note-1</A><DD>
This is the <TT>disproot.dat</TT> table from the <font size=-1>AAT</font>
database used in <font size=-1>GRASP</font>
</DL>
</BODY></HTML>
|