<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<!-- This document was generated using DocBuilder 3.3.3 -->
<HTML>
<HEAD>
<TITLE>qlc</TITLE>
<SCRIPT type="text/javascript" src="../../../../doc/erlresolvelinks.js">
</SCRIPT>
<STYLE TYPE="text/css">
<!--
.REFBODY { margin-left: 13mm }
.REFTYPES { margin-left: 8mm }
-->
</STYLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#FF00FF"
ALINK="#FF0000">
<!-- refpage -->
<CENTER>
<A HREF="http://www.erlang.se">
<IMG BORDER=0 ALT="[Ericsson AB]" SRC="min_head.gif">
</A>
<H1>qlc</H1>
</CENTER>
<H3>MODULE</H3>
<DIV CLASS=REFBODY>
qlc
</DIV>
<H3>MODULE SUMMARY</H3>
<DIV CLASS=REFBODY>
Query Interface to Mnesia, ETS, Dets, etc
</DIV>
<H3>DESCRIPTION</H3>
<DIV CLASS=REFBODY>
<P>The <CODE>qlc</CODE> module provides a query interface to Mnesia, ETS,
Dets and other data structures that implement an iterator style
traversal of objects.
</DIV>
<H3>Overview</H3>
<DIV CLASS=REFBODY>
<P>The <CODE>qlc</CODE> module implements a query interface to <STRONG>QLC
tables</STRONG>. Typical QLC tables are ETS, Dets, and Mnesia
tables. There is also support for user-defined tables; see the
<A HREF="#implementing_a_qlc_table">Implementing a QLC
table</A> section. A <STRONG>query</STRONG> is stated using
<STRONG>Query List Comprehensions</STRONG> (QLCs). These are similar to
ordinary list comprehensions as described in the Erlang
Reference Manual and Programming Examples except that variables
introduced in patterns cannot be used in list expressions. The
answers to a query are determined by data in QLC tables that
fulfill the constraints expressed by the QLCs of the query.
<P>QLCs should not be confused with the language construct
<CODE>query ListComprehension end</CODE> used by Mnemosyne. The
<CODE>qlc</CODE> module recognizes the first argument of every call to
<CODE>qlc:q/1,2</CODE> as QLCs, and nothing else. The semantics are
very different: Mnemosyne uses ideas borrowed from Prolog while
the QLCs introduced in this module are all Erlang. In fact, in
the absence of optimizations and options such as <CODE>cache</CODE>
and <CODE>unique</CODE> (see below), every QLC free of QLC tables
evaluates to the same list of answers as the identical ordinary
list comprehension. It is the aim of this module to replace
Mnemosyne and to be more versatile by means of QLC tables.
<P>While ordinary list comprehensions evaluate to lists, calling
<A HREF="#q">qlc:q/1,2</A> returns a <STRONG>Query
Handle</STRONG>. To obtain all the answers to a query, <A HREF="#eval">qlc:eval/1,2</A> should be called with the
query handle as first argument. Query handles are essentially
functions created in the module calling <CODE>q/1,2</CODE>. As the
functions refer to the module's code, one should be careful not
to keep query handles too long if the module's code is to be
replaced.
Code replacement is described in the <A HREF="javascript:erlhref('../../../../', 'doc/reference_manual', 'code.html');">Erlang Reference
Manual</A>. The list of answers can also be traversed in
chunks by use of a <STRONG>Query Cursor</STRONG>. Query cursors are
created by calling <A HREF="#cursor">qlc:cursor/1,2</A> with a query handle as
first argument. Query cursors are essentially Erlang processes.
One answer at a time is sent from the query cursor process to
the process that created the cursor.
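<P>As a minimal sketch of the difference between a query handle, its
evaluation, and a cursor, consider the module below. The module name
<CODE>qlc_demo</CODE> and its function names are made up for
illustration. A module that calls <CODE>qlc:q/1,2</CODE> needs to
include <CODE>qlc.hrl</CODE> so that the QLCs are recognized at
compile time.
<PRE>
-module(qlc_demo).
-include_lib("stdlib/include/qlc.hrl").
-export([all/0, first/1]).

%% Build a query handle; nothing is evaluated yet.
handle() ->
    qlc:q([X * X || X <- [1,2,3,4], X rem 2 =:= 0]).

%% Collect every answer in a list.
all() ->
    qlc:eval(handle()).

%% Fetch at most N answers through a cursor, then delete the cursor.
first(N) ->
    QC = qlc:cursor(handle()),
    Answers = qlc:next_answers(QC, N),
    ok = qlc:delete_cursor(QC),
    Answers.
</PRE>
<P>Under these assumptions <CODE>qlc_demo:all()</CODE> returns
<CODE>[4,16]</CODE> and <CODE>qlc_demo:first(1)</CODE> returns
<CODE>[4]</CODE>.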
</DIV>
<H3>Syntax</H3>
<DIV CLASS=REFBODY>
<P>Syntactically QLCs have the same parts as ordinary list
comprehensions:
<P>
<PRE>
[Expression || Qualifier1, Qualifier2, ...]
</PRE>
<P><CODE>Expression</CODE> (the <STRONG>template</STRONG>) is an arbitrary
Erlang expression. Qualifiers are either <STRONG>filters</STRONG> or
<STRONG>generators</STRONG>. Filters are Erlang expressions returning
<CODE>bool()</CODE>. Generators have the form
<CODE>Pattern<-ListExpression</CODE>, where
<CODE>ListExpression</CODE> is an expression evaluating to a query
handle or a list. Query handles are returned from
<CODE>qlc:table/2</CODE>, <CODE>qlc:append/1,2</CODE>, <CODE>qlc:sort/1,2</CODE>,
<CODE>qlc:keysort/2,3</CODE>, <CODE>qlc:q/1,2</CODE>, and
<CODE>qlc:string_to_handle/1,2,3</CODE>.
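<P>As a small illustration of these parts, the sketch below (meant for
the Erlang shell, with arbitrary data) uses the template
<CODE>X * 2</CODE>, one generator whose list expression is the query
handle returned by <CODE>qlc:sort/1</CODE>, and one filter:
<PRE>
1> <STRONG>QH = qlc:q([X * 2 || X <- qlc:sort([3,1,2]), X > 1]),</STRONG>
<STRONG>qlc:eval(QH).</STRONG>
[4,6]
</PRE>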
</DIV>
<H3>Evaluation</H3>
<DIV CLASS=REFBODY>
<P>The evaluation of a query handle begins by the inspection of
options and the collection of information about tables. As a
result qualifiers are modified during the optimization phase.
Next all list expressions are evaluated. If a cursor has been
created evaluation takes place in the cursor process. For those
list expressions that are QLCs, the list expressions of the
QLCs' generators are evaluated as well. One has to be careful if
list expressions have side effects since the order in which list
expressions are evaluated is unspecified. Finally the answers
are found by evaluating the qualifiers from left to right,
backtracking when some filter returns <CODE>false</CODE>, or
collecting the template when all filters return <CODE>true</CODE>.
<P>Filters that do not return <CODE>bool()</CODE> but fail are handled
differently depending on their syntax: if the filter is a guard
it returns <CODE>false</CODE>, otherwise the query evaluation fails.
This behavior makes it possible for QLC to do some optimizations
without affecting the meaning of a query. For example, when some
position of a table is compared to one or more constants, only
the objects with matching values are candidates for further
evaluation. The other objects are guaranteed to make the filter
return <CODE>false</CODE>, but never fail. The (small) set of
candidate objects can often be found by looking up some key
values of the table or by traversing the table using a match
specification. It is necessary to place the guard filters
immediately after the table's generator, otherwise it could
happen that some table object that would make the query
evaluation fail is excluded by looking up a key or running a
match specification.
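<P>The treatment of failing guard filters can be sketched in the
Erlang shell; the input list is arbitrary. The filter
<CODE>X + 1 > 2</CODE> is a guard, so the failing addition for the
atom <CODE>a</CODE> simply counts as <CODE>false</CODE>:
<PRE>
1> <STRONG>qlc:eval(qlc:q([X || X <- [a,1,2], X + 1 > 2])).</STRONG>
[2]
</PRE>
<P>A filter that is not a guard, for instance a call to a user-defined
function that fails for <CODE>a</CODE>, would instead make the
evaluation of the query fail.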
</DIV>
<H3>Join</H3>
<DIV CLASS=REFBODY>
<P>QLC supports fast join of two query handles. Fast join is
possible if some position (<CODE>P1</CODE>) of one query handle is
compared to or matched against some position (<CODE>P2</CODE>) of
another query handle. Two fast join methods have been
implemented:
<P>
<UL>
<LI>
Lookup join traverses all objects of one query handle and
finds objects of the other handle (a QLC table) such that the
values at <CODE>P1</CODE> and <CODE>P2</CODE> match. QLC does not create
any indices but looks up values using the key position and
the indexed positions of the QLC table.
</LI>
<LI>
Merge join sorts the objects of each query handle if
necessary and filters out objects where the values at
<CODE>P1</CODE> and <CODE>P2</CODE> do not compare equal. If there are
many objects with the same value of <CODE>P2</CODE> a temporary
file will be used for the equivalence classes.
</LI>
</UL>
<P>QLC warns at compile time if a query list comprehension
combines QLC handles in such a way that more than one join is
possible. In other words, there is no query planner that can
choose a good order between possible join operations. It is up
to the user to order the joins by introducing query handles.
<P>The join is to be expressed as a guard filter. The filter must
be placed immediately after the two joined generators, possibly
after guard filters that use variables from no other generators
but the two joined generators. QLC inspects the operands of
<CODE>=:=/2</CODE>, <CODE>==/2</CODE>, <CODE>is_record/2</CODE>, <CODE>element/2</CODE>,
and logical operators (<CODE>and/2</CODE>, <CODE>or/2</CODE>,
<CODE>andalso/2</CODE>, <CODE>orelse/2</CODE>, <CODE>xor/2</CODE>) when
determining which joins to consider.
</DIV>
<H3>Common options</H3>
<DIV CLASS=REFBODY>
<P>The following options are accepted by <CODE>cursor/2</CODE>,
<CODE>eval/2</CODE>, <CODE>fold/4</CODE>, and <CODE>info/2</CODE>:
<P>
<UL>
<LI>
<CODE>{cache_all, CacheAll}</CODE> where <CODE>CacheAll</CODE> is
equal to <CODE>ets</CODE> or <CODE>list</CODE> adds a
<CODE>{cache,CacheAll}</CODE> option to every list expression
of the query except tables and lists. Default is
<CODE>{cache_all,no}</CODE>. The option <CODE>cache_all</CODE> is
equivalent to <CODE>{cache_all,ets}</CODE>.
</LI>
<LI>
<CODE>{max_list_size, MaxListSize}</CODE> <A NAME="max_list_size"><!-- Empty --></A> where <CODE>MaxListSize</CODE> is the size in
bytes of terms in the external format. If the accumulated size
of collected objects exceeds <CODE>MaxListSize</CODE> the objects
are written onto a temporary file. This option is used by the
<CODE>{cache,list}</CODE> option as well as by the merge join
method. Default is 512*1024 bytes.
</LI>
<LI>
<CODE>{tmpdir, TempDirectory}</CODE> sets the directory used by
merge join for temporary files and by the
<CODE>{cache,list}</CODE> option. The option also overrides the
<CODE>tmpdir</CODE> option of <CODE>keysort/3</CODE> and <CODE>sort/2</CODE>.
The default value is <CODE>""</CODE> which means that the directory
returned by <CODE>file:get_cwd()</CODE> is used.
</LI>
<LI>
<CODE>{unique_all, true}</CODE> adds a
<CODE>{unique,true}</CODE> option to every list expression of
the query. Default is <CODE>{unique_all,false}</CODE>. The
option <CODE>unique_all</CODE> is equivalent to
<CODE>{unique_all,true}</CODE>.
</LI>
</UL>
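<P>A sketch of how such options are passed, assuming an Erlang shell.
The data, the size limit, and the directory <CODE>"/tmp"</CODE> are
arbitrary choices made for the example. The options are given to the
evaluating call, either as a single option or as a list of options:
<PRE>
QH = qlc:q([X || X <- qlc:append([1,2,1], [2,3])]),
%% Ask for duplicates to be removed on every level of the query.
Unique = qlc:eval(QH, unique_all),
%% Cache answers in lists; once about 64 kB has been collected,
%% temporary files in the (assumed) directory "/tmp" are used.
Cached = qlc:eval(QH, [{cache_all, list},
                       {max_list_size, 64*1024},
                       {tmpdir, "/tmp"}]),
{Unique, Cached}.
</PRE>
<P>Caching does not change the answers, so <CODE>Cached</CODE> is the
same list as the one returned by <CODE>qlc:eval(QH)</CODE>.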
</DIV>
<H3>Common data types</H3>
<DIV CLASS=REFBODY>
<P>
<UL>
<LI>
<CODE>QueryCursor = {qlc_cursor, term()}</CODE>
<BR>
</LI>
<LI>
<CODE>QueryHandle = {qlc_handle, term()}</CODE>
<BR>
</LI>
<LI>
<CODE>QueryHandleOrList = QueryHandle | list()</CODE>
<BR>
</LI>
<LI>
<CODE>Answers = [Answer]</CODE>
<BR>
</LI>
<LI>
<CODE>Answer = term()</CODE>
<BR>
</LI>
<LI>
<CODE>AbstractExpression =</CODE>
-parse trees for Erlang expressions, see the
<A HREF="javascript:erlhref('../../../../', 'erts', 'absform.html');"> abstract format</A> documentation in
ERTS User's Guide-
<BR>
</LI>
<LI>
<CODE>MatchExpression =</CODE>
-match specifications,
see the <A HREF="javascript:erlhref('../../../../', 'erts', 'match_spec.html');"> match specification</A> documentation in the
ERTS User's Guide and <A HREF="ms_transform.html"> ms_transform(3)</A>-
<BR>
</LI>
<LI>
<CODE>SpawnOptions = default | spawn_options()</CODE>
<BR>
</LI>
<LI>
<CODE>SortOptions = [SortOption] | SortOption</CODE>
<BR>
</LI>
<LI>
<CODE>SortOption = {compressed, bool()}
| {no_files, NoFiles}
| {order, Order}
| {size, Size}
| {tmpdir, TempDirectory}
| {unique, bool()}</CODE>
-see <A HREF="file_sorter.html">file_sorter(3)
</A>-
<BR>
</LI>
<LI>
<CODE>Order = ascending | descending | OrderFun</CODE>
<BR>
</LI>
<LI>
<CODE>OrderFun = fun(Term, Term) -> bool()</CODE>
<BR>
</LI>
<LI>
<CODE>TempDirectory = "" | filename()</CODE>
<BR>
</LI>
<LI>
<CODE>Size = int() > 0</CODE>
<BR>
</LI>
<LI>
<CODE>NoFiles = int() > 1</CODE>
<BR>
</LI>
<LI>
<CODE>KeyPos = int() > 0 | [int() > 0]</CODE>
<BR>
</LI>
<LI>
<CODE>MaxListSize = int() >= 0</CODE>
<BR>
</LI>
<LI>
<CODE>bool() = true | false</CODE>
<BR>
</LI>
<LI>
<CODE>Cache = ets | list | no</CODE>
<BR>
</LI>
<LI>
<CODE>filename() =</CODE>
-see <A HREF="filename.html">filename(3)
</A>-
<BR>
</LI>
<LI>
<CODE>spawn_options() =</CODE>
-see <A HREF="javascript:erlhref('../../../../', 'kernel', 'erlang.html');"> erlang(3)</A>-
<BR>
</LI>
</UL>
</DIV>
<H3>Getting started</H3>
<DIV CLASS=REFBODY>
<P><A NAME="getting_started"><!-- Empty --></A>As already mentioned queries are
stated in the list comprehension syntax as described in the
<A HREF="javascript:erlhref('../../../../', 'doc/reference_manual', 'expressions.html');">Erlang
Reference Manual</A>. In the following some familiarity
with list comprehensions is assumed. There are examples in
<A HREF="javascript:erlhref('../../../../', 'doc/programming_examples', 'list_comprehensions.html');"> Programming Examples</A> that can get you started. It
should be stressed that list comprehensions do not add any
computational power to the language; anything that can be done
with list comprehensions can also be done without them. But they
add a syntax for expressing simple search problems which is
compact and clear once you get used to it.
<P>Many list comprehension expressions can be evaluated by the
<CODE>qlc</CODE> module. Exceptions are expressions such that
variables introduced in patterns (or filters) are used in some
generator later in the list comprehension. As an example
consider an implementation of <CODE>lists:append(L)</CODE>: <CODE>[X || Y <- L,
X <- Y]</CODE>. <CODE>Y</CODE> is introduced in the first generator and used in
the second. The ordinary list comprehension is normally to be
preferred when there is a choice as to which to use. One
difference is that <CODE>qlc:eval/1,2</CODE> collects answers in a
list which is finally reversed, while list comprehensions
collect answers on the stack which is finally unwound.
<P>What the <CODE>qlc</CODE> module primarily adds to list
comprehensions is that data can be read from QLC tables in small
chunks. A QLC table is created by calling <CODE>qlc:table/2</CODE>.
Usually <CODE>qlc:table/2</CODE> is not called directly from the query
but via an interface function of some data structure. There are
a few examples of such functions in Erlang/OTP:
<CODE>mnesia:table/1,2</CODE>, <CODE>ets:table/1,2</CODE>, and
<CODE>dets:table/1,2</CODE>. For a given data structure there can be
several functions that create QLC tables, but common for all
these functions is that they return a query handle created by
<CODE>qlc:table/2</CODE>. Using the QLC tables provided by OTP is
probably sufficient in most cases, but for the more advanced
user the section <A HREF="#implementing_a_qlc_table">Implementing a QLC
table</A> describes the implementation of a function
calling <CODE>qlc:table/2</CODE>.
<P>Besides <CODE>qlc:table/2</CODE> there are other functions that
return query handles. They might not be used as often as tables,
but are useful from time to time. <CODE>qlc:append</CODE> traverses
objects from several tables or lists after each other. If, for
instance, you want to traverse all answers to a query QH and
then finish off by a term <CODE>{finished}</CODE>, you can do that by
calling <CODE>qlc:append(QH, [{finished}])</CODE>. <CODE>append</CODE> first
returns all objects of QH, then <CODE>{finished}</CODE>. If there is
one tuple <CODE>{finished}</CODE> among the answers to QH it will be
returned twice from <CODE>append</CODE>.
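<P>This can be tried in the Erlang shell. The contents of QH below are
made up for illustration:
<PRE>
1> <STRONG>QH = qlc:q([X || X <- [1,2]]),</STRONG>
<STRONG>qlc:eval(qlc:append(QH, [{finished}])).</STRONG>
[1,2,{finished}]
</PRE>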
<P>As another example, consider concatenating the answers to two
queries QH1 and QH2 while removing all duplicates. The means to
accomplish this is to use the <CODE>unique</CODE> option:
<PRE>
qlc:q([X || X <- qlc:append(QH1, QH2)], {unique, true})
</PRE>
<P>The cost is substantial: every returned answer will be stored
in an ETS table. Before returning an answer it is looked up in
the ETS table to check if it has already been returned. Without
the <CODE>unique</CODE> option all answers to QH1 would be returned
followed by all answers to QH2. The <CODE>unique</CODE> option keeps
the order between the remaining answers.
<P>If the order of the answers is not important there is the
alternative to sort the answers uniquely:
<PRE>
qlc:sort(qlc:q([X || X <- qlc:append(QH1, QH2)], {unique, true})).
</PRE>
<P>This query also removes duplicates but the answers will be
sorted. If there are many answers temporary files will be used.
Note that in order to get the first unique answer all answers
have to be found and sorted.
<P>To return just a few answers cursors can be used. The following
code returns no more than five answers using an ETS table for
storing the unique answers:
<PRE>
C = qlc:cursor(qlc:q([X || X <- qlc:append(QH1, QH2)],{unique,true})),
R = qlc:next_answers(C, 5),
ok = qlc:delete_cursor(C),
R.
</PRE>
<P>Query list comprehensions are convenient for stating
constraints on data from two or more tables. An example that
does a natural join on two query handles on position 2:
<PRE>
qlc:q([{X1,X2,X3,Y1} ||
          {X1,X2,X3} <- QH1,
          {Y1,Y2} <- QH2,
          X2 =:= Y2])
</PRE>
<P>QLC will evaluate this differently depending on the query
handles <CODE>QH1</CODE> and <CODE>QH2</CODE>. If, for example, <CODE>X2</CODE> is
matched against the key of a QLC table the lookup join method
will traverse the objects of <CODE>QH2</CODE> while looking up key
values in the table. On the other hand, if neither <CODE>X2</CODE> nor
<CODE>Y2</CODE> is matched against the key or an indexed position of a
QLC table, the merge join method will make sure that <CODE>QH1</CODE>
and <CODE>QH2</CODE> are both sorted on position 2 and next do the
join by traversing the objects one by one.
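<P>A self-contained sketch of such a join follows, assuming that QH1
is a plain list and QH2 an ETS table keyed on position 2. The module
name <CODE>qlc_join_demo</CODE>, the table name, and the data are
invented for the example. With a keyed table like this one, lookup
join is a candidate method.
<PRE>
-module(qlc_join_demo).
-include_lib("stdlib/include/qlc.hrl").
-export([run/0]).

run() ->
    %% QH2 is backed by an ETS table whose key is the second element.
    Tab = ets:new(dept, [set, {keypos, 2}]),
    true = ets:insert(Tab, [{10,a}, {20,b}]),
    QH1 = [{1,a,x}, {2,b,y}, {3,c,z}],
    QH2 = ets:table(Tab),
    QH = qlc:q([{X1,X2,X3,Y1} ||
                   {X1,X2,X3} <- QH1,
                   {Y1,Y2} <- QH2,
                   X2 =:= Y2]),
    %% Sort the answers since their order may depend on the join method.
    Answers = lists:sort(qlc:eval(QH)),
    true = ets:delete(Tab),
    Answers.
</PRE>
<P>Here <CODE>qlc_join_demo:run()</CODE> returns
<CODE>[{1,a,x,10},{2,b,y,20}]</CODE>; the object <CODE>{3,c,z}</CODE>
has no partner in the table and yields no answer.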
<P>The <CODE>join</CODE> option can be used to force QLC to use a
certain join method. For the rest of this section it is assumed
that the excessively slow join method called "nested loop" has
been chosen:
<PRE>
qlc:q([{X1,X2,X3,Y1} ||
          {X1,X2,X3} <- QH1,
          {Y1,Y2} <- QH2,
          X2 =:= Y2],
      {join, nested_loop})
</PRE>
<P>In this case the filter will be applied to every possible pair
of answers to QH1 and QH2, one at a time. If there are M answers
to QH1 and N answers to QH2 the filter will be run M*N times.
<P>If QH2 is a call to the function for <CODE>gb_trees</CODE> as defined
in the <A HREF="#implementing_a_qlc_table">Implementing
a QLC table</A> section, <CODE>gb_table:table/1</CODE>, the
iterator for the gb-tree will be initiated for each answer to
QH1 after which the objects of the gb-tree will be returned one
by one. This is probably the most efficient way of traversing
the table in that case since it takes minimal computational
power to get the following object. But if QH2 is not a table but a more
complicated QLC, it can be more efficient to use some RAM
for collecting the answers in a cache, particularly if there are
only a few answers. It must then be assumed that evaluating QH2
has no side effects so that the meaning of the query does not
change if QH2 is evaluated only once. One way of caching the
answers is to evaluate QH2 first of all and substitute the list
of answers for QH2 in the query. Another way is to use the
<CODE>cache</CODE> option. It is stated like this:
<PRE>
QH2' = qlc:q([X || X <- QH2], {cache, ets})
</PRE>
<P>or just
<PRE>
QH2' = qlc:q([X || X <- QH2], cache)
</PRE>
<P>The effect of the <CODE>cache</CODE> option is that when the
generator QH2' is run the first time every answer is stored in
an ETS table. When the next answer of QH1 is tried, answers to QH2'
are copied from the ETS table, which is very fast. As for the
<CODE>unique</CODE> option the cost is a possibly substantial amount
of RAM. The <CODE>{cache,list}</CODE> option offers the
possibility to store the answers in a list on the process heap.
While this has the potential of being faster than ETS tables,
since there is no need to copy answers from the table, it can
often result in slower evaluation due to more garbage
collections of the process heap as well as increased RAM
consumption due to bigger heaps. Another drawback with cache
lists is that if the size of the list exceeds a limit a
temporary file will be used. Reading the answers from a file is
much slower than copying them from an ETS table. But if the
available RAM is scarce, setting the <A HREF="#max_list_size">limit</A> to some low value is an
alternative.
<P>There is an option <CODE>cache_all</CODE> that can be set to
<CODE>ets</CODE> or <CODE>list</CODE> when evaluating a query. It adds a
<CODE>cache</CODE> or <CODE>{cache,list}</CODE> option to every list
expression except QLC tables and lists on all levels of the
query. This can be used for testing if caching would improve
efficiency at all. If the answer is yes further testing is
needed to pinpoint the generators that should be cached.
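<P>The caching of an inner query in a nested loop join can be
sketched as follows. The module name <CODE>qlc_cache_demo</CODE> and
the data are assumptions made for the example, and the inner query
stands in for a more complicated QLC without side effects.
<PRE>
-module(qlc_cache_demo).
-include_lib("stdlib/include/qlc.hrl").
-export([run/0]).

run() ->
    QH1 = [{1,a,x}, {2,b,y}],
    %% Stand-in for a complicated inner query.
    QH2 = qlc:q([{N,K} || {N,K} <- [{10,a}, {20,b}, {30,c}], N < 30]),
    %% The answers to QH2 are stored in an ETS table on the first run
    %% and copied from there for every further answer to QH1.
    QH2c = qlc:q([X || X <- QH2], {cache, ets}),
    QH = qlc:q([{X1,X2,X3,Y1} ||
                   {X1,X2,X3} <- QH1,
                   {Y1,Y2} <- QH2c,
                   X2 =:= Y2],
               {join, nested_loop}),
    qlc:eval(QH).
</PRE>
<P><CODE>qlc_cache_demo:run()</CODE> returns
<CODE>[{1,a,x,10},{2,b,y,20}]</CODE>. To test whether caching helps
at all, the uncached variant of a query can be evaluated with
<CODE>qlc:eval(QH, {cache_all, ets})</CODE> and the timings compared.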
</DIV>
<H3>Implementing a QLC table</H3>
<DIV CLASS=REFBODY>
<P><A NAME="implementing_a_qlc_table"><!-- Empty --></A>As an example of how to
use the <A HREF="#q">qlc:table/2</A> function the
implementation of a QLC table for the <A HREF="gb_trees.html">gb_trees</A> module is given:
<PRE>
-module(gb_table).

-import(gb_trees, [iterator/1, lookup/2, next/1]).

-export([table/1]).

table(T) ->
    %% Traversal function: returns the first object and a continuation.
    TF = fun() -> qlc_next(next(iterator(T))) end,
    %% gb_trees:size/1 is called explicitly; an unqualified size(T)
    %% would be the BIF erlang:size/1.
    InfoFun = fun(num_of_objects) -> gb_trees:size(T);
                 (keypos) -> 1;
                 (_) -> undefined
              end,
    %% Look up the keys Ks at key position 1.
    LookupFun =
        fun(1, Ks) ->
                lists:flatmap(fun(K) ->
                                      case gb_trees:lookup(K, T) of
                                          {value, V} -> [{K,V}];
                                          none -> []
                                      end
                              end, Ks)
        end,
    %% Describe the table for qlc:info, showing only a few objects.
    FormatFun =
        fun(all) ->
                Vals = a_few(T),
                {gb_trees, from_orddict, [Vals]};
           ({lookup, 1, KeyValues}) ->
                ValsS = io_lib:format("gb_trees:from_orddict(~w)",
                                      [a_few(T)]),
                io_lib:format("lists:flatmap(fun(K) -> "
                              "case gb_trees:lookup(K, ~s) of "
                              "{value, V} -> [{K,V}];none -> [] end "
                              "end, ~w)",
                              [ValsS, KeyValues])
        end,
    qlc:table(TF, [{info_fun, InfoFun}, {format_fun, FormatFun},
                   {lookup_fun, LookupFun}]).

%% Return one {Key,Value} object and a fun producing the rest.
qlc_next({X, V, S}) ->
    [{X,V} | fun() -> qlc_next(next(S)) end];
qlc_next(none) ->
    [].

%% Return at most 7 objects; the tail 'more' marks a truncated list.
a_few(T) ->
    a_few(iterator(T), 7).

a_few(_I, 0) ->
    more;
a_few(I0, N) ->
    case next(I0) of
        {X, V, I} ->
            [{X,V} | a_few(I, N-1)];
        none ->
            []
    end.
</PRE>
<P><CODE>TF</CODE> is the traversal function. The <CODE>qlc</CODE> module
requires that there is a way of traversing all objects of the
data structure; in <CODE>gb_trees</CODE> there is an iterator function
suitable for that purpose. Note that for each object returned a
new fun is created. As long as the list is not terminated by
<CODE>[]</CODE> it is assumed that the tail of the list is a nullary
function and that calling the function returns further objects
(and functions).
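<P>The same convention can be shown with a much smaller table. The
sketch below uses a hypothetical module <CODE>int_table</CODE>, which
is not part of OTP, to present the integers from <CODE>From</CODE> to
<CODE>To</CODE> as a QLC table using nothing but a traversal function:
<PRE>
-module(int_table).
-export([table/2]).

%% Return a query handle for the integers From..To.
table(From, To) ->
    TF = fun() -> next(From, To) end,
    qlc:table(TF, []).

%% One object per call; the tail is a nullary fun producing the rest.
next(N, To) when N > To ->
    [];
next(N, To) ->
    [N | fun() -> next(N + 1, To) end].
</PRE>
<P>With this module <CODE>qlc:eval(int_table:table(1, 3))</CODE>
returns <CODE>[1,2,3]</CODE>.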
<P>The lookup function is optional. It is assumed that the lookup
function always finds values much faster than it would take to
traverse the table. The first argument is the position of the
key. Since <CODE>qlc_next</CODE> returns the objects as
{Key,Value} pairs the position is 1. Note that the lookup
function should return {Key,Value} pairs, just as the
traversal function does.
<P>The format function is also optional. It is called by
<CODE>qlc:info</CODE> to give feedback at runtime on how the query
will be evaluated. One should try to give as good feedback as
possible without showing too much detail. In the example at
most 7 objects of the table are shown. The format function
handles two cases: <CODE>all</CODE> means that all objects of the
table will be traversed; <CODE>{lookup,1,KeyValues}</CODE>
means that the lookup function will be used for looking up key
values.
<P>Whether the whole table will be traversed or just some keys
looked up depends on how the query is stated. If the query has
the form
<PRE>
qlc:q([T || P <- LE, F])
</PRE>
<P>and P is a tuple, the <CODE>qlc</CODE> module analyzes P and F at
compile time to find positions of the tuple P that are matched
or compared to constants. If such a position at runtime turns
out to be the key position, the lookup function can be used,
otherwise all objects of the table have to be traversed. It is
the info function <CODE>InfoFun</CODE> that returns the key position.
There can be indexed positions as well, also returned by the info
function. An index is an extra table that makes lookup on some
position fast. Mnesia maintains indices upon request, thereby
introducing so-called secondary keys. The key is always
preferred before secondary keys regardless of the number of
constants to look up.
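<P>As a sketch of the difference, assume an ETS table with the default
key position 1; the table name and contents are made up. The first
query below compares the key to a constant, so the lookup function can
be used. The second query constrains only position 2 and has to
traverse the table, possibly with a match specification.
<CODE>qlc:info/1</CODE> shows how each query will be evaluated.
<PRE>
Tab = ets:new(demo, [set]),
true = ets:insert(Tab, [{1,a}, {2,b}, {3,c}]),
%% Key position compared to a constant: lookup is possible.
QH1 = qlc:q([V || {K, V} <- ets:table(Tab), K =:= 2]),
%% Non-key position compared to a constant: the table is traversed.
QH2 = qlc:q([K || {K, V} <- ets:table(Tab), V =:= c]),
io:format("~s~n~s~n", [qlc:info(QH1), qlc:info(QH2)]),
{qlc:eval(QH1), qlc:eval(QH2)}.
</PRE>
<P>The last expression evaluates to <CODE>{[b],[3]}</CODE>.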
</DIV>
<H3>EXPORTS</H3>
<P><A NAME="append/1"><STRONG><CODE>append(QHL) -> QH</CODE></STRONG></A><BR>
<DIV CLASS=REFBODY><P>Types:
<DIV CLASS=REFTYPES>
<P>
<STRONG><CODE>QHL = [QueryHandleOrList]</CODE></STRONG><BR>
<STRONG><CODE>QH = QueryHandle</CODE></STRONG><BR>
</DIV>
</DIV>
<DIV CLASS=REFBODY>
<P>Returns a query handle. When evaluating the query handle
<CODE>QH</CODE> all answers to the first query handle in
<CODE>QHL</CODE> are returned followed by all answers to the rest
of the query handles in <CODE>QHL</CODE>.
</DIV>
<P><A NAME="append/2"><STRONG><CODE>append(QH1, QH2) -> QH3</CODE></STRONG></A><BR>
<DIV CLASS=REFBODY><P>Types:
<DIV CLASS=REFTYPES>
<P>
<STRONG><CODE>QH1 = QH2 = QueryHandleOrList</CODE></STRONG><BR>
<STRONG><CODE>QH3 = QueryHandle</CODE></STRONG><BR>
</DIV>
</DIV>
<DIV CLASS=REFBODY>
<P>Returns a query handle. When evaluating the query handle
<CODE>QH3</CODE> all answers to <CODE>QH1</CODE> are returned followed
by all answers to <CODE>QH2</CODE>.
<P><CODE>append(QH1,QH2)</CODE> is equivalent to
<CODE>append([QH1,QH2])</CODE>.
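<P>The equivalence can be checked in the Erlang shell with plain
lists in place of query handles; the lists are arbitrary:
<PRE>
1> <STRONG>qlc:eval(qlc:append([1,2], [3])) =:= qlc:eval(qlc:append([[1,2], [3]])).</STRONG>
true
2> <STRONG>qlc:eval(qlc:append([[1,2], [3], [4,5]])).</STRONG>
[1,2,3,4,5]
</PRE>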
</DIV>
<P><A NAME="cursor/2"><STRONG><CODE>cursor(QueryHandleOrList [, Options]) -> QueryCursor</CODE></STRONG></A><BR>
<DIV CLASS=REFBODY><P>Types:
<DIV CLASS=REFTYPES>
<P>
<STRONG><CODE>Options = [Option] | Option</CODE></STRONG><BR>
<STRONG><CODE>Option = {cache_all, Cache} | cache_all
| {max_list_size, MaxListSize}
| {spawn_options, SpawnOptions}
| {tmpdir, TempDirectory}
| {unique_all, bool()} | unique_all</CODE></STRONG><BR>
</DIV>
</DIV>
<DIV CLASS=REFBODY>
<P><A NAME="cursor"><!-- Empty --></A>Creates a query cursor and makes the
calling process the owner of the cursor. The cursor is to be
used as argument to <CODE>next_answers/1,2</CODE> and (eventually)
<CODE>delete_cursor/1</CODE>. Calls <CODE>erlang:spawn_opt</CODE> to
spawn and link a process which will evaluate the query
handle. The value of the option <CODE>spawn_options</CODE> is used
as the last argument when calling <CODE>spawn_opt</CODE>. The default
value is <CODE>[link]</CODE>.
<PRE>
1> <STRONG>QH = qlc:q([{X,Y} || X <- [a,b], Y <- [1,2]]),</STRONG>
<STRONG>QC = qlc:cursor(QH),</STRONG>
<STRONG>qlc:next_answers(QC, 1).</STRONG>
[{a,1}]
2> <STRONG>qlc:next_answers(QC, 1).</STRONG>
[{a,2}]
3> <STRONG>qlc:next_answers(QC, all_remaining).</STRONG>
[{b,1},{b,2}]
4> <STRONG>qlc:delete_cursor(QC).</STRONG>
ok
</PRE>
</DIV>
<P><A NAME="delete_cursor/1"><STRONG><CODE>delete_cursor(QueryCursor) -> ok</CODE></STRONG></A><BR>
<DIV CLASS=REFBODY>
<P>Deletes a query cursor. Only the owner of the cursor can
delete the cursor.
</DIV>
<P><A NAME="eval/2"><STRONG><CODE>eval(QueryHandleOrList [, Options]) -> Answers | Error</CODE></STRONG></A><BR>
<A NAME="e/2"><STRONG><CODE>e(QueryHandleOrList [, Options]) -> Answers</CODE></STRONG></A><BR>
<DIV CLASS=REFBODY><P>Types:
<DIV CLASS=REFTYPES>
<P>
<STRONG><CODE>Options = [Option] | Option</CODE></STRONG><BR>
<STRONG><CODE>Option = {cache_all, Cache} | cache_all
| {max_list_size, MaxListSize}
| {tmpdir, TempDirectory}
| {unique_all, bool()} | unique_all</CODE></STRONG><BR>
<STRONG><CODE>Error = {error, module(), Reason}</CODE></STRONG><BR>
<STRONG><CODE>Reason = -as returned by file_sorter(3)-</CODE></STRONG><BR>
</DIV>
</DIV>
<DIV CLASS=REFBODY>
<P><A NAME="eval"><!-- Empty --></A>Evaluates a query handle in the calling
process and collects all answers in a list.
<PRE>
1> <STRONG>QH = qlc:q([{X,Y} || X <- [a,b], Y <- [1,2]]),</STRONG>
<STRONG>qlc:eval(QH).</STRONG>
[{a,1},{a,2},{b,1},{b,2}]
</PRE>
</DIV>
<P><A NAME="fold/4"><STRONG><CODE>fold(Function, Acc0, QueryHandleOrList [, Options]) ->
Acc1 | Error</CODE></STRONG></A><BR>
<DIV CLASS=REFBODY><P>Types:
<DIV CLASS=REFTYPES>
<P>
<STRONG><CODE>Function = fun(Answer, AccIn) -> AccOut</CODE></STRONG><BR>
<STRONG><CODE>Acc0 = Acc1 = AccIn = AccOut = term()</CODE></STRONG><BR>
<STRONG><CODE>Options = [Option] | Option</CODE></STRONG><BR>
<STRONG><CODE>Option = {cache_all, Cache} | cache_all
| {max_list_size, MaxListSize}
| {tmpdir, TempDirectory}
| {unique_all, bool()} | unique_all</CODE></STRONG><BR>
<STRONG><CODE>Error = {error, module(), Reason}</CODE></STRONG><BR>
<STRONG><CODE>Reason = -as returned by file_sorter(3)-</CODE></STRONG><BR>
</DIV>
</DIV>
<DIV CLASS=REFBODY>
<P>Calls <CODE>Function</CODE> on successive answers to the query
handle together with an extra argument <CODE>AccIn</CODE>. The
query handle and the function are evaluated in the calling
process. <CODE>Function</CODE> must return a new accumulator which
is passed to the next call. <CODE>Acc0</CODE> is returned if there
are no answers to the query handle.
<PRE>
1> <STRONG>QH = [1,2,3,4,5,6],</STRONG>
<STRONG>qlc:fold(fun(X, Sum) -> X + Sum end, 0, QH).</STRONG>
21
</PRE>
</DIV>
<P><A NAME="format_error/1"><STRONG><CODE>format_error(Error) -> Chars</CODE></STRONG></A><BR>
<DIV CLASS=REFBODY><P>Types:
<DIV CLASS=REFTYPES>
<P>
<STRONG><CODE>Error = {error, module(), term()}</CODE></STRONG><BR>
<STRONG><CODE>Chars = [char() | Chars]</CODE></STRONG><BR>
</DIV>
</DIV>
<DIV CLASS=REFBODY>
<P>Returns a descriptive string in English of an error tuple
returned by some of the functions of the <CODE>qlc</CODE> module
or the parse transform. This function is mainly used by the
compiler invoking the parse transform.
</DIV>
<P><A NAME="info/2"><STRONG><CODE>info(QueryHandleOrList [, Options]) -> Info</CODE></STRONG></A><BR>
<DIV CLASS=REFBODY><P>Types:
<DIV CLASS=REFTYPES>
<P>
<STRONG><CODE>Options = [Option] | Option</CODE></STRONG><BR>
<STRONG><CODE>Option = EvalOption | ReturnOption</CODE></STRONG><BR>
<STRONG><CODE>EvalOption = {cache_all, Cache} | cache_all
| {max_list_size, MaxListSize}
| {tmpdir, TempDirectory}
| {unique_all, bool()} | unique_all</CODE></STRONG><BR>
<STRONG><CODE>ReturnOption = {flat, bool()}
| {format, Format}
| {n_elements, NElements}</CODE></STRONG><BR>
<STRONG><CODE>Format = abstract_code
| string</CODE></STRONG><BR>
<STRONG><CODE>NElements = infinity
| int() > 0</CODE></STRONG><BR>
<STRONG><CODE>Info = AbstractExpression
| string()</CODE></STRONG><BR>
</DIV>
</DIV>
<DIV CLASS=REFBODY>
<P>Returns information about a query handle. The information
describes the simplifications and optimizations that are the
results of preparing the query for evaluation. This function
is mainly useful for debugging.
<P>The information has the form of an Erlang expression where
QLCs most likely occur. Depending on the format functions of
the QLC tables mentioned, it may not be entirely accurate.
<P>The default is to return a sequence of QLCs in a block, but
if the option <CODE>{flat,false}</CODE> is given, one single
QLC is returned. The default is to return a string, but if
the option <CODE>{format,abstract_code}</CODE> is given,
abstract code is returned instead. The default is to return
all elements in lists, but if the
<CODE>{n_elements,NElements}</CODE> option is given, only a
limited number of elements are returned.
<PRE>
1> <STRONG>QH = qlc:q([{X,Y} || X <- [x,y], Y <- [a,b]]),</STRONG>
<STRONG>io:format("~s~n", [qlc:info(QH, unique_all)]).</STRONG>
begin
V1 = qlc:q([SQV || SQV <- [x,y]], [{unique,true}]),
V2 = qlc:q([SQV || SQV <- [a,b]], [{unique,true}]),
qlc:q([{X,Y} || X <- V1, Y <- V2], [{unique,true}])
end
</PRE>
<P>In this example two simple QLCs have been inserted just to
hold the <CODE>{unique,true}</CODE> option.
<PRE>
1> <STRONG>E1 = ets:new(e1, []),</STRONG>
<STRONG>E2 = ets:new(e2, []),</STRONG>
<STRONG>true = ets:insert(E1, [{1,a},{2,b}]),</STRONG>
<STRONG>true = ets:insert(E2, [{a,1},{b,2}]),</STRONG>
<STRONG>Q = qlc:q([{X,Z,W} ||</STRONG>
<STRONG>{X, Z} <- ets:table(E1),</STRONG>
<STRONG>{W, Y} <- ets:table(E2),</STRONG>
<STRONG>X =:= Y]),</STRONG>
<STRONG>io:format("~s~n", [qlc:info(Q)]).</STRONG>
begin
V1 = qlc:q([P0 || P0 = {W,Y} <- ets:table(18)]),
V2 =
qlc:q([[G1|G2] ||
G2 <- V1,
G1 <- ets:table(17),
element(2, G1) =:= element(1, G2)],
[{join,lookup}]),
qlc:q([{X,Z,W} || [{X,Z}|{W,Y}] <- V2, X =:= Y])
end
</PRE>
<P>In this example the query list comprehension <CODE>V2</CODE> has
been inserted to show the joined generators and the join
method chosen. A convention is used for lookup join: the
first generator (<CODE>G2</CODE>) is the one traversed, the second
one (<CODE>G1</CODE>) is the table where constants are looked up.
</DIV>
<P><A NAME="keysort/3"><STRONG><CODE>keysort(KeyPos, QH1 [, SortOptions]) -> QH2</CODE></STRONG></A><BR>
<DIV CLASS=REFBODY><P>Types:
<DIV CLASS=REFTYPES>
<P>
<STRONG><CODE>QH1 = QueryHandleOrList</CODE></STRONG><BR>
<STRONG><CODE>QH2 = QueryHandle</CODE></STRONG><BR>
</DIV>
</DIV>
<DIV CLASS=REFBODY>
<P>Returns a query handle. When evaluating the query handle
<CODE>QH2</CODE> the answers to the query handle <CODE>QH1</CODE> are
sorted by <A HREF="file_sorter.html">file_sorter:keysort/4</A>
according to the options.
<P>The sorter will use temporary files only if <CODE>QH1</CODE> does
not evaluate to a list and the size of the binary
representation of the answers exceeds <CODE>Size</CODE> bytes,
where <CODE>Size</CODE> is the value of the <CODE>size</CODE> option.
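<P>A minimal sketch, using a made-up list of tuples that is
sorted on the second element:
<PRE>
1> <STRONG>QH = qlc:keysort(2, [{a,3},{b,1},{c,2}]),</STRONG>
<STRONG>qlc:eval(QH).</STRONG>
[{b,1},{c,2},{a,3}]
</PRE>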
</DIV>
<P><A NAME="next_answers/2"><STRONG><CODE>next_answers(QueryCursor [, NumberOfAnswers]) ->
Answers | Error</CODE></STRONG></A><BR>
<DIV CLASS=REFBODY><P>Types:
<DIV CLASS=REFTYPES>
<P>
<STRONG><CODE>NumberOfAnswers = all_remaining | int() > 0</CODE></STRONG><BR>
<STRONG><CODE>Error = {error, module(), Reason}</CODE></STRONG><BR>
<STRONG><CODE>Reason = -as returned by file_sorter(3)-</CODE></STRONG><BR>
</DIV>
</DIV>
<DIV CLASS=REFBODY>
<P>Returns some or all of the remaining answers to a query cursor.
Only the owner of <CODE>QueryCursor</CODE> can retrieve answers.
<P>The optional argument <CODE>NumberOfAnswers</CODE> determines the
maximum number of answers returned. The default value is
<CODE>10</CODE>. If fewer than the requested number of answers are
returned, subsequent calls to <CODE>next_answers</CODE> will
return <CODE>[]</CODE>.
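<P>The following sketch (a three-element list is assumed purely
for illustration) shows a cursor running out of answers:
<PRE>
1> <STRONG>QC = qlc:cursor([1,2,3]),</STRONG>
<STRONG>qlc:next_answers(QC, 2).</STRONG>
[1,2]
2> <STRONG>qlc:next_answers(QC, 2).</STRONG>
[3]
3> <STRONG>qlc:next_answers(QC, 2).</STRONG>
[]
4> <STRONG>qlc:delete_cursor(QC).</STRONG>
ok
</PRE>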
</DIV>
<P><A NAME="q/2"><STRONG><CODE>q(QueryListComprehension [, Options]) -> QueryHandle</CODE></STRONG></A><BR>
<DIV CLASS=REFBODY><P>Types:
<DIV CLASS=REFTYPES>
<P>
<STRONG><CODE>QueryListComprehension =
-literal query list comprehension-</CODE></STRONG><BR>
<STRONG><CODE>Options = [Option] | Option</CODE></STRONG><BR>
<STRONG><CODE>Option = {max_lookup, MaxLookup}
| {cache, Cache} | cache
| {join, Join}
| {lookup, Lookup}
| {unique, bool()} | unique</CODE></STRONG><BR>
<STRONG><CODE>MaxLookup = int() >= 0 | infinity</CODE></STRONG><BR>
<STRONG><CODE>Join = any | lookup | merge | nested_loop</CODE></STRONG><BR>
<STRONG><CODE>Lookup = bool() | any</CODE></STRONG><BR>
</DIV>
</DIV>
<DIV CLASS=REFBODY>
<P><A NAME="q"><!-- Empty --></A>Returns a query handle for a query list
comprehension. The query list comprehension must be the
first argument to <CODE>qlc:q/1,2</CODE> or it will be evaluated
as an ordinary list comprehension. It is also necessary to
add the line
<PRE>
-include_lib("stdlib/include/qlc.hrl").
</PRE>
<P>to the source file. This causes a parse transform to
substitute a fun for the query list comprehension. The
(compiled) fun will be called when the query handle is
evaluated.
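<P>A minimal sketch of such a source file could look as follows.
The module name <CODE>qlc_demo</CODE> and the function
<CODE>evens/1</CODE> are hypothetical and used only for
illustration:
<PRE>
-module(qlc_demo).
-export([evens/1]).

%% Enables the parse transform that rewrites qlc:q/1,2 calls.
-include_lib("stdlib/include/qlc.hrl").

%% Returns the even elements of the list L.
evens(L) ->
    QH = qlc:q([X || X <- L, X rem 2 =:= 0]),
    qlc:eval(QH).
</PRE>
<P>With this module compiled, <CODE>qlc_demo:evens([1,2,3,4])</CODE>
would be expected to return <CODE>[2,4]</CODE>.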
<P>When calling <CODE>qlc:q/1,2</CODE> from the Erlang shell the
parse transform is automatically called. When this happens
the fun substituted for the query list comprehension is not
compiled but will be evaluated by <CODE>erl_eval(3)</CODE>. This
is also true when expressions are evaluated by means of
<CODE>file:eval/1,2</CODE> or in the debugger.
<P>To be very explicit, this will not work:
<PRE>
...
A = [X || {X} <- [{1},{2}]],
QH = qlc:q(A),
...
</PRE>
<P>The variable <CODE>A</CODE> will be bound to the evaluated value
of the list comprehension (<CODE>[1,2]</CODE>). The compiler
complains with an error message ("argument is not a query
list comprehension"); the shell process stops with a
<CODE>badarg</CODE> reason.
<P>The <CODE>{cache,ets}</CODE> option can be used to cache
the answers to a query list comprehension. The answers are
stored in one ETS table for each cached query list
comprehension. When a cached query list comprehension is
evaluated again, answers are fetched from the table without
any further computations. As a consequence, when all answers
to a cached query list comprehension have been found, the
ETS tables used for caching answers to the query list
comprehension's qualifiers can be emptied. The option
<CODE>cache</CODE> is equivalent to <CODE>{cache,ets}</CODE>.
<P>The <CODE>{cache,list}</CODE> option can be used to cache
the answers to a query list comprehension just like
<CODE>{cache,ets}</CODE>. The difference is that the answers
are kept in a list (on the process heap). If the answers
would occupy more than a certain amount of RAM, a
temporary file is used for storing the answers. The option
<CODE>max_list_size</CODE> sets the limit in bytes and the temporary
file is put in the directory set by the <CODE>tmpdir</CODE> option.
<P>The <CODE>cache</CODE> option has no effect if it is known that
the query list comprehension will be evaluated at most once.
This is always true for the top-most query list
comprehension and also for the list expression of the first
generator in a list of qualifiers. Note that in the presence
of side effects in filters or callback functions the answers
to query list comprehensions can be affected by the
<CODE>cache</CODE> option.
<P>The <CODE>{unique,true}</CODE> option can be used to remove
duplicate answers to a query list comprehension. The unique
answers are stored in one ETS table for each query list
comprehension. The table is emptied every time it is known
that there are no more answers to the query list
comprehension. The option <CODE>unique</CODE> is equivalent to
<CODE>{unique,true}</CODE>. If the <CODE>unique</CODE> option is
combined with the <CODE>{cache,ets}</CODE> option, two ETS
tables are used, but the full answers are stored in one
table only. If the <CODE>unique</CODE> option is combined with the
<CODE>{cache,list}</CODE> option the answers are sorted
twice using <CODE>keysort/3</CODE>; once to remove duplicates, and
once to restore the order.
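<P>As a small sketch of the <CODE>unique</CODE> option (the input
list is made up for illustration), duplicates are dropped
while the order of first occurrence is kept:
<PRE>
1> <STRONG>QH = qlc:q([X || X <- [1,2,1,3,2]], {unique, true}),</STRONG>
<STRONG>qlc:eval(QH).</STRONG>
[1,2,3]
</PRE>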
<P>The <CODE>cache</CODE> and <CODE>unique</CODE> options apply not only
to the query list comprehension itself but also to the
results of looking up constants, running match
specifications, and joining handles.
<PRE>
1> <STRONG>Q = qlc:q([{A,X,Z,W} ||</STRONG>
<STRONG>A <- [a,b,c],</STRONG>
<STRONG>{X,Z} <- [{a,1},{b,4},{c,6}],</STRONG>
<STRONG>{W,Y} <- [{2,a},{3,b},{4,c}],</STRONG>
<STRONG>X =:= Y],</STRONG>
<STRONG>{cache, list}),</STRONG>
<STRONG>io:format("~s~n", [qlc:info(Q)]).</STRONG>
begin
V1 =
qlc:q([P0 ||
P0 = {X,Z} <- qlc:keysort(1, [{a,1},{b,4},{c,6}], [])]),
V2 =
qlc:q([P0 ||
P0 = {W,Y} <- qlc:keysort(2, [{2,a},{3,b},{4,c}], [])]),
V3 =
qlc:q([[G1|G2] ||
G1 <- V1, G2 <- V2, element(1, G1) == element(2, G2)],
[{join,merge},{cache,list}]),
qlc:q([{A,X,Z,W} || A <- [a,b,c], [{X,Z}|{W,Y}] <- V3, X =:= Y])
end
</PRE>
<P>In this example the cached results of the merge join are
traversed for each value of <CODE>A</CODE>. Note that without the
<CODE>cache</CODE> option the join would have been carried out
three times, once for each value of <CODE>A</CODE>.
<P><CODE>sort/1,2</CODE> and <CODE>keysort/2,3</CODE> can also be used for
caching answers and for removing duplicates. When sorting,
answers are cached in a list, possibly stored in a temporary
file, and no ETS tables are used.
<P>Sometimes (see <A HREF="#lookup_fun">qlc:table/2</A> below) traversal
of tables can be done by looking up key values, which is
assumed to be fast. Under certain (rare) circumstances it
could happen that there are too many key values to look up.
<A NAME="max_lookup"><!-- Empty --></A>The
<CODE>{max_lookup,MaxLookup}</CODE> option can then be used
to limit the number of lookups: if more than
<CODE>MaxLookup</CODE> lookups would be required no lookups are
done but the table traversed instead. The default value is
<CODE>infinity</CODE> which means that there is no limit on the
number of keys to look up.
<PRE>
1> <STRONG>T = gb_trees:empty(),</STRONG>
<STRONG>QH = qlc:q([X || {{X,Y},_} <- gb_table:table(T),</STRONG>
<STRONG>((X =:= 1) or (X =:= 2)),</STRONG>
<STRONG>((Y =:= a) or (Y =:= b) or (Y =:= c))]),</STRONG>
<STRONG>io:format("~s~n", [qlc:info(QH)]).</STRONG>
ets:match_spec_run(
lists:flatmap(fun(K) ->
case
gb_trees:lookup(K,
gb_trees:from_orddict([]))
of
{value,V} ->
[{K,V}];
none ->
[]
end
end,
[{1,a},{1,b},{1,c},{2,a},{2,b},{2,c}]),
ets:match_spec_compile([{{{'$1','$2'},'_'},
[{'andalso',
{'or',
{'=:=','$1',1},
{'=:=','$1',2}},
{'or',
{'or',
{'=:=','$2',a},
{'=:=','$2',b}},
{'=:=','$2',c}}}],
['$1']}]))
</PRE>
<P>In this example using the <CODE>gb_table</CODE> module from the
<A HREF="#implementing_a_qlc_table">Implementing a
QLC table</A> section there are six keys to look up:
<CODE>{1,a}</CODE>, <CODE>{1,b}</CODE>, <CODE>{1,c}</CODE>, <CODE>{2,a}</CODE>,
<CODE>{2,b}</CODE>, and <CODE>{2,c}</CODE>. The reason is that the two
elements of the key <CODE>{X,Y}</CODE> are matched separately.
<P>The <CODE>{lookup,true}</CODE> option can be used to ensure
that QLC will look up constants in some QLC table. If there
is more than one QLC table among the generators' list
expressions, constants have to be looked up in at least one
of the tables. The evaluation of the query fails if there
are no constants to look up. This option is useful in
situations when it would be unacceptable to traverse all
objects in some table. Setting the <CODE>lookup</CODE> option to
<CODE>false</CODE> ensures that no constants will be looked up
(<CODE>{max_lookup,0}</CODE> has the same effect). The
default value is <CODE>any</CODE> which means that constants will
be looked up whenever possible.
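<P>A minimal sketch of <CODE>{lookup,true}</CODE>, assuming a small
ETS table created just for this example. The filter compares
the key position with a constant, so the answer can be found
by a key lookup:
<PRE>
1> <STRONG>E = ets:new(t, []),</STRONG>
<STRONG>true = ets:insert(E, [{1,a},{2,b},{3,c}]),</STRONG>
<STRONG>QH = qlc:q([V || {K,V} <- ets:table(E), K =:= 2],</STRONG>
<STRONG>{lookup, true}),</STRONG>
<STRONG>qlc:eval(QH).</STRONG>
[b]
</PRE>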
<P>The <CODE>{join,Join}</CODE> option can be used to ensure
that a certain join method will be used:
<CODE>{join,lookup}</CODE> invokes the lookup join method;
<CODE>{join,merge}</CODE> invokes the merge join method; and
<CODE>{join,nested_loop}</CODE> invokes the method of
matching every pair of objects from two handles. The last
method is usually very slow. The evaluation of the query
fails if QLC cannot carry out the chosen join method. The
default value is <CODE>any</CODE> which means that some fast join
method will be used if possible.
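<P>As a sketch of forcing a join method (the two lists are
made up for illustration), the following query requests a
merge join on the second element of the first generator and
the first element of the second:
<PRE>
1> <STRONG>QH = qlc:q([{X,Y} || {X,K1} <- [{a,1},{b,2}],</STRONG>
<STRONG>{K2,Y} <- [{1,x},{2,y}],</STRONG>
<STRONG>K1 =:= K2],</STRONG>
<STRONG>{join, merge}),</STRONG>
<STRONG>qlc:eval(QH).</STRONG>
[{a,x},{b,y}]
</PRE>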
</DIV>
<P><A NAME="sort/2"><STRONG><CODE>sort(QH1 [, SortOptions]) -> QH2</CODE></STRONG></A><BR>
<DIV CLASS=REFBODY><P>Types:
<DIV CLASS=REFTYPES>
<P>
<STRONG><CODE>QH1 = QueryHandleOrList</CODE></STRONG><BR>
<STRONG><CODE>QH2 = QueryHandle</CODE></STRONG><BR>
</DIV>
</DIV>
<DIV CLASS=REFBODY>
<P>Returns a query handle. When evaluating the query handle
<CODE>QH2</CODE> the answers to the query handle <CODE>QH1</CODE> are
sorted by <A HREF="file_sorter.html">file_sorter:sort/3</A> according to the options.
<P>The sorter will use temporary files only if <CODE>QH1</CODE> does
not evaluate to a list and the size of the binary
representation of the answers exceeds <CODE>Size</CODE> bytes,
where <CODE>Size</CODE> is the value of the <CODE>size</CODE> option.
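<P>A minimal sketch, with a made-up list, that sorts the answers
and removes duplicates by means of the <CODE>unique</CODE> sort
option:
<PRE>
1> <STRONG>QH = qlc:sort([3,1,2,1], {unique, true}),</STRONG>
<STRONG>qlc:eval(QH).</STRONG>
[1,2,3]
</PRE>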
</DIV>
<P><A NAME="string_to_handle/3"><STRONG><CODE>string_to_handle(QueryString [, Options [, Bindings]]) ->
QueryHandle | Error</CODE></STRONG></A><BR>
<DIV CLASS=REFBODY><P>Types:
<DIV CLASS=REFTYPES>
<P>
<STRONG><CODE>QueryString = string()</CODE></STRONG><BR>
<STRONG><CODE>Options = [Option] | Option</CODE></STRONG><BR>
<STRONG><CODE>Option = {max_lookup, MaxLookup}
| {cache, Cache} | cache
| {join, Join}
| {lookup, Lookup}
| {unique, bool()} | unique</CODE></STRONG><BR>
<STRONG><CODE>MaxLookup = int() >= 0 | infinity</CODE></STRONG><BR>
<STRONG><CODE>Join = any | lookup | merge | nested_loop</CODE></STRONG><BR>
<STRONG><CODE>Lookup = bool() | any</CODE></STRONG><BR>
<STRONG><CODE>Bindings = -as returned by erl_eval:bindings/1-</CODE></STRONG><BR>
<STRONG><CODE>Error = {error, module(), Reason}</CODE></STRONG><BR>
<STRONG><CODE>Reason = -ErrorInfo as returned by
erl_scan:string/1 or erl_parse:parse_exprs/1-</CODE></STRONG><BR>
</DIV>
</DIV>
<DIV CLASS=REFBODY>
<P>A string version of <CODE>qlc:q/1,2</CODE>. When the query handle
is evaluated the fun created by the parse transform is
interpreted by <CODE>erl_eval(3)</CODE>. The query string is to be
one single query list comprehension terminated by a period.
<PRE>
1> <STRONG>L = [1,2,3],</STRONG>
<STRONG>Bs = erl_eval:add_binding('L', L, erl_eval:new_bindings()),</STRONG>
<STRONG>QH = qlc:string_to_handle("[X+1 || X <- L].", [], Bs),</STRONG>
<STRONG>qlc:eval(QH).</STRONG>
[2,3,4]
</PRE>
<P>This function is probably useful mostly when called from
outside of Erlang, for instance from a driver written in C.
</DIV>
<P><A NAME="table/2"><STRONG><CODE>table(TraverseFun, Options) -> QueryHandle</CODE></STRONG></A><BR>
<DIV CLASS=REFBODY><P>Types:
<DIV CLASS=REFTYPES>
<P>
<STRONG><CODE>TraverseFun = TraverseFun0 | TraverseFun1</CODE></STRONG><BR>
<STRONG><CODE>TraverseFun0 = fun() -> TraverseResult</CODE></STRONG><BR>
<STRONG><CODE>TraverseFun1 = fun(MatchExpression) -> TraverseResult</CODE></STRONG><BR>
<STRONG><CODE>TraverseResult = Objects | term()</CODE></STRONG><BR>
<STRONG><CODE>Objects = [] | [term() | ObjectList]</CODE></STRONG><BR>
<STRONG><CODE>ObjectList = TraverseFun0 | Objects</CODE></STRONG><BR>
<STRONG><CODE>Options = [Option] | Option</CODE></STRONG><BR>
<STRONG><CODE>Option = {format_fun, FormatFun}
| {info_fun, InfoFun}
| {lookup_fun, LookupFun}
| {parent_fun, ParentFun}
| {post_fun, PostFun}
| {pre_fun, PreFun}</CODE></STRONG><BR>
<STRONG><CODE>FormatFun = undefined
| fun(SelectedObjects) -> FormatedTable</CODE></STRONG><BR>
<STRONG><CODE>SelectedObjects = all
| {match_spec, MatchExpression}
| {lookup, Position, Keys}</CODE></STRONG><BR>
<STRONG><CODE>FormatedTable = {Mod, Fun, Args}
| AbstractExpression
| character_list()</CODE></STRONG><BR>
<STRONG><CODE>InfoFun = undefined
| fun(InfoTag) -> InfoValue</CODE></STRONG><BR>
<STRONG><CODE>InfoTag = indices
| is_sorted_key
| is_unique_objects
| keypos
| num_of_objects</CODE></STRONG><BR>
<STRONG><CODE>InfoValue = undefined
| term()</CODE></STRONG><BR>
<STRONG><CODE>LookupFun = undefined
| fun(Position, Keys) -> LookupResult</CODE></STRONG><BR>
<STRONG><CODE>LookupResult = [term()] | term()</CODE></STRONG><BR>
<STRONG><CODE>ParentFun = undefined
| fun() -> ParentFunValue</CODE></STRONG><BR>
<STRONG><CODE>PostFun = undefined
| fun() -> void()</CODE></STRONG><BR>
<STRONG><CODE>PreFun = undefined
| fun([PreArg]) -> void()</CODE></STRONG><BR>
<STRONG><CODE>PreArg = {parent_value, ParentFunValue}
| {stop_fun, StopFun}</CODE></STRONG><BR>
<STRONG><CODE>ParentFunValue = undefined
| term()</CODE></STRONG><BR>
<STRONG><CODE>StopFun = undefined
| fun() -> void()</CODE></STRONG><BR>
<STRONG><CODE>Position = int() > 0</CODE></STRONG><BR>
<STRONG><CODE>Keys = [term()]</CODE></STRONG><BR>
<STRONG><CODE>Mod = Fun = atom()</CODE></STRONG><BR>
<STRONG><CODE>Args = [term()]</CODE></STRONG><BR>
</DIV>
</DIV>
<DIV CLASS=REFBODY>
<P><A NAME="table"><!-- Empty --></A>Returns a query handle for a QLC table.
In Erlang/OTP there is support for ETS, Dets and Mnesia
tables, but it is also possible to turn many other data
structures into QLC tables. The way to accomplish this is
to let function(s) in the module implementing the data
structure create a query handle by calling
<CODE>qlc:table/2</CODE>. The different ways to traverse the
table as well as properties of the table are handled by
callback functions provided as options to
<CODE>qlc:table/2</CODE>.
<P>The callback function <CODE>TraverseFun</CODE> is used for
traversing the table. It is to return a list of objects
terminated by either <CODE>[]</CODE> or a nullary fun to be used
for traversing the not yet traversed objects of the table.
Any other return value is immediately returned as value
of the query evaluation.
Unary <CODE>TraverseFun</CODE>s are to accept a match
specification as argument. The match specification is
created by the parse transform by analyzing the pattern of
the generator calling <CODE>qlc:table/2</CODE> and filters using
variables introduced in the pattern. If the parse transform
cannot find a match specification equivalent to the pattern
and filters, <CODE>TraverseFun</CODE> will be called with a match
specification returning every object. Modules that can
utilize match specifications for optimized traversal of
tables should call <CODE>qlc:table/2</CODE> with a unary
<CODE>TraverseFun</CODE> while other modules can provide a
nullary <CODE>TraverseFun</CODE>. <CODE>ets:table/2</CODE> is an
example of the former; <CODE>gb_table:table/1</CODE> in the
<A HREF="#implementing_a_qlc_table">Implementing a
QLC table</A> section is an example of the latter.
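<P>The contract for a nullary <CODE>TraverseFun</CODE> can be
sketched as follows. The module <CODE>list_table</CODE> is
hypothetical and only illustrates the return value: one object
at a time, followed by a nullary fun that continues the
traversal, and finally <CODE>[]</CODE>:
<PRE>
-module(list_table).
-export([table/1]).

%% Turns an ordinary list into a QLC table (illustration only).
table(L) when is_list(L) ->
    TraverseFun = fun() -> traverse(L) end,
    qlc:table(TraverseFun, []).

%% Returns one object followed by a fun for the remaining objects.
traverse([]) ->
    [];
traverse([H | T]) ->
    [H | fun() -> traverse(T) end].
</PRE>
<P>Evaluating
<CODE>qlc:eval(qlc:q([X || X <- list_table:table([1,2,3]), X > 1]))</CODE>
would then be expected to return <CODE>[2,3]</CODE>.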
<P><CODE>PreFun</CODE> is a unary callback function that is called
once before the table is read for the first time. If the
call fails, the query evaluation fails. Similarly, the
nullary callback function <CODE>PostFun</CODE> is called once
after the table was last read. The return value, which is
caught, is ignored. If <CODE>PreFun</CODE> has been called for a
table, <CODE>PostFun</CODE> is guaranteed to be called for that
table, even if the evaluation of the query fails for some
reason. The order in which pre (post) functions for different
tables are evaluated is not specified. Other table access
than reading, such as calling <CODE>InfoFun</CODE>, is assumed to
be OK at any time. The argument <CODE>PreArgs</CODE> is a list of
tagged values. Currently there are two tags,
<CODE>parent_value</CODE> and <CODE>stop_fun</CODE>, used by Mnesia for
managing transactions. The value of <CODE>parent_value</CODE> is
the value returned by <CODE>ParentFun</CODE>, or <CODE>undefined</CODE>
if there is no <CODE>ParentFun</CODE>. <CODE>ParentFun</CODE> is called
once just before the call of <CODE>PreFun</CODE> in the context
of the process calling <CODE>eval</CODE>, <CODE>fold</CODE>, or
<CODE>cursor</CODE>. The value of <CODE>stop_fun</CODE> is a nullary
fun that deletes the cursor if called from the parent, or
<CODE>undefined</CODE> if there is no cursor.
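<P>As a sketch of the pre and post callbacks, the hypothetical
module <CODE>traced_table</CODE> below prints a message before
the table is first read and after it was last read. The
messages and the module name are assumptions made for this
example only:
<PRE>
-module(traced_table).
-export([table/1]).

%% A QLC table over a list that reports when it is opened and closed.
table(L) when is_list(L) ->
    %% Called once with a list of tagged values before the first read.
    PreFun = fun(_PreArgs) -> io:format("opening table~n") end,
    %% Called once after the last read; the return value is ignored.
    PostFun = fun() -> io:format("closing table~n") end,
    qlc:table(fun() -> L end,
              [{pre_fun, PreFun}, {post_fun, PostFun}]).
</PRE>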
<P><A NAME="lookup_fun"><!-- Empty --></A>The binary callback function
<CODE>LookupFun</CODE> is used for looking up objects in the
table. The first argument <CODE>Position</CODE> is the key
position or an indexed position and the second argument
<CODE>Keys</CODE> is a sorted list of unique values. The return
value is to be a list of all objects (tuples) such that the
element at <CODE>Position</CODE> is a member of <CODE>Keys</CODE>.
Any other return value is immediately returned as value
of the query evaluation.
<CODE>LookupFun</CODE> is called instead of traversing the table
if the parse transform at compile time can find out that
the filters match and compare the element at
<CODE>Position</CODE> in such a way that only <CODE>Keys</CODE> need to
be looked up in order to find all potential answers. The
key position is obtained by calling <CODE>InfoFun(keypos)</CODE>
and the indexed positions by calling <CODE>InfoFun(indices)</CODE>.
If the key position can be used for lookup it is always
chosen, otherwise the indexed position requiring the least
number of lookups is chosen. If there is a tie between two
indexed positions the one occurring first in the list
returned by <CODE>InfoFun</CODE> is chosen. Positions requiring
more than <A HREF="#max_lookup">max_lookup</A> lookups are
ignored.
<P>The unary callback function <CODE>InfoFun</CODE> is to return
information about the table. <CODE>undefined</CODE> should be
returned if the value of some tag is unknown:
<P>
<UL>
<LI>
<CODE>indices</CODE>. Returns a list of indexed
positions, a list of positive integers.
</LI>
<LI>
<CODE>is_unique_objects</CODE>. Returns <CODE>true</CODE> if
the objects returned by <CODE>TraverseFun</CODE> are unique.
</LI>
<LI>
<CODE>keypos</CODE>. Returns the position of the table's
key, a positive integer.
</LI>
<LI>
<CODE>is_sorted_key</CODE>. Returns <CODE>true</CODE> if
the objects returned by <CODE>TraverseFun</CODE> are sorted
on the key.
</LI>
<LI>
<CODE>num_of_objects</CODE>. Returns the number of
objects in the table, a non-negative integer.
</LI>
</UL>
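<P>How <CODE>LookupFun</CODE> and <CODE>InfoFun</CODE> fit together
can be sketched with a hypothetical module <CODE>kv_table</CODE>
that wraps a list of <CODE>{Key, Value}</CODE> tuples. The
sketch assumes that the keys in the list are unique and that no
indexed positions exist:
<PRE>
-module(kv_table).
-export([table/1]).

%% A QLC table over a list of {Key, Value} tuples (illustration only).
table(KVs) when is_list(KVs) ->
    TraverseFun = fun() -> KVs end,
    %% Only the key position (1) is supported for lookup.
    LookupFun = fun(1, Keys) ->
                        [KV || K <- Keys, {K1, _} = KV <- KVs, K1 =:= K]
                end,
    %% Unknown tags, such as indices, are answered with undefined.
    InfoFun = fun(keypos) -> 1;
                 (is_unique_objects) -> true;
                 (num_of_objects) -> length(KVs);
                 (_) -> undefined
              end,
    qlc:table(TraverseFun,
              [{lookup_fun, LookupFun}, {info_fun, InfoFun}]).
</PRE>
<P>With this module,
<CODE>qlc:eval(qlc:q([V || {K,V} <- kv_table:table([{1,a},{2,b}]), K =:= 2]))</CODE>
would be expected to return <CODE>[b]</CODE>, and the filter on
the key position makes the query a candidate for a key lookup
instead of a full traversal.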
<P>The unary callback function <CODE>FormatFun</CODE> is used by
<CODE>qlc:info/1,2</CODE> for displaying the call that created
the table's query handle. The default value
<CODE>undefined</CODE> is displayed as a call to
<CODE>'$MOD':'$FUN'/0</CODE>, otherwise it is up to
<CODE>FormatFun</CODE> to present the selected objects in a
suitable way. If a character list is chosen for
presentation it must be an Erlang expression that can be
scanned and parsed (a trailing dot will be added by
<CODE>qlc:info</CODE> though). The argument to <CODE>FormatFun</CODE>
describes the optimizations done as a result of analyzing
the filter(s). The possible values are:
<P>
<UL>
<LI>
<CODE>{lookup, Position, Keys}</CODE>.
<CODE>LookupFun</CODE> is used for looking up objects in the
table.
</LI>
<LI>
<CODE>{match_spec, MatchExpression}</CODE>. No way of
finding all possible answers by looking up keys was
found, but the filters could be transformed into a
match specification. All answers are found by calling
<CODE>TraverseFun(MatchExpression)</CODE>.
</LI>
<LI>
<CODE>all</CODE>. No optimization was found. A match
specification matching all objects will be used if
<CODE>TraverseFun</CODE> is unary.
</LI>
</UL>
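<P>A minimal <CODE>FormatFun</CODE> can be sketched as follows. The
module <CODE>fmt_table</CODE> is hypothetical, and the fun
ignores which optimization was chosen and always presents the
table as the call that created it:
<PRE>
-module(fmt_table).
-export([table/1]).

%% A QLC table over a list with a simple format function.
table(L) when is_list(L) ->
    %% Present the table as fmt_table:table(L), whatever was selected.
    FormatFun = fun(_SelectedObjects) -> {?MODULE, table, [L]} end,
    qlc:table(fun() -> L end, [{format_fun, FormatFun}]).
</PRE>
<P>Calling <CODE>qlc:info/1</CODE> on a query that uses
<CODE>fmt_table:table/1</CODE> as a generator should then show
the table as a call to <CODE>fmt_table:table/1</CODE>.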
<P>See <A HREF="ets.html#qlc_table">ets(3)</A>,
<A HREF="dets.html#qlc_table">dets(3)</A> and
<A HREF="javascript:erlhref('../../../../', 'mnesia', 'mnesia.html#qlc_table');">mnesia(3)</A>
for the various options recognized by <CODE>table/1,2</CODE> in
the respective module.
</DIV>
<H3>See Also</H3>
<DIV CLASS=REFBODY>
<P><A HREF="dets.html">dets(3)</A>,
<A HREF="javascript:erlhref('../../../../', 'doc/reference_manual', 'part_frame.html');"> Erlang Reference Manual</A>,
<A HREF="erl_eval.html">erl_eval(3)</A>,
<A HREF="javascript:erlhref('../../../../', 'kernel', 'erlang.html');">erlang(3)</A>,
<A HREF="ets.html">ets(3)</A>,
<A HREF="javascript:erlhref('../../../../', 'kernel', 'file.html');">file(3)</A>,
<A HREF="file_sorter.html">file_sorter(3)</A>,
<A HREF="javascript:erlhref('../../../../', 'mnemosyne', 'mnemosyne.html');">mnemosyne(3)</A>,
<A HREF="javascript:erlhref('../../../../', 'mnesia', 'mnesia.html');">mnesia(3)</A>,
<A HREF="javascript:erlhref('../../../../', 'doc/programming_examples', 'part_frame.html');"> Programming Examples</A>,
<A HREF="shell.html">shell(3)</A>
</DIV>
<H3>AUTHORS</H3>
<DIV CLASS=REFBODY>
Hans Bolinder - support@erlang.ericsson.se<BR>
</DIV>
<CENTER>
<HR>
<SMALL>stdlib 1.14.2<BR>
Copyright © 1991-2006
<A HREF="http://www.erlang.se">Ericsson AB</A><BR>
</SMALL>
</CENTER>
</BODY>
</HTML>