<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<!-- This document was generated using DocBuilder 3.3.3 -->
<HTML>
<HEAD>
<TITLE>ms_transform</TITLE>
<SCRIPT type="text/javascript" src="../../../../doc/erlresolvelinks.js">
</SCRIPT>
<STYLE TYPE="text/css">
<!--
.REFBODY { margin-left: 13mm }
.REFTYPES { margin-left: 8mm }
-->
</STYLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#FF00FF"
ALINK="#FF0000">
<!-- refpage -->
<CENTER>
<A HREF="http://www.erlang.se">
<IMG BORDER=0 ALT="[Ericsson AB]" SRC="min_head.gif">
</A>
<H1>ms_transform</H1>
</CENTER>
<H3>MODULE</H3>
<DIV CLASS=REFBODY>
ms_transform
</DIV>
<H3>MODULE SUMMARY</H3>
<DIV CLASS=REFBODY>
Parse_transform that translates fun syntax into match
specifications.
</DIV>
<H3>DESCRIPTION</H3>
<DIV CLASS=REFBODY>
<A NAME="top"><!-- Empty --></A>
<P> This module implements the parse_transform that makes calls to
<CODE>ets</CODE> and <CODE>dbg</CODE>:<CODE>fun2ms/1</CODE> translate into literal
match specifications. It also implements the back end for the same
functions when called from the Erlang shell.
<P> The translation from funs to match_specs
is accessed through the two "pseudo
functions" <CODE>ets:fun2ms/1</CODE> and <CODE>dbg:fun2ms/1</CODE>.
<P>Actually this introduction is more or less an introduction to the
whole concept of match specifications. Since everyone trying to use
<CODE>ets:select</CODE> or <CODE>dbg</CODE> seems to end up reading
this page, it seems a good place to explain a little more than
just what this module does.
<P>There are some caveats one should be aware of, so please read through
the whole manual page if it's the first time you're using the
transformations.
<P>Match specifications are used more or less as filters.
They resemble usual Erlang matching in a list comprehension or in
a <CODE>fun</CODE> used in conjunction with <CODE>lists:foldl</CODE> etc. The
syntax of pure match specifications is somewhat awkward though, as
they are made up purely of Erlang terms and there is no syntax in the
language to make the match specifications more readable.
<P>As the execution and structure of a match specification are quite like
those of a fun, it would for most programmers be more straightforward
to simply write it using the familiar fun syntax and have that
translated into a match specification automatically. Of course a real
fun is more powerful than the match specifications allow, but bearing
the match specifications in mind, and what they can do, it's still
more convenient to write it all as a fun. This module contains the
code that simply translates the fun syntax into match_spec terms.
<P>Let's start with an ets example. Using <CODE>ets:select</CODE> and
a match specification, one can filter out rows of a table and construct
a list of tuples containing relevant parts of the data in these
rows. Of course one could use <CODE>ets:foldl</CODE> instead, but the
select call is far more efficient. Without the translation, one has to
struggle with writing match specification terms to accommodate this,
or one has to resort to the less powerful
<CODE>ets:match(_object)</CODE> calls, or simply give up and use
the more inefficient method of <CODE>ets:foldl</CODE>. Using the
<CODE>ets:fun2ms</CODE> transformation, an <CODE>ets:select</CODE> call
is at least as easy to write as any of the alternatives.
<P>As an example, consider a simple table of employees:
<PRE>
-record(emp, {empno,     %Employee number as a string, the key
              surname,   %Surname of the employee
              givenname, %Given name of employee
              dept,      %Department one of {dev,sales,prod,adm}
              empyear}). %Year the employee was employed
</PRE>
<P>We create the table using:
<PRE>
ets:new(emp_tab,[{keypos,#emp.empno},named_table,ordered_set]).
</PRE>
<P>Let's also fill it with some randomly chosen data for the examples:
<PRE>
[{emp,"011103","Black","Alfred",sales,2000},
{emp,"041231","Doe","John",prod,2001},
{emp,"052341","Smith","John",dev,1997},
{emp,"076324","Smith","Ella",sales,1995},
{emp,"122334","Weston","Anna",prod,2002},
{emp,"535216","Chalker","Samuel",adm,1998},
{emp,"789789","Harrysson","Joe",adm,1996},
{emp,"963721","Scott","Juliana",dev,2003},
{emp,"989891","Brown","Gabriel",prod,1999}]
</PRE>
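<P>The table creation and the data above can be combined into one small
module. This is only a minimal sketch: the module name
<CODE>emp_demo</CODE> and the function <CODE>setup/0</CODE> are
hypothetical names chosen for illustration and are not part of stdlib.
<PRE>
-module(emp_demo).
-export([setup/0]).

-record(emp, {empno, surname, givenname, dept, empyear}).

%% Creates the named table and inserts the sample rows.
%% Returns the number of rows in the table afterwards.
%% (emp_demo and setup/0 are illustrative names only.)
setup() ->
    emp_tab = ets:new(emp_tab, [{keypos,#emp.empno},named_table,ordered_set]),
    true = ets:insert(emp_tab,
                      [{emp,"011103","Black","Alfred",sales,2000},
                       {emp,"041231","Doe","John",prod,2001},
                       {emp,"052341","Smith","John",dev,1997},
                       {emp,"076324","Smith","Ella",sales,1995},
                       {emp,"122334","Weston","Anna",prod,2002},
                       {emp,"535216","Chalker","Samuel",adm,1998},
                       {emp,"789789","Harrysson","Joe",adm,1996},
                       {emp,"963721","Scott","Juliana",dev,2003},
                       {emp,"989891","Brown","Gabriel",prod,1999}]),
    ets:info(emp_tab, size).
</PRE>
<P>As <CODE>ets:insert/2</CODE> accepts a list of objects, all rows can be
inserted in one call.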
<P>
Now, the amount of data in the table is of course too small to justify
complicated ets searches, but on real tables, using <CODE>select</CODE> to get
exactly the data you want will increase efficiency remarkably.
<P>Let's say, for example, that we want the employee numbers of
everyone in the sales department. One might use <CODE>ets:match</CODE>
in such a situation:
<PRE>
1> ets:match(emp_tab, {'_', '$1', '_', '_', sales, '_'}).
[["011103"],["076324"]]
</PRE>
<P>Even though <CODE>ets:match</CODE> does not require a full match
specification, only a simpler pattern, it's still somewhat unreadable, and
one has little control over the returned result: it's always a list of
lists. OK, one might use <CODE>ets:foldl</CODE> or
<CODE>ets:foldr</CODE> instead:
<PRE>
ets:foldr(fun(#emp{empno = E, dept = sales},Acc) -> [E | Acc];
             (_,Acc) -> Acc
          end,
          [],
          emp_tab).
</PRE>
<P>Running that would result in <CODE>["011103","076324"]</CODE>,
which at least gets rid of the extra lists. The fun is also quite
straightforward, so the only problem is that all the data from the
table has to be transferred from the table to the calling process for
filtering. That's inefficient compared to the <CODE>ets:match</CODE>
call where the filtering can be done "inside" the emulator and only
the result is transferred to the process. Remember that ets tables are
all about efficiency; if it weren't for efficiency, all of ets could be
implemented in Erlang, as a process receiving requests and sending
answers back. One uses ets because one wants performance, and
therefore one wouldn't want all of the table transferred to the
process for filtering. OK, let's look at a pure
<CODE>ets:select</CODE> call that does what the <CODE>ets:foldr</CODE>
does:
<PRE>
ets:select(emp_tab,[{#emp{empno = '$1', dept = sales, _='_'},[],['$1']}]).
</PRE>
<P>
Even though the record syntax is used, it's still somewhat hard to
read and even harder to write. The first element of the tuple,
<CODE>#emp{empno = '$1', dept = sales, _='_'}</CODE>, tells what to
match; elements not matching this will not be returned at all, as in
the <CODE>ets:match</CODE> example. The second element, the empty list,
is a list of guard expressions, of which we need none, and the third
element is the list of expressions constructing the return value (in
ets this is almost always a list containing one single term). In our
case <CODE>'$1'</CODE> is bound to the employee number in the head
(first element of tuple), and hence it is the employee number that is
returned. The result is <CODE>["011103","076324"]</CODE>, just as in
the <CODE>ets:foldr</CODE> example, but the result is retrieved much
more efficiently in terms of execution speed and memory consumption.
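<P>To make the three parts of the match specification tuple easier to
see, the same call can be written with each part bound to its own
variable. This is just a sketch; the module name
<CODE>emp_ms_parts</CODE> and the variable names are made up for the
illustration, and the table <CODE>emp_tab</CODE> is assumed to exist.
<PRE>
-module(emp_ms_parts).
-export([sales_empnos/0]).

-record(emp, {empno, surname, givenname, dept, empyear}).

%% Assumes the named table emp_tab has already been created and filled.
%% (emp_ms_parts and sales_empnos/0 are illustrative names only.)
sales_empnos() ->
    Head   = #emp{empno = '$1', dept = sales, _ = '_'}, % what to match
    Guards = [],                                        % no extra conditions
    Body   = ['$1'],                                    % what to return
    ets:select(emp_tab, [{Head, Guards, Body}]).
</PRE>
<P>A match specification is a list of such three-element tuples, which
is why the tuple is wrapped in a list in the call.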
<P>We have one efficient but hardly readable way of doing it and one
inefficient but fairly readable (at least to the skilled Erlang
programmer) way of doing it. With the use of <CODE>ets:fun2ms</CODE>,
one could have something that is as efficient as possible but still is
written as a filter using the fun syntax:
<PRE>
-include_lib("stdlib/include/ms_transform.hrl").

% ...

ets:select(emp_tab, ets:fun2ms(
                      fun(#emp{empno = E, dept = sales}) ->
                              E
                      end)).
</PRE>
<P>This may not be the shortest of the expressions, but it requires no
special knowledge of match specifications to read. The fun's head
should simply match what you want to filter out and the body returns
what you want returned. As long as the fun can be kept within the
limits of the match specifications, there is no need to transfer all
data of the table to the process for filtering as in the
<CODE>ets:foldr</CODE> example. In fact it's even easier to read than
the <CODE>ets:foldr</CODE> example, as the select call in itself
discards anything that doesn't match, while the fun of the
<CODE>foldr</CODE> call needs to handle both the elements matching and
the ones not matching.
<P>It's worth noting in the above <CODE>ets:fun2ms</CODE> example that one
needs to include <CODE>ms_transform.hrl</CODE> in the source code, as this is
what triggers the parse transformation of the <CODE>ets:fun2ms</CODE> call
to a valid match specification. This also implies that the
transformation is done at compile time (except when called from the
shell of course) and therefore will take no resources at all at
runtime. So although you use the more intuitive fun syntax, it gets as
efficient at runtime as writing match specifications by hand.
<P>Let's look at some more <CODE>ets</CODE> examples. Let's say one
wants to get all the employee numbers of any employee hired before the
year 2000. Using <CODE>ets:match</CODE> isn't an alternative here as
relational operators cannot be expressed there. Once again, an
<CODE>ets:foldr</CODE> could do it (slowly, but correctly):
<PRE>
ets:foldr(fun(#emp{empno = E, empyear = Y},Acc) when Y < 2000 -> [E | Acc];
             (_,Acc) -> Acc
          end,
          [],
          emp_tab).
</PRE>
<P>The result will be
<CODE>["052341","076324","535216","789789","989891"]</CODE>, as
expected. Now the equivalent expression using a handwritten match
specification would look something like this:
<PRE>
ets:select(emp_tab,[{#emp{empno = '$1', empyear = '$2', _='_'},
                     [{'<', '$2', 2000}],
                     ['$1']}]).
</PRE>
<P>This gives the same result. The <CODE>[{'<', '$2', 2000}]</CODE> is in
the guard part and therefore discards anything that does not have an
empyear (bound to '$2' in the head) less than 2000, just as the guard
in the <CODE>foldr</CODE> example. Let's move on to writing it using
<CODE>ets:fun2ms</CODE>:
<PRE>
-include_lib("stdlib/include/ms_transform.hrl").

% ...

ets:select(emp_tab, ets:fun2ms(
                      fun(#emp{empno = E, empyear = Y}) when Y < 2000 ->
                              E
                      end)).
</PRE>
<P>Obviously readability is gained by using the parse transformation.
<P>I'll show some more examples without the tiresome
comparing-to-alternatives stuff. Let's say we'd want the whole object
matching instead of only one element. We could of course assign a
variable to every part of the record and build it up once again in the
body of the <CODE>fun</CODE>, but it's easier to do like this:
<PRE>
ets:select(emp_tab, ets:fun2ms(
                      fun(Obj = #emp{empno = E, empyear = Y})
                         when Y < 2000 ->
                              Obj
                      end)).
</PRE>
<P>Just as in ordinary Erlang matching, you can bind a variable to the
whole matched object using a "match in the match", i.e. a
<CODE>=</CODE>. Unfortunately this is not general in funs translated
to match specifications; it is allowed only on the "top level", i.e. when
matching the <STRONG>whole</STRONG> object arriving to be matched into a
separate variable. For those used to writing match specifications by
hand, I'll have to mention that the variable <CODE>Obj</CODE> will simply be
translated into <CODE>'$_'</CODE>. It's not general, but it is very commonly
used, which is why it is handled as a special, but useful, case. If this
bothers you, the pseudo function <CODE>object</CODE> also returns the whole
matched object, see the part about caveats and limitations below.
<P>Let's do something in the <CODE>fun</CODE>'s body too: Let's say
that someone realizes that there are a few people having an employee
number beginning with a zero (<CODE>0</CODE>), which shouldn't be
allowed. All those should have their numbers changed to begin with a
one (<CODE>1</CODE>) instead and one wants the
list <CODE>[{&lt;Old empno&gt;,&lt;New empno&gt;}]</CODE> created:
<PRE>
ets:select(emp_tab, ets:fun2ms(
                      fun(#emp{empno = [$0 | Rest] }) ->
                              {[$0|Rest],[$1|Rest]}
                      end)).
</PRE>
<P>As a matter of fact, this query hits the feature of partially bound
keys in the table type <CODE>ordered_set</CODE>, so that the whole
table need not be searched; only the part of the table containing keys
beginning with <CODE>0</CODE> is in fact looked into.
<P>The fun can of course have several clauses, so one could do
the following: For each employee, if he or she was hired prior to 1997,
return the tuple <CODE>{inventory, &lt;employee number&gt;}</CODE>; for each hired
in 1997 or later, but no later than 2001, return <CODE>{rookie, &lt;employee
number&gt;}</CODE>; for all others return <CODE>{newbie, &lt;employee
number&gt;}</CODE>. The exception is anyone named <CODE>Smith</CODE>, as
they would be affronted by anything other than the tag
<CODE>guru</CODE>, so what's returned for their numbers is
<CODE>{guru, &lt;employee number&gt;}</CODE>:
<PRE>
ets:select(emp_tab, ets:fun2ms(
                      fun(#emp{empno = E, surname = "Smith" }) ->
                              {guru,E};
                         (#emp{empno = E, empyear = Y}) when Y < 1997 ->
                              {inventory, E};
                         (#emp{empno = E, empyear = Y}) when Y > 2001 ->
                              {newbie, E};
                         (#emp{empno = E, empyear = Y}) -> % 1997 -- 2001
                              {rookie, E}
                      end)).
</PRE>
<P>The result will be:
<PRE>
[{rookie,"011103"},
{rookie,"041231"},
{guru,"052341"},
{guru,"076324"},
{newbie,"122334"},
{rookie,"535216"},
{inventory,"789789"},
{newbie,"963721"},
{rookie,"989891"}]
</PRE>
<P>and so the Smiths will be happy...
<P>So, what more can you do? Well, the simple answer would be: look
in the documentation of match specifications in the ERTS User's
Guide. However, let's briefly go through the most useful "built in
functions" that you can use when the <CODE>fun</CODE> is to be
translated into a match specification by <CODE>ets:fun2ms</CODE>. It's
worth mentioning, although it might be obvious to some, that calling
functions other than the ones allowed in match specifications cannot
be done. No "usual" Erlang code can be executed by the <CODE>fun</CODE> being
translated by <CODE>fun2ms</CODE>; the <CODE>fun</CODE> is after all limited
exactly to the power of the match specifications, which is
unfortunate, but is the price one has to pay for the execution speed of
an <CODE>ets:select</CODE> compared to <CODE>ets:foldl/foldr</CODE>.
<P>The head of the <CODE>fun</CODE> is obviously a head matching (or mismatching)
<STRONG>one</STRONG> parameter, one object of the table we <CODE>select</CODE>
from. The object is always a single variable (can be <CODE>_</CODE>) or
a tuple, as that's what's in <CODE>ets</CODE>, <CODE>dets</CODE> and
<CODE>mnesia</CODE> tables (the match specification returned by
<CODE>ets:fun2ms</CODE> can of course be used with
<CODE>dets:select</CODE> and <CODE>mnesia:select</CODE> as well as
with <CODE>ets:select</CODE>). The use of <CODE>=</CODE> in the head
is allowed (and encouraged) on the top level.
<P>The guard section can contain any guard expression of Erlang.
Even the "old" type tests are allowed on the top level of the guard
(<CODE>integer(X)</CODE> instead of <CODE>is_integer(X)</CODE>). As the new type tests (the
<CODE>is_</CODE> tests) are in practice just guard BIFs, they can also
be called from within the body of the fun, just as they can in ordinary
Erlang code. Arithmetic is also allowed, as well as ordinary guard
BIFs. Here's a list of BIFs and expressions:
<P>
<UL>
<LI>
The type tests: is_atom, is_constant, is_float, is_integer,
is_list, is_number, is_pid, is_port, is_reference, is_tuple,
is_binary, is_function, is_record
</LI>
<LI>
The boolean operators: not, and, or, andalso, orelse
</LI>
<LI>
The relational operators: >, >=, <, =<, =:=, ==, =/=, /=
</LI>
<LI>
Arithmetic: +, -, *, div, rem
</LI>
<LI>
Bitwise operators: band, bor, bxor, bnot, bsl, bsr
</LI>
<LI>
The guard BIFs: abs, element, hd, length, node, round, size, tl,
trunc, self
</LI>
<LI>
The obsolete type tests (only in guards):
atom, constant, float, integer,
list, number, pid, port, reference, tuple,
binary, function, record
</LI>
</UL>
<P>Unlike in "handwritten" match specifications, the
<CODE>is_record</CODE> guard works as in ordinary Erlang code.
<P>Semicolons (<CODE>;</CODE>) in guards are allowed. The result will be (as
expected) one "match_spec-clause" for each semicolon-separated
part of the guard, the semantics being identical to the Erlang
semantics.
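<P>As an illustration of a semicolon in a guard, the following sketch
selects the employee numbers of everyone in either the sales or the dev
department. The module name <CODE>emp_guard_alt</CODE> and the function
name are hypothetical, and the table <CODE>emp_tab</CODE> from the
earlier examples is assumed to exist.
<PRE>
-module(emp_guard_alt).
-export([sales_or_dev/0]).

-include_lib("stdlib/include/ms_transform.hrl").

-record(emp, {empno, surname, givenname, dept, empyear}).

%% Employee numbers of everyone in sales or dev.
%% The ';' in the guard gives two match_spec clauses, one per alternative.
%% (emp_guard_alt and sales_or_dev/0 are illustrative names only.)
sales_or_dev() ->
    ets:select(emp_tab, ets:fun2ms(
                          fun(#emp{empno = E, dept = D})
                             when D =:= sales; D =:= dev ->
                                  E
                          end)).
</PRE>
<P>Writing a comma instead of the semicolon would require both tests to
hold at once, which no department atom can satisfy.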
<P> The body of the <CODE>fun</CODE> is used to construct the
resulting value. When selecting from tables one usually just constructs
a suitable term here, using ordinary Erlang term construction, like
tuple braces, list brackets and variables matched out in the
head, possibly in conjunction with the occasional constant. Whatever
expressions are allowed in guards are also allowed here, but there are
no special functions except <CODE>object</CODE> and
<CODE>bindings</CODE> (see further down), which return the whole
matched object and all known variable bindings respectively.
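<P>A small sketch of term construction in the body, combining a tuple,
a variable from the head and an arithmetic expression. The module name
<CODE>emp_body_demo</CODE>, the function name and the parameter
<CODE>CurrentYear</CODE> are assumptions made for the example; the
table <CODE>emp_tab</CODE> is assumed to exist.
<PRE>
-module(emp_body_demo).
-export([years_employed/1]).

-include_lib("stdlib/include/ms_transform.hrl").

-record(emp, {empno, surname, givenname, dept, empyear}).

%% Returns {Surname, Years} for every employee with an integer empyear.
%% The subtraction is evaluated inside the match specification body.
%% (emp_body_demo and years_employed/1 are illustrative names only.)
years_employed(CurrentYear) ->
    ets:select(emp_tab, ets:fun2ms(
                          fun(#emp{surname = S, empyear = Y})
                             when is_integer(Y) ->
                                  {S, CurrentYear - Y}
                          end)).
</PRE>
<P>Only expressions that are allowed in guards can be used this way; a
call to an ordinary Erlang function in the body would not translate.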
<P>The <CODE>dbg</CODE> variants of match specifications have an
imperative approach to the match specification body; the ets dialect
does not. The fun body for <CODE>ets:fun2ms</CODE> returns the result
without side effects, and as matching (<CODE>=</CODE>) in the body of
the match specifications is not allowed (for performance reasons) the
only thing left, more or less, is term construction...
<P>Let's move on to the <CODE>dbg</CODE> dialect, the slightly
different match specifications translated by <CODE>dbg:fun2ms</CODE>.
<P>The same reasons for using the parse transformation apply to
<CODE>dbg</CODE>, maybe even more so as filtering using Erlang code is
simply not a good idea when tracing (except afterwards, if you trace
to file). The concept is similar to that of <CODE>ets:fun2ms</CODE>
except that you usually use it directly from the shell (which can also
be done with <CODE>ets:fun2ms</CODE>).
<P>Let's manufacture a toy module to trace on:
<PRE>
-module(toy).

-export([start/1, store/2, retrieve/1]).

start(Args) ->
    toy_table = ets:new(toy_table,Args).

store(Key, Value) ->
    ets:insert(toy_table,{Key,Value}).

retrieve(Key) ->
    [{Key, Value}] = ets:lookup(toy_table,Key),
    Value.
</PRE>
<P>During module testing, the first test bails out with a
<CODE>{badmatch,16}</CODE> in <CODE>{toy,start,1}</CODE>, why?
<P>We suspect the ets call, as we match hard on the return value, but
want only the particular <CODE>new</CODE> call with
<CODE>toy_table</CODE> as first parameter.
So we start a default tracer on the node:
<PRE>
1> dbg:tracer().
{ok,<0.88.0>}
</PRE>
<P>And so we turn on call tracing for all processes. We are going to
make a pretty restrictive trace pattern, so there's no need to call
trace only a few processes (there usually isn't):
<PRE>
2> dbg:p(all,call).
{ok,[{matched,nonode@nohost,25}]}
</PRE>
<P>It's time to specify the filter. We want to view calls that resemble
<CODE>ets:new(toy_table,&lt;something&gt;)</CODE>:
<PRE>
3> dbg:tp(ets,new,dbg:fun2ms(fun([toy_table,_]) -> true end)).
{ok,[{matched,nonode@nohost,1},{saved,1}]}
</PRE>
<P>As can be seen, the <CODE>fun</CODE>s used with
<CODE>dbg:fun2ms</CODE> take a single list as parameter instead of a
single tuple. The list matches a list of the parameters to the traced
function. A single variable may also be used of course. The body
of the fun expresses in a more imperative way actions to be taken if
the fun head (and the guards) matches. I return <CODE>true</CODE> here, but
only because the body of a fun cannot be empty; the return value will
be discarded.
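<P>The single-variable form of the head can be sketched in compiled
code as follows. The module name <CODE>toy_trace</CODE> and both
function names are hypothetical helpers invented for this example.
<PRE>
-module(toy_trace).
-export([any_call_ms/0, trace_any_call/2]).

-include_lib("stdlib/include/ms_transform.hrl").

%% A dbg match specification whose head is a single anonymous variable,
%% so it matches any argument list regardless of arity.
any_call_ms() ->
    dbg:fun2ms(fun(_) -> true end).

%% Hypothetical helper: sets the pattern on all arities of Mod:Fun,
%% including local calls.
trace_any_call(Mod, Fun) ->
    dbg:tpl(Mod, Fun, any_call_ms()).
</PRE>
<P>Here <CODE>any_call_ms()</CODE> evaluates to
<CODE>[{'_',[],[true]}]</CODE>, an ordinary term that can be passed on
to <CODE>dbg:tp</CODE> or <CODE>dbg:tpl</CODE>.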
<P>When we run the test of our module now, we get the following trace
output:
<PRE>
(<0.86.0>) call ets:new(toy_table,[ordered_set])
</PRE>
<P>Let's pretend we haven't spotted the problem yet, and want to see what
<CODE>ets:new</CODE> returns. We do a slightly different trace
pattern:
<PRE>
4> dbg:tp(ets,new,dbg:fun2ms(fun([toy_table,_]) -> return_trace() end)).
</PRE>
<P>Resulting in the following trace output when we run the test:
<PRE>
(<0.86.0>) call ets:new(toy_table,[ordered_set])
(<0.86.0>) returned from ets:new/2 -> 24
</PRE>
<P>The call to <CODE>return_trace</CODE> makes a trace message appear
when the function returns. It applies only to the specific function call
triggering the match specification (and matching the head/guards of
the match specification). This is by far the most common call in the
body of a <CODE>dbg</CODE> match specification.
<P>As the test now fails with <CODE>{badmatch,24}</CODE>, it's obvious
that the badmatch is because the atom <CODE>toy_table</CODE> does not
match the number returned for an unnamed table. So we spotted the
problem: the table should be named, and the arguments supplied by our
test program do not include <CODE>named_table</CODE>. We rewrite the
start function to:
<PRE>
start(Args) ->
    toy_table = ets:new(toy_table,[named_table |Args]).
</PRE>
<P>And with the same tracing turned on, we get the following trace
output:
<PRE>
(<0.86.0>) call ets:new(toy_table,[named_table,ordered_set])
(<0.86.0>) returned from ets:new/2 -> toy_table
</PRE>
<P>Very well. Let's say the module now passes all testing and goes into
the system. After a while someone realizes that the table
<CODE>toy_table</CODE> grows while the system is running and that for some
reason there are a lot of elements with atoms as keys. You had
expected only integer keys and so does the rest of the system. Well,
obviously not all of the system. You turn on call tracing and try to
see calls to your module with an atom as the key:
<PRE>
1> dbg:tracer().
{ok,<0.88.0>}
2> dbg:p(all,call).
{ok,[{matched,nonode@nohost,25}]}
3> dbg:tpl(toy,store,dbg:fun2ms(fun([A,_]) when is_atom(A) -> true end)).
{ok,[{matched,nonode@nohost,1},{saved,1}]}
</PRE>
<P>We use <CODE>dbg:tpl</CODE> here to make sure to catch local calls
(let's say the module has grown since the smaller version and we're
not sure this inserting of atoms is not done locally...). When in
doubt always use local call tracing.
<P>Let's say nothing happens when we trace in this way. Our function
is never called with these parameters. We conclude that
someone else (some other module) is doing it and we realize that we
must trace on ets:insert and want to see the calling function. The
calling function may be retrieved using the match specification
function <CODE>caller</CODE> and to get it into the trace message, one
has to use the match spec function <CODE>message</CODE>. The filter
call looks like this (looking for calls to <CODE>ets:insert</CODE>):
<PRE>
4> dbg:tpl(ets,insert,dbg:fun2ms(fun([toy_table,{A,_}]) when is_atom(A) ->
                                    message(caller())
                                  end)).
{ok,[{matched,nonode@nohost,1},{saved,2}]}
</PRE>
<P>The caller will now appear in the "additional message" part of the
trace output, and so after a while, the following output comes:
<PRE>
(<0.86.0>) call ets:insert(toy_table,{garbage,can}) ({evil_mod,evil_fun,2})
</PRE>
<P>You have found out that the function <CODE>evil_fun</CODE> of the
module <CODE>evil_mod</CODE>, with arity <CODE>2</CODE>, is the one
causing all this trouble.
<P> This was just a toy example, but it illustrated the most used
calls in match specifications for <CODE>dbg</CODE>. The other, more
esoteric calls are listed and explained in the <STRONG>User's Guide of
the ERTS application</STRONG>; they really are beyond the scope of this
document.
<P>To end this chatty introduction with something more precise, here
follow some notes on caveats and restrictions concerning the funs
used in conjunction with <CODE>ets:fun2ms</CODE> and
<CODE>dbg:fun2ms</CODE>:
<P>
<TABLE CELLPADDING=4>
<TR>
<TD VALIGN=TOP><IMG ALT="Warning!" SRC="warning.gif"></TD>
<TD>
<P> To use the pseudo functions triggering the translation, one
<STRONG>has to</STRONG> include the header file <CODE>ms_transform.hrl</CODE>
in the source code. Failure to do so will possibly result in
runtime errors rather than compile-time errors, as the expression may
be valid as a plain Erlang program without translation.
</TD>
</TR>
</TABLE>
<P>
<TABLE CELLPADDING=4>
<TR>
<TD VALIGN=TOP><IMG ALT="Warning!" SRC="warning.gif"></TD>
<TD>
<P> The <CODE>fun</CODE> has to be literally constructed inside the
parameter list to the pseudo functions. The <CODE>fun</CODE> cannot
be bound to a variable first and then passed to
<CODE>ets:fun2ms</CODE> or <CODE>dbg:fun2ms</CODE>, i.e. this
will work: <CODE>ets:fun2ms(fun(A) -> A end)</CODE> but not this:
<CODE>F = fun(A) -> A end, ets:fun2ms(F)</CODE>. The latter will result
in a compile time error if the header is included, otherwise a
runtime error. Even if the latter construction would ever
appear to work, it really doesn't, so don't ever use it.
</TD>
</TR>
</TABLE>
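<P>If the same filter is needed in several places, one way around the
literal-fun restriction is to keep the <CODE>fun</CODE> inside the
pseudo function call and pass around the resulting match specification
instead, as it is an ordinary term. A minimal sketch, with the
hypothetical module name <CODE>emp_ms_reuse</CODE> and the table
<CODE>emp_tab</CODE> assumed to exist:
<PRE>
-module(emp_ms_reuse).
-export([sales_ms/0, sales_empnos/0]).

-include_lib("stdlib/include/ms_transform.hrl").

-record(emp, {empno, surname, givenname, dept, empyear}).

%% The fun is written literally inside ets:fun2ms/1. What the function
%% returns is the translated match specification, not a fun.
sales_ms() ->
    ets:fun2ms(fun(#emp{empno = E, dept = sales}) -> E end).

%% The match specification can then be bound, passed around and reused.
sales_empnos() ->
    MS = sales_ms(),
    ets:select(emp_tab, MS).
</PRE>
<P>Binding the <STRONG>result</STRONG> of <CODE>ets:fun2ms</CODE> to a
variable is fine; it is only the <CODE>fun</CODE> argument itself that
must be literal.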
<P> Several restrictions apply to the fun that is being translated
into a match_spec. To put it simply, you cannot use anything in
the fun that you cannot use in a match_spec. This means that,
among others, the following restrictions apply to the fun itself:
<P>
<UL>
<LI>
Functions written in Erlang cannot be called, neither
local functions, global functions, nor real funs.
</LI>
<LI>
Everything that is written as a function call will be
translated into a match_spec call to a builtin function, so that
the call <CODE>is_list(X)</CODE> will be translated to <CODE>{'is_list',
'$1'}</CODE> (<CODE>'$1'</CODE> is just an example, the numbering may
vary). If one tries to call a function that is not a match_spec
builtin, it will cause an error.
</LI>
<LI>
Variables occurring in the head of the <CODE>fun</CODE> will be
replaced by match_spec variables in the order of occurrence, so
that the fragment <CODE>fun({A,B,C})</CODE> will be replaced by
<CODE>{'$1', '$2', '$3'}</CODE> etc. Every occurrence of such a
variable later in the match_spec will be replaced by a
match_spec variable in the same way, so that the fun
<CODE>fun({A,B}) when is_atom(A) -> B end</CODE> will be translated into
<CODE>[{{'$1','$2'},[{is_atom,'$1'}],['$2']}]</CODE>.
</LI>
<LI>
Variables that do not appear in the head are imported
from the environment and made into
match_spec <CODE>const</CODE> expressions. Example from the shell:
<PRE>
1> X = 25.
25
2> ets:fun2ms(fun({A,B}) when A > X -> B end).
[{{'$1','$2'},[{'>','$1',{const,25}}],['$2']}]
</PRE>
</LI>
<LI>
Matching with <CODE>=</CODE> cannot be used in the body. It can only
be used on the top level in the head of the fun.
Example from the shell again:
<PRE>
1> ets:fun2ms(fun({A,[B|C]} = D) when A > B -> D end).
[{{'$1',['$2'|'$3']},[{'>','$1','$2'}],['$_']}]
2> ets:fun2ms(fun({A,[B|C]=D}) when A > B -> D end).
Error: fun with head matching ('=' in head) cannot be translated into
match_spec
{error,transform_error}
3> ets:fun2ms(fun({A,[B|C]}) when A > B -> D = [B|C], D end).
Error: fun with body matching ('=' in body) is illegal as match_spec
{error,transform_error}
</PRE>
All variables are bound in the head of a match_spec, so the
translator cannot allow multiple bindings. The special case
when matching is done on the top level makes the variable bind
to <CODE>'$_'</CODE> in the resulting match_spec; this is to allow a more
natural access to the whole matched object. The pseudo
function <CODE>object()</CODE> could be used instead, see below.
The following expressions are translated equally:
<PRE>
ets:fun2ms(fun({a,_} = A) -> A end).
ets:fun2ms(fun({a,_}) -> object() end).
</PRE>
</LI>
<LI>
The special match_spec variables <CODE>'$_'</CODE> and <CODE>'$*'</CODE>
can be accessed through the pseudo functions <CODE>object()</CODE>
(for <CODE>'$_'</CODE>) and <CODE>bindings()</CODE> (for <CODE>'$*'</CODE>).
As an example, one could translate the following
<CODE>ets:match_object/2</CODE> call to an <CODE>ets:select</CODE> call:
<PRE>
ets:match_object(Table, {'$1',test,'$2'}).
</PRE>
...is the same as...
<PRE>
ets:select(Table, ets:fun2ms(fun({A,test,B}) -> object() end)).
</PRE>
(This was just an example, in this simple case the former
expression is probably preferable in terms of readability).
The <CODE>ets:select/2</CODE> call will conceptually look like this
in the resulting code:
<PRE>
ets:select(Table, [{{'$1',test,'$2'},[],['$_']}]).
</PRE>
Matching on the top level of the fun head might feel like a
more natural way to access <CODE>'$_'</CODE>, see above.
</LI>
<LI>
Term constructions/literals are translated as much as is
needed to get them into valid match_specs, so that tuples are
made into match_spec tuple constructions (a one element tuple
containing the tuple) and constant expressions are used when
importing variables from the environment. Records are also
translated into plain tuple constructions, calls to element
etc. The guard test <CODE>is_record/2</CODE> is translated into
match_spec code using the three parameter version that's built
into match_specs, so that <CODE>is_record(A,t)</CODE> is translated
into <CODE>{is_record,'$1',t,5}</CODE> given that the record size of
record type <CODE>t</CODE> is 5.
</LI>
<LI>
Language constructions like <CODE>case</CODE>, <CODE>if</CODE>,
<CODE>catch</CODE> etc. that are not present in match_specs are not
allowed.
</LI>
<LI>
If the header file <CODE>ms_transform.hrl</CODE> is not included,
the fun won't be translated, which may result in a
<STRONG>runtime error</STRONG> (depending on whether the fun is valid in a
pure Erlang context). Be absolutely sure that the header is
included when using <CODE>ets</CODE> and <CODE>dbg:fun2ms/1</CODE> in
compiled code.
</LI>
<LI>
If the pseudo function triggering the translation is
<CODE>ets:fun2ms/1</CODE>, the fun's head must contain a single
variable or a single tuple. If the pseudo function is
<CODE>dbg:fun2ms/1</CODE> the fun's head must contain a single
variable or a single list.
</LI>
</UL>
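<P>The rule about imported variables also applies in compiled code,
which gives a simple way to parameterize a match specification. The
following is a sketch under the assumption that the table
<CODE>emp_tab</CODE> from the introduction exists; the module name
<CODE>emp_hired</CODE> and the function name are hypothetical.
<PRE>
-module(emp_hired).
-export([hired_before/1]).

-include_lib("stdlib/include/ms_transform.hrl").

-record(emp, {empno, surname, givenname, dept, empyear}).

%% Year is not bound in the head of the fun, so it is imported from the
%% enclosing function and ends up as a const expression in the guard.
%% (emp_hired and hired_before/1 are illustrative names only.)
hired_before(Year) when is_integer(Year) ->
    ets:select(emp_tab, ets:fun2ms(
                          fun(#emp{empno = E, empyear = Y})
                             when Y < Year ->
                                  E
                          end)).
</PRE>
<P>With the sample data, <CODE>hired_before(2000)</CODE> corresponds to
the hired-before-2000 query shown in the introduction.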
<P> The translation from funs to match_specs is done at compile
time, so runtime performance is not affected by using these pseudo
functions. The compile time might be somewhat longer though.
<P> For more information about match_specs, please read about them
in the <STRONG>ERTS User's Guide</STRONG>.
</DIV>
<H3>EXPORTS</H3>
<P><A NAME="parse_transform/2"><STRONG><CODE>parse_transform(Forms,_Options) -> Forms</CODE></STRONG></A><BR>
<DIV CLASS=REFBODY><P>Types:
<DIV CLASS=REFTYPES>
<P>
<STRONG><CODE>Forms = Erlang abstract code format, see the
erl_parse module description </CODE></STRONG><BR>
<STRONG><CODE>_Options = Option list, required but not used</CODE></STRONG><BR>
</DIV>
</DIV>
<DIV CLASS=REFBODY>
<P>Implements the actual transformation at compile time. This
function is called by the compiler to do the source code
transformation if and when the <CODE>ms_transform.hrl</CODE> header
file is included in your source code. See the <CODE>ets</CODE> and
<CODE>dbg</CODE>:<CODE>fun2ms/1</CODE> function manual pages for
documentation on how to use this parse_transform. See the
<CODE>match_spec</CODE> chapter in the <CODE>ERTS</CODE> User's Guide for a
description of match specifications.
</DIV>
<P><A NAME="transform_from_shell/3"><STRONG><CODE>transform_from_shell(Dialect,Clauses,BoundEnvironment) -> term()</CODE></STRONG></A><BR>
<DIV CLASS=REFBODY><P>Types:
<DIV CLASS=REFTYPES>
<P>
<STRONG><CODE>Dialect = ets | dbg</CODE></STRONG><BR>
<STRONG><CODE>Clauses = Erlang abstract form for a single fun</CODE></STRONG><BR>
<STRONG><CODE>BoundEnvironment = [{atom(), term()}, ...], list of
variable bindings in the shell environment</CODE></STRONG><BR>
</DIV>
</DIV>
<DIV CLASS=REFBODY>
<P>Implements the actual transformation when the <CODE>fun2ms</CODE>
functions are called from the shell. In this case the abstract
form is for one single fun (parsed by the Erlang shell), and
all imported variables should be in the key-value list passed
as <CODE>BoundEnvironment</CODE>. The result is a term, normalized,
i.e. not in abstract format.
</DIV>
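<DIV CLASS=REFBODY>
<P>Normally this function is reached through <CODE>ets:fun2ms/1</CODE>
or <CODE>dbg:fun2ms/1</CODE> in the shell, but as a rough sketch it can
also be called directly. The example below assumes that the clauses can
be obtained by scanning and parsing the fun's source text with
<CODE>erl_scan</CODE> and <CODE>erl_parse</CODE>; the module name
<CODE>ms_shell_demo</CODE> is hypothetical.
<PRE>
-module(ms_shell_demo).
-export([demo/0]).

%% Parses the text of a fun, extracts its clauses and translates them
%% using the ets dialect. The variable X is free in the fun, so its
%% value is supplied through the bound environment.
%% (ms_shell_demo and demo/0 are illustrative names only.)
demo() ->
    {ok, Tokens, _} = erl_scan:string("fun({A,B}) when A > X -> B end."),
    {ok, [{'fun', _, {clauses, Clauses}}]} = erl_parse:parse_exprs(Tokens),
    ms_transform:transform_from_shell(ets, Clauses, [{'X', 25}]).
</PRE>
<P>The call is expected to give the same term as the shell example with
<CODE>X = 25</CODE> in the description above.
</DIV>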
<P><A NAME="format_error/1"><STRONG><CODE>format_error(Errcode) -> ErrMessage</CODE></STRONG></A><BR>
<DIV CLASS=REFBODY><P>Types:
<DIV CLASS=REFTYPES>
<P>
<STRONG><CODE>Errcode = term()</CODE></STRONG><BR>
<STRONG><CODE>ErrMessage = string()</CODE></STRONG><BR>
</DIV>
</DIV>
<DIV CLASS=REFBODY>
<P>Takes an error code returned by one of the other functions
in the module and creates a textual description of the
error. Fairly uninteresting function actually.
</DIV>
<H3>AUTHORS</H3>
<DIV CLASS=REFBODY>
Patrik Nyblom - support@erlang.ericsson.se<BR>
</DIV>
<CENTER>
<HR>
<SMALL>stdlib 1.14.2<BR>
Copyright © 1991-2006
<A HREF="http://www.erlang.se">Ericsson AB</A><BR>
</SMALL>
</CENTER>
</BODY>
</HTML>