=head1 NAME

PDL::PP - Generate PDL routines from concise descriptions

=head1 SYNOPSIS

e.g.

	pp_def(
		'sumover',
		Pars => 'a(n); [o]b();',
		Code => 'double tmp=0;
			loop(n) %{ tmp += $a(); %}
			$b() = tmp;
			'
	);

	pp_done();


=head1 DESCRIPTION

Much of what follows assumes that the reader is familiar with
the concepts of implicit and explicit threading and index manipulations
within PDL. If you have not yet heard of these concepts or are not very
comfortable with them, it is time to check L<PDL::Indexing>.

As you may gather from its name, PDL::PP is a Pre-Processor, i.e.
it expands code via substitutions to make real C code (well, actually
it outputs XS code, see L<perlxs>, but that is very close to
C).

=head1 Overview

Why do we need PP? Several reasons: firstly, we want to be able to
generate subroutine code for each of the PDL datatypes (PDL_Byte,
PDL_Short, etc.) AUTOMATICALLY. Secondly, when referring to slices
of PDL arrays in Perl (e.g. C<$a-E<gt>slice('0:10:2,:')> or other things such
as transposes) it is nice to be able to do this transparently and to
be able to do this 'in-place', i.e. without having to make a memory copy
of the section. PP handles all the necessary element and offset
arithmetic for you. There are also the notions of threading (repeated
calling of the same routine for multiple slices, see L<PDL::Indexing>)
and dataflow (see L<PDL::Dataflow>), both of which PP makes possible.

So how do you use PP? Well for the most part you just write ordinary C
code except for special PP constructs which take the form:

    $something(something else)

or:

   PPfunction %{
     <stuff>
   %}

The most important PP construct is the form C<$array()>. Consider the very
simple PP function to sum the elements of a 1D vector (in fact this is
very similar to the actual code used by 'sumover'):

   pp_def('sumit',
           Pars => 'a(n);  [o]b();',
           Code => '
           	double tmp;
           	tmp = 0;
           	loop(n) %{
            	  tmp += $a();
            	%}
            	$b() = tmp;
   ');

What's going on? The C<Pars =E<gt>> line is very important for PP - it
specifies all the arguments and their dimensionality. We call
this the I<signature> of the PP function (compare also the explanations in
L<PDL::Indexing>).  In this case the
routine takes a 1-D vector as input and returns a 0-D scalar as
output.  The C<$a()> PP construct is used to access elements of the array
a(n) for you - PP fills in all the required C code.
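
To give a feeling for what "fills in all the required C code" means, here is a hand-written plain C sketch of the kind of loop that C<loop(n) %{ tmp += $a(); %}> stands for. It is not the code PP actually emits; the names C<sumit_sketch>, C<a_data> and C<inc_n> are assumptions made purely for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-written illustration, not real PP output.
 * a_data : pointer to the first element of a(n)
 * inc_n  : distance (in elements) between consecutive entries along n
 * n_size : length of dimension n, i.e. what $SIZE(n) would give
 */
static double sumit_sketch(const double *a_data, ptrdiff_t inc_n, size_t n_size)
{
    double tmp = 0;
    size_t n;

    assert(a_data != NULL || n_size == 0);
    for (n = 0; n < n_size; n++) {
        /* the role played by $a() inside loop(n) */
        tmp += a_data[(ptrdiff_t)n * inc_n];
    }
    return tmp;
}
```

With an increment of 1 this sums a contiguous vector; a larger increment walks over a slice (say, every second element) without copying it first.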

[Aside: since PP uses C<$var()> for its parsing you must single-quote
all Code=> arguments since you don't want perl to interpolate C<$var()> into
another string - i.e. don't use "" unless you know what
you are doing! Tjl: it's usually easiest to use single quotes and
'something'.$interpolatable.'somethingelse']

In the simple case here where all elements are accessed the PP construct
C<loop(n) %{ ... %}> is used to loop over all elements in dimension C<n>.
Note this feature of PP: ALL DIMENSIONS ARE SPECIFIED BY NAME.

This is made clearer if we avoid the PP loop() construct
and write the loop explicitly using conventional C:

   pp_def('sumit',
           Pars => 'a(n);  [o]b();',
           Code => '
           	int i,n_size;
           	double tmp;
           	n_size = $SIZE(n);
           	tmp = 0;
           	for(i=0; i<n_size; i++) {
            	  tmp += $a(n=>i);
            	}
            	$b() = tmp;
   ');

which does the same as before, except more long-windedly.
You can see that to get element C<i> of a() we say C<$a(n=E<gt>i)> - we are
specifying the dimension by name C<n>. In 2D we might say:


   Pars=>'a(m,n);',
      ...
      tmp += $a(m=>i,n=>j);
      ...

The syntax 'm=E<gt>i' borrows from Perl hashes (which are in fact
used in the implementation of PP). One could also say
C<$a(n=E<gt>j,m=E<gt>i)> as order is not important.

You can also see in the above example the use of another PP
construct - C<$SIZE(n)> to get the length of the dimension
C<n>.

Note, however, that you shouldn't write an explicit C loop
when you could have used the PP C<loop> construct: PDL::PP automatically
checks the loop limits for you, and using C<loop> makes the code more
concise. But there are certainly situations where you need explicit
control of the loop and now you know how to do it ;).

To revisit 'Why PP?' - the above code for sumit() will be
generated for each data-type. It will operate on slices
of arrays 'in-place'. It will thread automatically - e.g. if
a 2D array is given it will be called repeatedly for each
1D row (again check L<PDL::Indexing> for the details of threading).
And then b() will be a 1D array of sums of each row.
We could call it with C<$a-E<gt>xchg(0,1)> to sum the columns instead.
And Dataflow tracing etc. will be available.
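
The following plain C sketch illustrates, under simplifying assumptions, what "threading" over a 2D array and the C<xchg> trick amount to. It is not what PP generates; the function name C<sum_threaded> and the representation of a dimension as an (increment, size) pair are assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>

/* Illustration only: sum along one dimension, once per entry of the other.
 * data    : first element of a 2D array
 * inc0/n0 : increment and size of the dimension that is summed over
 * inc1/n1 : increment and size of the dimension that is threaded over
 * out     : receives n1 sums
 */
static void sum_threaded(const double *data,
                         ptrdiff_t inc0, size_t n0,
                         ptrdiff_t inc1, size_t n1,
                         double *out)
{
    size_t i, t;

    assert(data != NULL && out != NULL);
    for (t = 0; t < n1; t++) {            /* the thread loop */
        const double *row = data + (ptrdiff_t)t * inc1;
        double tmp = 0;

        for (i = 0; i < n0; i++)          /* the loop(n) of sumit */
            tmp += row[(ptrdiff_t)i * inc0];
        out[t] = tmp;
    }
}
```

Swapping the two (increment, size) pairs in the call sums the columns instead of the rows, and no data is moved in memory. That is the idea behind passing an exchanged view to the function.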

You can see PP saves the programmer from writing a lot of
needlessly repetitive C code. In our opinion this is
one of the best features of PDL, making the writing of
new C subroutines for PDL an amazingly concise exercise. A second reason is
the ability to make PP expand your concise code definitions into different
C code based on the needs of the computer architecture in question. Imagine
for example that you are lucky enough to have a supercomputer at your
disposal; in that case you certainly want PDL::PP to generate code that takes
advantage of the vectorising/parallel computing features of your machine
(this is a project for the future). In any case, the bottom line is that
your unchanged code should still expand to working XS code even if the
internals of PDL change.

Also, because you are generating the code in an actual Perl script,
there are many fun things that you can do. Let's say that you need
to write both sumit (as above) and multit. With a little bit of
inventiveness, we can do

   for({Name => 'sumit', Init => '0', Op => '+='},
       {Name => 'multit', Init => '1', Op => '*='}) {
	   pp_def($_->{Name},
		   Pars => 'a(n);  [o]b();',
		   Code => '
			double tmp;
			tmp = '.$_->{Init}.';
			loop(n) %{
			  tmp '.$_->{Op}.' $a();
			%}
			$b() = tmp;
	   ');
   }

which defines both the functions easily. Now, if you later need to
change the signature or dimensionality or whatever, you only need
to change one place in your code.
Yeah, sure, your editor does have 'cut and paste' and 'search and replace'
but this way is still less bothersome, and it is definitely harder to
forget just one place and have strange bugs creep in.
Also, adding 'orit' (bitwise or) later is a one-liner.
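
For readers more at home in C than in Perl, the same "one template, several routines" idea can be loosely compared to a C preprocessor macro. The sketch below is only an analogy and has nothing to do with how PP works internally; the macro name C<DEF_REDUCE> is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* One template, parameterised by name, start value and update operator,
 * mirroring the Name/Init/Op keys of the Perl hashes above. */
#define DEF_REDUCE(name, init, op)                      \
    static double name(const double *a, size_t n)       \
    {                                                   \
        double tmp = (init);                            \
        size_t i;                                       \
        assert(a != NULL || n == 0);                    \
        for (i = 0; i < n; i++)                         \
            tmp op a[i];                                \
        return tmp;                                     \
    }

DEF_REDUCE(sumit, 0, +=)    /* defines sumit(a, n)  */
DEF_REDUCE(multit, 1, *=)   /* defines multit(a, n) */
```

The Perl version has the advantage that the template is an ordinary string, so any Perl code (loops, file input, string functions) can be used to build it.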

And remember, you really have perl's full abilities with you -
you can very easily read any input file and make routines from
the information in that file. For simple cases like the above,
the author (Tjl) currently favors the hash syntax shown above -
it doesn't take many more characters than the corresponding array
syntax but is much easier to understand and change.

We should mention here also the ability to get the pointer to the
beginning of the data in memory - a prerequisite for interfacing
PDL to some libraries. This is handled with the C<$P(var)> directive,
see below.

So, after this quick overview of the general flavour of programming
PDL routines using PDL::PP let's summarise in which circumstances you
should actually use this preprocessor/precompiler. You should use PDL::PP
if you want to

=over 3

=item *

interface PDL to some external library

=item *

write some algorithm that would be slow if coded in perl
(this is not as often as you think; take a look at threading
and dataflow first).

=item *

be a PDL developer (and even then it's not obligatory)

=back

=head1 WARNING

Because of its architecture, PDL::PP can be both flexible and
easy to use (yet exuberantly complicated) at the same time. Currently, part
of the problem is that error messages are not very informative and if
something goes wrong, you'd better know what you are
doing and be able to hack your way through the internals (or be able to
figure out by trial and error what is wrong with your args to C<pp_def>).

An alternative, of course, is to ask someone about it (e.g., through the
mailing lists).

=head1 ABANDON ALL HOPE, YE WHO ENTER HERE (DESCRIPTION)

Now that you have some idea how to use C<pp_def> to define new PDL functions
it is time to explain the general syntax of C<pp_def>.
C<pp_def> takes as arguments first the name of the function
you are defining and then a hash list that can contain various keys.

Based on these keys PP generates XS code and a .pm file. The function
C<pp_done> (see example in the SYNOPSIS) is used to tell PDL::PP that there
are no more definitions in this file and it is time to generate the .xs and
 .pm file.

As a consequence, there may be several pp_def() calls inside a file (by
convention files with PP code have the extension .pd or .pp) but generally
only one pp_done().

There are two main different types of usage of pp_def(),
the 'data operation' and 'slice operation' prototypes.

The 'data operation' is used to take some data, mangle it and
output some other data; this includes for example the '+' operation,
matrix inverse, sumover etc and all the examples we have talked about
in this document so far. Implicit and explicit threading and the creation
of the result are taken care of automatically in those operations. You
can even do dataflow with C<sumit>, C<sumover>, etc
(don't be dismayed if you don't understand the concept of dataflow
in PDL very well yet; it is still very much experimental).


The 'slice operation' is a different kind of operation: in a slice
operation, you are not changing any data, you are defining
correspondences between different elements of two piddles (examples include
the index manipulation/slicing function definitions in the file F<slices.pd>
that is part of the PDL distribution; but beware, this is not introductory
level stuff).

If you are just interested in communicating with some external
library (for example some linear algebra/matrix library), you'll usually
want the 'data operation' so we are going to discuss that first.

=head1 Data operation

=head2 A simple example

In the data operation, you must know what dimensions of data
you need. First, an example with scalars:

	pp_def('add',
		Pars => 'a(); b(); [o]c();',
		Code => '$c() = $a() + $b();'
	);

That looks a little strange but let's dissect it. The first
line is easy: we're defining a routine with the name 'add'.
The second line simply declares our parameters and the parentheses
mean that they are scalars. We call the string that defines our parameters
and their dimensionality the I<signature> of that function. For its
relevance with regard to threading and index manipulations check the
L<PDL::Indexing> manpage.

The third line is the actual operation. You need to use the
dollar signs and parentheses to refer to your parameters
(this will probably change at some point in the future, once
a good syntax is found).

These lines are all that is necessary to actually define the function
for PDL (well, actually it isn't; you additionally need to write a
Makefile.PL (see below) and build the module (something like 'perl
Makefile.PL; make'); but let's ignore that for the moment). So now you
can do

	use MyModule;
	$a = pdl 2,3,4;
	$b = pdl 5;

	$c = add($a,$b);
	# or
	add($a,$b,($c=null)); # Alternative form, useful if $c has been
	                      # preset to something big, not useful here.

and have threading work correctly (the result is $c == [7 8 9]).

=head2 The C<Pars> section : the signature of a PP function

Seeing the above example code you will most probably ask: what is this
strange C<$c=null> syntax in the second call to our new C<add> function? If
you take another look at the definition of C<add> you will notice that
the third argument C<c> is flagged with the qualifier C<[o]> which
tells PDL::PP that this is an output argument. So the above call to
add means 'create a new $c from scratch with correct dimensions' -
C<null> is a special token for 'empty piddle' (you might ask why we
haven't used the value C<undef> to flag this instead of the PDL
specific C<null>; we are currently thinking about it ;).

[This should be explained in some other section of the manual
as well!!]
The reason for having this syntax as an alternative is that if you have
really huge piddles, you can do

	$c = PDL->null;
	for(some long loop) {
		# munge a,b
		add($a,$b,$c);
		# munge c, put something back to a,b
	}

and avoid allocating and deallocating $c each time. It is allocated
once at the first add() and thereafter the memory stays until $c is
destroyed.
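
The allocate-once behaviour can be pictured with the following plain C sketch. It only mimics the idea and says nothing about how PDL manages memory internally; the struct C<out_vec> and the function C<add_into> are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an output piddle: data == NULL plays the
 * role of a null piddle that has not been given storage yet. */
typedef struct {
    double *data;
    size_t  n;
} out_vec;

/* c = a + b for a vector a and a scalar b.  Storage for c is allocated on
 * the first call only; later calls reuse it.  Returns 0 on success. */
static int add_into(const double *a, size_t n, double b, out_vec *c)
{
    size_t i;

    assert(a != NULL && c != NULL && n > 0);
    if (c->data == NULL) {                 /* "null": create the output */
        c->data = malloc(n * sizeof *c->data);
        if (c->data == NULL)
            return -1;
        c->n = n;
    }
    assert(c->n == n);                     /* existing output must fit */
    for (i = 0; i < n; i++)
        c->data[i] = a[i] + b;
    return 0;
}
```

Calling C<add_into> repeatedly with the same C<out_vec> touches the allocator only once, which is the saving the loop above is after.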

If you just say

  $c =  add($a,$b);

the code generated by PP will automatically fill in C<$c=null>
and return
the result. If you want to learn more
about the reasons why PDL::PP supports this style where output arguments
are given as last arguments check the
L<PDL::Indexing> manpage.

C<[o]> is not the only qualifier a pdl argument can have in the signature.
Another important qualifier is the C<[t]> option which flags a pdl as
temporary.  What does that mean? You tell PDL::PP that this pdl is only
used for temporary results in the course of the calculation and you are
not interested in its value after the computation has been completed. But
why should PDL::PP want to know about this in the first place?  The reason
is closely related to the concepts of pdl auto creation (you heard
about that above) and implicit threading. If you use implicit threading
the dimensionality of automatically created pdls is actually larger than
that specified in the signature. Pdls flagged with C<[o]> will be created
so that they have the additional dimensions as required by the number
of implicit thread dimensions. When creating a temporary pdl, however,
it will always only be made big enough so that it can hold the result
for one iteration in a threadloop, i.e. as large as required by the signature.
So less memory is wasted when you flag a pdl as temporary. Secondly, you
can use output auto creation with temporary pdls even when you are using
explicit threading which is forbidden for normal output pdls flagged with
C<[o]> (see L<PDL::Indexing>).
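
The difference in size between an output and a temporary can be sketched in plain C as follows. This is an illustration under our own assumptions, not PP output: the function C<median_rows> is made up, the data is assumed to be stored row after row, and the row index stands in for a single implicit thread dimension.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int cmp_double(const void *x, const void *y)
{
    double a = *(const double *)x, b = *(const double *)y;
    return (a > b) - (a < b);
}

/* Median of each row of a rows-by-n array stored row after row.
 * b   : output, one value per row (grows with the thread dimension)
 * tmp : scratch of exactly n elements, reused for every row
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
static int median_rows(const double *a, size_t rows, size_t n, double *b)
{
    double *tmp;
    size_t r;

    assert(a != NULL && b != NULL && n > 0);
    tmp = malloc(n * sizeof *tmp);          /* sized by the signature only */
    if (tmp == NULL)
        return -1;
    for (r = 0; r < rows; r++) {            /* one pass per thread iteration */
        memcpy(tmp, a + r * n, n * sizeof *tmp);
        qsort(tmp, n, sizeof *tmp, cmp_double);
        b[r] = tmp[n / 2];                  /* upper median for even n */
    }
    free(tmp);
    return 0;
}
```

However many rows are processed, the scratch buffer never holds more than C<n> elements, whereas the output needs one slot per row.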

Here is an example where we use the [t] qualifier. We define the function
C<callf> that calls a C routine C<f> which needs a temporary array of the
same size and type as the array C<a> (sorry about the forward reference
for C<$P>; it's a pointer access, see below) :

  pp_def('callf',
	Pars => 'a(n); [t] tmp(n); [o] b()',
	Code => 'int ns = $SIZE(n);
		 f($P(a),$P(b),$P(tmp),ns);
		'
  );

=head2 Argument dimensions and the signature

Now we have just talked about dimensions of pdls and the signature. How
are they related? Let's say that we want to add a scalar + the index
number to a vector:

	pp_def('add2',
		Pars => 'a(n); b(); [o]c(n);',
		Code => 'loop(n) %{
				$c() = $a() + $b() + n;
			 %}'
	);

There are several points to notice here: first, the C<Pars>
argument now contains the dimension name I<n> to show that I<a> and I<c>
each have a single dimension. It is important to note that dimensions
are actual entities that are accessed by name so this declares
I<a> and I<c> to have the B<same> first dimension. In most PP definitions
the size of named dimensions will be set from the respective dimensions
of non-output pdls (those with no C<[o]> flag) but sometimes you might
want to set the size of a named dimension explicitly through an integer
parameter. See below in the description of the C<OtherPars> section how
that works.
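
Written out by hand for a single vector, and ignoring threading and type handling, the operation defined by C<add2> corresponds to something like the following C function. The name C<add2_sketch> and the fixed C<double> type are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* c(n) = a(n) + b + n, the operation of add2 for one vector. */
static void add2_sketch(const double *a, double b, double *c, size_t n_size)
{
    size_t n;

    assert(a != NULL && c != NULL);
    for (n = 0; n < n_size; n++)
        c[n] = a[n] + b + (double)n;   /* a and c share the index n */
}
```

Because I<a> and I<c> are declared with the same named dimension, a single size C<n_size> and a single loop index are enough for both.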

=head2 Type conversions and the signature

The signature also determines the type conversions that will be performed
when a PP function is invoked. So what happens when we invoke one of
our previously defined functions with pdls of different type, e.g.

  add2($a,$b,($ret=null));

where $a is of type C<PDL_Float> and $b of type C<PDL_Short>? With the signature
as shown in the definition of C<add2> above the datatype of the operation
(as determined at runtime) is that of the pdl with the 'highest' type
(sequence is byte < short < ushort < long < float < double). In the add2
example the datatype of the operation is float ($a has that datatype). All
pdl arguments are then type converted to that datatype (they are not
converted inplace but a copy with the right type is created if a pdl argument
doesn't have the type of the operation).
Null pdls don't contribute a type
in the determination of the type of the operation.
However, they will be
created with the datatype of the operation; here, for example, $ret will be
of type float. You should be aware of these rules when calling PP functions
with pdls of different types to take the additional storage and runtime
requirements into account.
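
The rule for finding the datatype of the operation can be restated as a few lines of C. This is only a restatement of the rule given above, not code taken from PDL; the enum C<type_code> and its constants are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type codes, listed in the promotion order given above. */
typedef enum {
    T_NULL = -1,   /* a null pdl: takes no part in the decision */
    T_BYTE, T_SHORT, T_USHORT, T_LONG, T_FLOAT, T_DOUBLE
} type_code;

/* Datatype of the operation: the highest type among the non-null args.
 * Returns T_NULL if every argument is null. */
static type_code operation_type(const type_code *args, size_t nargs)
{
    type_code best = T_NULL;
    size_t i;

    assert(args != NULL || nargs == 0);
    for (i = 0; i < nargs; i++)
        if (args[i] > best)
            best = args[i];
    return best;
}
```

For the C<add2> call above the arguments are float, short and null, so the operation is carried out in float and the null output is created as float.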

These type conversions are correct for most functions you normally define
with C<pp_def>. However, there are certain cases where slightly modified
type conversion behaviour is desired. For these cases additional qualifiers
in the signature can be used to specify the desired properties with regard
to type conversion. These qualifiers can be combined with those we have
encountered already (the I<creation qualifiers> C<[o]> and C<[t]>). Let's
go through the list of qualifiers that change type conversion behaviour.

The most important is the C<int> qualifier which comes in handy when a
pdl argument represents indices into another pdl. Let's take a look at
an example from C<PDL::Primitive>:

   pp_def('maximum_ind',
	  Pars => 'a(n); int [o] b()',
	  Code => '$GENERIC() cur;
		   int curind;
		   loop(n) %{
		    if (!n || $a() > cur) {cur = $a(); curind = n;}
	 	   %}
	 	   $b() = curind;',
   );

The function C<maximum_ind> finds the index of the largest element of
a vector. If you look at the signature you notice that the output
argument C<b> has been declared with the additional C<int> qualifier.
This has the following consequences for type conversions: regardless of
the type of the input pdl C<a> the output pdl C<b> will be of type
C<PDL_Long> which makes sense since C<b> will represent an index into
C<a>. Furthermore, if you call the function with an existing output
pdl C<b> its type will not influence the datatype of the operation (see
above). Hence, even if C<a> is of a smaller type than C<b> it will not
be converted to match the type of C<b> but stays untouched, which saves
memory and CPU cycles and is the right thing to do when C<b> represents
indices. Also note that you can use the 'int' qualifier together with
other qualifiers (the C<[o]> and C<[t]> qualifiers). Order is significant --
type qualifiers precede creation qualifiers (C<[o]> and C<[t]>).

The above example also demonstrates typical usage of the C<$GENERIC()>
macro.  It expands to the current type in a so called generic
loop. What is a generic loop? As you already heard a PP function has a
runtime datatype as determined by the type of the pdl arguments it has
been invoked with.  The PP generated XS code for this function
therefore contains a switch like C<switch (type) {case PDL_Byte: ... case
PDL_Double: ...}> that selects a case based on the runtime
datatype of the function (it's called a type ``loop''
because there is a loop in PP code that generates the cases).
In any case your code is inserted once for each PDL type
into this switch statement. The C<$GENERIC()> macro just expands to
the respective type in each copy of your parsed code in this C<switch>
statement, e.g., in the C<case PDL_Byte> section C<$GENERIC()> will expand to
C<PDL_Byte>, so that C<cur> is declared as a C<PDL_Byte>, and so on for the
other case statements. I guess you
realise that this is a useful macro to hold values of pdls in some
code.
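
To make the idea of the generic loop more concrete, here is a much reduced, hand-written C sketch with only two types. The real generated code is considerably more involved; the enum C<elem_kind>, the macro C<MAXIMUM_IND_BODY> and the function C<maximum_ind_sketch> are names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { K_SHORT, K_DOUBLE } elem_kind;   /* hypothetical type codes */

/* The body is written once; TYPE stands in for $GENERIC().
 * It uses data, n_size and curind from the enclosing function. */
#define MAXIMUM_IND_BODY(TYPE)                                  \
    do {                                                        \
        const TYPE *a = (const TYPE *)data;                     \
        TYPE cur = a[0];                                        \
        size_t n;                                               \
        for (n = 1; n < n_size; n++)                            \
            if (a[n] > cur) { cur = a[n]; curind = (long)n; }   \
    } while (0)

/* Index of the largest element; the switch selects the copy of the body
 * that matches the runtime type of the data. */
static long maximum_ind_sketch(const void *data, elem_kind kind, size_t n_size)
{
    long curind = 0;

    assert(data != NULL && n_size > 0);
    switch (kind) {
    case K_SHORT:  MAXIMUM_IND_BODY(short);  break;
    case K_DOUBLE: MAXIMUM_IND_BODY(double); break;
    }
    return curind;
}
```

Each C<case> contains its own copy of the body with C<cur> declared in the matching type, which is the effect C<$GENERIC()> has in PP code.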

There are a couple of other qualifiers with similar effects as C<int>.
For your convenience there are the C<float> and C<double> qualifiers
with analogous consequences on type conversions as C<int>. Let's
assume you have a I<very> large array for which you want to compute
row and column sums with an equivalent of the C<sumover> function.
However, with the normal definition of C<sumover> you might run
into problems when your data is, e.g. of type short. A call like

  sumover($large_pdl,($sums = null));

will result in C<$sums> being of type short and is therefore prone to
overflow errors if C<$large_pdl> is a very large array. On the other
hand calling

  @dims = $large_pdl->dims; shift @dims;
  sumover($large_pdl,($sums = zeroes(double,@dims)));

is not a good alternative either. Now we don't have overflow problems with
C<$sums> but at the expense of a type conversion of C<$large_pdl> to
double, something bad if this is really a large pdl. That's where C<double>
comes in handy:

  pp_def('sumoverd',
	 Pars => 'a(n); double [o] b()',
	 Code => 'double tmp=0;
		  loop(n) %{ tmp += $a(); %}
		  $b() = tmp;',
  );

This gets us around the type conversion and overflow problems. Again,
analogous to the C<int> qualifier C<double> results in C<b> always being of
type double regardless of the type of C<a> without leading to a
type conversion of C<a> as a side effect.

Finally, there are the C<type+> qualifiers where type is one of C<int>
or C<float>. What does that mean? Let's illustrate the C<int+>
qualifier with the actual definition of sumover:

  pp_def('sumover',
	 Pars => 'a(n); int+ [o] b()',
	 Code => '$GENERIC(b) tmp=0;
		  loop(n) %{ tmp += $a(); %}
		  $b() = tmp;',
  );

As we had already seen for the C<int>, C<float> and C<double>
qualifiers, a pdl marked with a C<type+> qualifier does not influence
the datatype of the pdl operation. Its meaning is "make this pdl at
least of type C<type> or higher, as required by the type of the
operation". In the sumover example this means that when you call the
function with an C<a> of type PDL_Short the output pdl will be of type
PDL_Long (just as would have been the case with the C<int>
qualifier). This again tries to avoid overflow problems when using
small datatypes (e.g. byte images).  However, when the datatype of the
operation is higher than the type specified in the C<type+> qualifier
C<b> will be created with the datatype of the operation, e.g. when
C<a> is of type double then C<b> will be double as well. We hope you
agree that this is sensible behaviour for C<sumover>. It should be
obvious how the C<float+> qualifier works by analogy.
It may become necessary to be able to specify a set of alternative
types for the parameters. However, this will probably not be
implemented until someone comes up with a reasonable use for it.
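
The difference between a plain type qualifier and a C<type+> qualifier, as described above, can be summarised in a few lines of C. Again this merely restates the rule and is not taken from the PDL sources; the enum C<q_type> and both function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical type codes in promotion order. */
typedef enum { Q_BYTE, Q_SHORT, Q_USHORT, Q_LONG, Q_FLOAT, Q_DOUBLE } q_type;

/* Plain qualifier such as "int": the pdl always gets exactly that type. */
static q_type fixed_qualifier(q_type wanted, q_type op_type)
{
    (void)op_type;                 /* the operation type is ignored */
    return wanted;
}

/* "type+" qualifier such as "int+": at least `minimum`, or the type of
 * the operation if that is higher. */
static q_type plus_qualifier(q_type minimum, q_type op_type)
{
    assert(minimum <= Q_DOUBLE && op_type <= Q_DOUBLE);
    return op_type > minimum ? op_type : minimum;
}
```

So with C<int+> a short input gives a long output and a double input gives a double output, whereas with plain C<int> the output would be long in both cases.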

Note that we now had to specify the C<$GENERIC> macro with the name
of the pdl to derive the type from that argument. Why is that? If you
carefully followed our explanations you will have realised that in some
cases C<b> will have a different type than the type of the operation.
Calling the '$GENERIC' macro with C<b> as argument makes sure that
the type will always be the same as that of C<b> in that part of the
generic loop.

This is about all there is to say about the C<Pars> section in a
C<pp_def> call. You should remember that this section defines the I<signature>
of a PP defined function, you can use several options to qualify certain
arguments as output and temporary args and all dimensions that you can
later refer to in the C<Code> section are defined by name.

It is important that you understand the meaning of the signature since
in the latest PDL versions you can use it to define threaded functions
from within perl, i.e. what we call I<perl level threading>. Please check
L<PDL::Indexing> for details.

=head2 The C<Code> section

The C<Code> section contains the actual XS code that will be in the
innermost part of a threadloop (if you don't know what a thread loop is then
you still haven't read L<PDL::Indexing>; do it now ;) after any PP macros
(like C<$GENERIC>) and PP functions have been expanded (like the
C<loop> function we are going to explain next).

Let's quickly reiterate the C<sumover> example:

  pp_def('sumover',
	 Pars => 'a(n); int+ [o] b()',
	 Code => '$GENERIC(b) tmp=0;
		  loop(n) %{ tmp += $a(); %}
		  $b() = tmp;',
  );



The C<loop> construct in the C<Code> section also refers to the
dimension name so you don't need to specify any limits: the loop is
correctly sized and everything is done for you, again.

Next, there is the surprising fact that C<$a()> and C<$b()> do B<not>
contain the index. This is not necessary because we're looping over
I<n> and both variables know which dimensions they have so
they automatically know they're being looped over.

This feature comes in very handy in many places and makes for
much shorter code. Of course, there are times when you want to
circumvent this; here is a function which symmetrizes a matrix
and serves as an example of how to code explicit looping:

	pp_def('symm',
		Pars => 'a(n,n); [o]c(n,n);',
		Code => 'loop(n) %{
				int n2;
				for(n2=n; n2<$SIZE(n); n2++) {
					$c(n0 => n, n1 => n2) =
					$c(n0 => n2, n1 => n) =
					 $a(n0 => n, n1 => n2);
				}
			%}
		'
	);

Let's dissect what is happening. Firstly, what is this function supposed to
do? From its signature you see that it takes a 2D matrix with equal numbers
of columns and rows and outputs a matrix of the same size. From a given
input matrix $a it computes a symmetric output matrix $c (symmetric in
the matrix sense that A^T = A where ^T means matrix transpose, or in PDL
parlance $c == $c->xchg(0,1)). It does this by using only the values
on and below the diagonal of $a. In the output matrix $c all values on
and below the diagonal are the same as those in $a while those above the
diagonal are a mirror image of those below the diagonal (above and below
are here interpreted in the way that PDL prints 2D pdls). If this explanation
still sounds a bit strange just go ahead, make a little file into which you
write this definition, build the new PDL extension (see section on Makefiles
for PP code) and try it out with a couple of examples.

Having explained what the function is supposed to do there are a
couple of points worth noting from the syntactical point of
view. First, we get the size of the dimension named C<n> again by
using the C<$SIZE> macro. Second, there are suddenly these funny C<n0>
and C<n1> index names in the code though the signature defines only
the dimension C<n>. Why this? The reason becomes clear when you note
that both the first and second dimension of $a and $c are named C<n>
in the signature of C<symm>. This tells PDL::PP that the first and
second dimension of these arguments should have the same
size. Otherwise the generated function will raise a runtime error.
However, now in an access to C<$a> and C<$c> PDL::PP cannot figure out
which index C<n> refers to any more just from the name of the index.
Therefore, the indices with equal dimension names get numbered from
left to right starting at 0, e.g. in the above example C<n0> refers to
the first dimension of C<$a> and C<$c>, C<n1> to the second and so on.
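
If you cannot build the extension right away, the index logic can be
followed with a plain perl sketch that needs no PDL at all. It assumes
the matrix is stored as an array of rows in the order PDL would print
it, so that C<$a(n0 =E<gt> n, n1 =E<gt> n2)> corresponds to row C<n2>,
column C<n>; the variable names are chosen for this illustration only:

  use strict;
  use warnings;

  # 3x3 input, one array ref per printed row: $in[row][column]
  my @in = ([0, 1, 2],
            [3, 4, 5],
            [6, 7, 8]);
  my $size = @in;                     # plays the role of $SIZE(n)
  my @out;

  for my $n (0 .. $size - 1) {        # loop(n) %{ ... %}
      for my $n2 ($n .. $size - 1) {  # the explicit inner for loop
          # $a(n0 => n, n1 => n2) is column n, row n2
          $out[$n2][$n] = $out[$n][$n2] = $in[$n2][$n];
      }
  }

  print "@$_\n" for @out;

This prints the rows C<0 3 6>, C<3 4 7> and C<6 7 8>: the values on and
below the diagonal are kept and mirrored into the upper triangle.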

In all examples so far, we have only used the C<Pars> and C<Code>
members of the hash that was passed to C<pp_def>. There are certainly
other keys that are recognised by PDL::PP and we will hear about some
of them in the course of this document. Find a (non-exhaustive) list
of keys in Appendix A.  A list of macros and PP functions (we have
encountered only some of them in the examples so far) that are expanded
in values of the hash argument to C<pp_def> is summarised in Appendix
B.

At this point, it might be appropriate to mention that
PDL::PP is not a completely static, well designed set of routines (as
Tuomas puts it: "stop thinking of PP as a set of routines carved in
stone") but rather a collection of things that the PDL::PP author
(Tuomas J. Lukka) considered he would have to write often into his PDL
extension routines. PP tries to be expandable so that in the future,
as new needs arise, new common code can be abstracted back into it. If
you want to learn more on why you might want to change PDL::PP and how
to do it check the section on PDL::PP internals.


=head2 Interfacing your own/library functions using PP

Now, consider the following: you have your own C function
(that may in fact be part of some library you want to interface to PDL)
which takes as arguments two pointers to vectors of double:

	void myfunc(int n,double *v1,double *v2);

The correct way of defining the PDL function is

	pp_def('myfunc',
		Pars => 'a(n); [o]b(n);',
		GenericTypes => [D],
		Code => 'myfunc($SIZE(n),$P(a),$P(b));'
	);

The C<$P(>I<par>C<)> syntax returns a pointer to the first
element and the other elements are guaranteed to lie after that.

Notice that here it is possible to make many mistakes. First,
C<$SIZE(n)> must be used instead of C<n>. Second, you shouldn't put
any loops in this code. Third, here we encounter a new hash key
recognised by PDL::PP: the C<GenericTypes> declaration tells PDL::PP
to ONLY GENERATE THE TYPELOOP FOR THE LIST OF TYPES SPECIFIED, in
this case C<double>. This has two advantages. Firstly, the size of
the compiled code is reduced vastly; secondly, if non-double arguments
are passed to C<myfunc()> PDL will automatically convert them to
double before passing them to the external C routine and convert them
back afterwards.
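
To see the pieces together, here is a minimal sketch of a complete
F<.pd> file built around this definition. The C body given for
C<myfunc> is a made-up stand-in (it simply doubles every element) so
that the sketch has no external dependencies; with a real library you
would include its header instead. C<pp_addhdr> is described later in
this document:

  pp_addhdr('
    /* hypothetical stand-in for the library routine */
    static void myfunc(int n, double *v1, double *v2)
    {
      int i;
      for (i = 0; i < n; i++)
        v2[i] = 2.0 * v1[i];
    }
  ');

  pp_def('myfunc',
	Pars => 'a(n); [o]b(n);',
	GenericTypes => ['D'],      # only generate code for double
	Code => 'myfunc($SIZE(n),$P(a),$P(b));',
  );

  pp_done();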

One can also use C<Pars> to qualify the types of individual
arguments. Thus one could also write this as:

	pp_def('myfunc',
		Pars => 'double a(n); double [o]b(n);',
		Code => 'myfunc($SIZE(n),$P(a),$P(b));'
	);

The type specification in C<Pars> exempts the argument from
variation in the typeloop; instead it is automatically converted
to and from the type specified. This is obviously useful in
a more general example, e.g.:

	void myfunc(int n,float *v1,long *v2);

	pp_def('myfunc',
		Pars => 'float a(n); long [o]b(n);',
		GenericTypes => [F],
		Code => 'myfunc($SIZE(n),$P(a),$P(b));'
	);

Note that we still use C<GenericTypes> to reduce the size of the
type loop. PP could in principle spot this and do it automatically,
but the code has yet to attain that level of sophistication!

Finally, note that when types are converted automatically one MUST
use the C<[o]> qualifier for output variables or your hard-won
changes will get optimised away by PP!


If you interface a large library you can automate the interfacing even
further. Perl can help you again(!) in doing this. In many libraries
you have certain calling conventions. This can be exploited. In short,
you can write a little parser (which is really not difficult in perl) that
then generates the calls to C<pp_def> from parsed descriptions of the
functions in that library. For an example, please check the I<Slatec>
interface in the C<Lib> tree of the PDL distribution. If you want to check
(during debugging) which calls to PP functions your perl code generated,
a little helper package comes in handy: it replaces the PP functions
with identically named ones that dump their arguments to stdout.

Just say

   perl -MPDL::PP::Dump myfile.pd

to see the calls to C<pp_def> and friends. Try it with F<ops.pd> and
F<slatec.pd>. If you're interested (or want to enhance it), the source
is in Basic/Gen/PP/Dump.pm
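
The generator idea can be sketched in a few lines of perl inside a
F<.pd> file. Everything specific here is an assumption made for the
illustration: the one-line description format, the routine names
C<vec_scale> and C<vec_shift>, and the convention that each C routine
takes C<(int, double *, double *)>. It is not how the I<Slatec>
interface is actually written:

  # hypothetical descriptions in the form "name: signature"
  my @descriptions = (
      'vec_scale: a(n); [o]b(n)',
      'vec_shift: a(n); [o]b(n)',
  );

  for my $line (@descriptions) {
      # split each description into routine name and signature
      my ($name, $sig) = $line =~ /^(\w+):\s*(.*)$/ or next;

      pp_def($name,
          Pars         => $sig,
          GenericTypes => ['D'],
          # assumes a C routine of the same name with the
          # calling convention stated above
          Code         => "$name(\$SIZE(n),\$P(a),\$P(b));",
      );
  }

Running such a file through C<PDL::PP::Dump> as shown above is a quick
way to check that the generated C<pp_def> calls look as intended.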

=head2 Other macros and functions in the Code section

Macros: So far we have encountered the C<$SIZE>, C<$GENERIC> and C<$P> macros.
Now we are going to quickly explain the other macros that are expanded in the
C<Code> section of PDL::PP along with examples of their usage.

=over 3

=item C<$T>

The C<$T> macro is used for type switches. This is very useful when you have
to use different external (e.g. library) functions depending on the input
type of arguments. The general syntax is

	$Ttypeletters(type_alternatives)

where C<typeletters> is a permutation of a subset of the letters
C<BSULFD> which stand for Byte, Short, Ushort, etc. and
C<type_alternatives> are the expansions when the type of the PP
operation is equal to that indicated by the respective letter. Let's
illustrate this incomprehensible description by an example. Assuming
you have two C functions with prototypes

  void float_func(float *in, float *out);
  void double_func(double *in, double *out);

which do basically the same thing but one accepts float and the other
double pointers. You could interface them to PDL by defining a generic
function C<foofunc> (which will call the correct function depending
on the type of the transformation):

  pp_def('foofunc',
	Pars => ' a(n); [o] b();',
	Code => ' $TFD(float_func,double_func) ($P(a),$P(b));',
	GenericTypes => [F,D],
  );

Please note that you can't say

       Code => ' $TFD(float,double)_func ($P(a),$P(b));'

since the C<$T> macro expands with trailing spaces, analogously to
C preprocessor macros.
The slightly longer form illustrated above is correct.
If you really want brevity, you can of course do

	'$TBSULFD('.(join ',',map {"long_identifier_name_$_"}
		qw/byt short unseigned lounge flotte dubble/).');'

=item C<$PP>

The C<$PP> macro is used for a so called I<physical pointer access>. The
I<physical> refers to some internal optimisations of PDL (for those who
are familiar with the PDL core we are talking about the vaffine
optimisations). This macro is mainly for internal use and you shouldn't
need to use it in any of your normal code.

=item C<$COMP> (and the C<OtherPars> section)

The C<$COMP> macro is used to access non-pdl values in the code section. Its
name is derived from the implementation of transformations in PDL. The
variables you can refer to using C<$COMP> are members
of the ``compiled'' structure that represents the PDL transformation in question
but does not yet contain any information about dimensions
(for further details check L<PDL::Internals>). However, you can treat
C<$COMP> just as a black box without knowing anything about the
implementation of transformations in PDL. So when would you use this
macro? Its main usage is to access values of arguments that are
declared in the C<OtherPars> section of a C<pp_def> definition. But
then you haven't heard about the C<OtherPars> key yet?!  Let's have
another example that illustrates typical usage of both new features:

  pp_def('pnmout',
	Pars => 'a(m)',
	OtherPars => "char* fd",
	GenericTypes => [B,U,S,L],
	Code => 'PerlIO *fp;
		 IO *io;
		 int len;

               io = GvIO(gv_fetchpv($COMP(fd),FALSE,SVt_PVIO));
		 if (!io || !(fp = IoIFP(io)))
			croak("Can\'t figure out FP");

		 /* number of bytes in one row of the pdl */
		 len = $SIZE(m) * sizeof($GENERIC());

		 if (PerlIO_write(fp,$P(a),len) != len)
				croak("Error writing pnm file");
  ');

This function is used to write data from a pdl to a file. The file descriptor
is passed as a string into this function. This parameter does not go into
the C<Pars> section since it cannot be usefully treated like a pdl but rather
into the aptly named C<OtherPars> section. Parameters in the C<OtherPars>
section follow those in the C<Pars> section when invoking the function, i.e.

   open FILE,">out.dat" or die "couldn't open out.dat";
   pnmout($pdl,'FILE');

When you want to access this parameter inside the code section you
have to tell PP by using the C<$COMP> macro, i.e. you write
C<$COMP(fd)> as in the example. Otherwise PP wouldn't know that the
C<fd> you are referring to is the same as that specified in the
C<OtherPars> section.
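
A smaller sketch of the same mechanism, using a numeric parameter
instead of a filehandle name, may make the pattern easier to see. The
function name C<addconst> and the parameter name C<offset> are invented
for this illustration:

  pp_def('addconst',              # hypothetical example function
	Pars => 'a(); [o]b()',
	OtherPars => 'double offset',
	/* offset is not a pdl, so it is reached through $COMP */
	Code => '$b() = $a() + $COMP(offset);',
  );

As with C<pnmout>, the C<OtherPars> value follows the pdl arguments
when the function is called:

  $x = sequence(3);
  addconst($x, ($y = null), 2.5);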

Another use for the C<OtherPars> section is to set a named dimension
in the signature. Let's have an example how that is done:

  pp_def('setdim',
	Pars => '[o] a(n)',
	OtherPars => 'int ns => n',
	Code => 'loop(n) %{ $a() = n; %}',
  );

This says that the named dimension C<n> will be initialised from the
value of the I<other parameter> C<ns> which is of integer type (I guess
you have realised that we use the C<CType From =E<gt> named_dim> syntax).
Now you can call this function in the usual way:

  setdim(($a=null),5);
  print $a;
    [ 0 1 2 3 4 ]

Admittedly this function is not very useful but it demonstrates how it
works. If you call the function with an existing pdl, you don't need
to explicitly specify the size of C<n> since PDL::PP can figure it out
from the dimensions of the non-null pdl. In that case you just give the
dimension parameter as C<-1>:

  $a = hist($b);
  setdim($a,-1);

That should do it.

=back

The only PP function that we have used in the examples so far is C<loop>.
Additionally, there are currently two other functions which are recognised
in the C<Code> section:

=over 2

=item threadloop

As we heard above the signature of a PP defined function defines the
dimensions of all the pdl arguments involved in a I<primitive> operation.
However, you often call the functions that you defined with PP with pdls
that have more dimensions than those specified in the signature. In this
case the primitive operation is performed on all subslices of appropriate
dimensionality in what is called a I<threadloop> (see also overview above
and L<PDL::Indexing>). Assuming you have some notion of this concept you
will probably appreciate that the operation specified in the code section
should be optimised since this is the tightest loop inside a threadloop.
However, if you revisit the example where we define the C<pnmout> function,
you will quickly realise that looking up the C<IO> file descriptor
in the inner threadloop is not very efficient when writing a pdl with
many rows. A better approach would be to look up the C<IO> descriptor
once outside the threadloop and then use its value inside the tightest
threadloop. This is exactly where the C<threadloop> function comes in
handy. Here is an improved definition of C<pnmout> which uses this
function:

  pp_def('pnmout',
	Pars => 'a(m)',
	OtherPars => "char* fd",
	GenericTypes => [B,U,S,L],
	Code => 'PerlIO *fp;
		 IO *io;
		 int len;

               io = GvIO(gv_fetchpv($COMP(fd),FALSE,SVt_PVIO));
		 if (!io || !(fp = IoIFP(io)))
			croak("Can\'t figure out FP");

		 len = $SIZE(m) * sizeof($GENERIC());

		 threadloop %{
		    if (PerlIO_write(fp,$P(a),len) != len)
				croak("Error writing pnm file");
		 %}
  ');

This works as follows. Normally the C code you write inside the
C<Code> section is placed inside a threadloop (i.e., PP generates the
appropriate wrapping XS code around it). However, when you explicitly
use the C<threadloop> function, PDL::PP recognises this and doesn't
wrap your code with an additional threadloop. This has the effect that
code you write outside the threadloop is only executed once per
transformation and just the code within the surrounding C<%{ ... %}>
pair is placed within the tightest threadloop. This also comes in
handy when you want to perform a decision (or any other code,
especially CPU intensive code) only once per thread, i.e.

  pp_addhdr('
    #define RAW 0
    #define ASCII 1
  ');
  pp_def('do_raworascii',
	 Pars => 'a(); b(); [o]c()',
	 OtherPars => 'int mode',
       Code => ' switch ($COMP(mode)) {
		    case RAW:
			threadloop %{
                            /* do raw stuff */
                        %}
		        break;
		    case ASCII:
			threadloop %{
                            /* do ASCII stuff */
                        %}
		        break;
		    default:
			croak("unknown mode");
		   }'
   );


=item types

The C<types> function works similarly to the C<$T> macro. However, with the
C<types> function the code in the following block (delimited by C<%{>
and C<%}> as usual) is executed for all those cases in which the datatype
of the operation is I<any of> the types represented by the letters in the
argument to C<types>, e.g.

     Code => '...

	     types(BSUL) %{
		 /* do integer type operation */
             %}
	     types(FD) %{
		 /* do floating point operation */
	     %}
             ...'

=back
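
As a hedged illustration of the C<types> function, here is a sketch of
an absolute-value operation that uses the integer routine C<abs> for
the integer types and C<fabs> for the floating point types. The
function name C<myabs> is made up for this example, and the cast to
C<int> assumes that all four integer types fit into a C C<int>:

  pp_addhdr('
  #include <stdlib.h>   /* abs  */
  #include <math.h>     /* fabs */
  ');

  pp_def('myabs',                 # hypothetical example function
	 Pars => 'a(); [o]b()',
	 Code => 'types(BSUL) %{ $b() = abs((int) $a()); %}
		  types(FD) %{ $b() = fabs($a()); %}',
  );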

=head2 Other useful PP keys in data operation definitions

You have already heard about the C<OtherPars> key. Currently, there are not
many other keys for a data operation that will be useful in normal (whatever
that is) PP programming. In fact, it would be interesting to hear about
a case where you think you need more than what is provided at the moment.
Please speak up on one of the PDL mailing lists. Most other keys recognised
by C<pp_def> are only really useful for what we call I<slice operations>
(see also above).

One thing that is strongly being planned is variable number
of arguments, which will be a little tricky.

=head2 Other PDL::PP functions to support concise package definition

So far, we have described the C<pp_def> and C<pp_done> functions. PDL::PP
exports a few other functions to aid you in writing concise PDL extension
package definitions.

=begin html

=head3 pp_addhdr

=end html

Often when you interface library functions as in the above example
you have to include additional C include files. Since the XS file is
generated by PP we need some means to make PP insert the appropriate
include directives in the right place into the generated XS file.
To this end there is the C<pp_addhdr> function. This is also the function
to use when you want to define some C functions for internal use by some
of the XS functions (which are mostly functions defined by C<pp_def>).
By including these functions here you make sure that PDL::PP inserts your
code before the point where the actual XS module section begins and that
it will therefore be left untouched by xsubpp (cf. I<perlxs> and I<perlxstut>
manpages).

A typical call would be

  pp_addhdr('
  #include <unistd.h>       /* we need defs of XXXX */
  #include "libprotos.h"    /* prototypes of library functions */
  #include "mylocaldecs.h"  /* Local decs */

  static void do_the_real_work(PDL_Byte * in, PDL_Byte * out, int n)
  {
	/* do some calculations with the data */
  }
  ');

This ensures that all the constants and prototypes you need will be properly
included and that you can use the internal functions defined here in the
C<pp_def>s, e.g.:

  pp_def('barfoo',
	 Pars => ' a(n); [o] b(n)',
	 GenericTypes => ['B'],
	 Code => ' int ns = $SIZE(n);
		   do_the_real_work($P(a),$P(b),ns);
                 ',
  );

=begin html

=head3 pp_addpm

=end html

In many cases the actual PP code (meaning the arguments to C<pp_def>
calls) is only part of the package you are currently
implementing. Often there is additional perl code and XS code
you would normally have written into the pm and XS files which are now
automatically generated by PP. So how to get this stuff into those
dynamically generated files? Fortunately, there are a couple of
functions, generally called C<pp_addXXX> that assist you in doing
this.

Let's assume you have additional perl code that should go into the
generated B<pm>-file. This is easily achieved with the C<pp_addpm> command:

   pp_addpm(<<'EOD');

   =head1 NAME

   PDL::Lib::Mylib -- a PDL interface to the Mylib library

   =head1 DESCRIPTION

   This package implements an interface to the Mylib package with full
   threading and indexing support (see L<PDL::Indexing>).

   =cut

   use PGPLOT;

   =head2 use_myfunc

   This function applies the myfunc operation to all the
   elements of the input pdl regardless of dimensions
   and returns the sum of the result.

   =cut

   sub use_myfunc {
	my $pdl = shift;

	myfunc($pdl->clump(-1),(my $res=null));

	return $res->sum;
   }

   EOD

=begin html

=head3 pp_add_exported

=end html

You have probably got the idea. In some cases you also want to export
your additional functions. To avoid getting into trouble with PP which
also messes around with the C<@EXPORT> array you just tell PP to add
your functions to the list of exported functions:

  pp_add_exported('', 'use_myfunc gethynx');

Note the initial empty string argument (reason for it?).

=begin html

=head3 pp_add_isa

=end html

The C<pp_add_isa> command works like the C<pp_add_exported> function.
The arguments to C<pp_add_isa> are added to the @ISA list, e.g.

  pp_add_isa(' Some::Other::Class ');

=begin html

=head3 pp_addxs

=end html

Sometimes you want to add extra XS code of your own
(that is generally not involved with any threading/indexing issues
but supplies some other functionality you want to access from the perl
side) to the generated XS file, for example

  pp_addxs('','

  # Determine endianness of machine

  int
  isbigendian()
     CODE:
       unsigned short i;
       PDL_Byte *b;

       i = 42; b = (PDL_Byte*) (void*) &i;

       if (*b == 42)
          RETVAL = 0;
       else if (*(b+1) == 42)
          RETVAL = 1;
       else
          croak("Impossible - machine is neither big nor little endian!!\n");
       OUTPUT:
         RETVAL
  ');

Especially C<pp_add_exported> and C<pp_addxs> should be used with care. PP uses
PDL::Exporter, hence letting PP export your functions means that they get added
to the standard list of functions exported by default (the list defined by the
export tag ``:Func''). If you use C<pp_addxs> you shouldn't try to do anything
that involves threading or indexing directly. PP is much better at generating
the appropriate code from your definitions.
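
For comparison, the check performed by the C<isbigendian> XS function
above can be sketched in plain perl. This is only an illustration of
the same byte test using C<pack> and C<unpack>, not a replacement for
the XS code:

  # pack 42 as a native unsigned short, then read back its first byte
  my $first_byte = unpack('C', pack('S', 42));

  # same convention as isbigendian(): 0 for little, 1 for big endian
  my $is_big_endian = ($first_byte == 42) ? 0 : 1;
  print "isbigendian would return $is_big_endian\n";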

=begin html

=head3 pp_add_boot

=end html

Finally, you may want to add some code to the BOOT section of the XS file
(if you don't know what that is check I<perlxs>). This is easily done
with the C<pp_add_boot> command:

  pp_add_boot(<<'EOB');
	descrip = mylib_initialize(KEEP_OPEN);

	if (descrip == NULL)
	   croak("Can't initialize library");

	GlobalStruc->descrip = descrip;
	GlobalStruc->maxfiles = 200;
  EOB

=begin html

=head3 pp_export_nothing

=end html

By default, PP.pm puts all subs defined using the C<pp_def> function into the output .pm
file's EXPORT list. This can create problems if you are creating a subclassed
object where you don't want any methods exported (i.e. the methods will only
be called using the C<$object-E<gt>method> syntax).

For these cases you can call C<pp_export_nothing()> to clear out the export list. Example (at
the end of the .pd file):
	

  pp_export_nothing();
  pp_done();

=begin html

=head3 pp_core_importList

=end html

By default, PP.pm puts the 'use Core;' line into the output .pm file. This imports Core's
exported names into the current namespace, which can create
problems if you are overriding one of Core's methods in the current file.
You end up getting messages like "Warning: sub sumover redefined in file
subclass.pm" when running the program.

For these cases C<pp_core_importList> can be used to change what is imported from Core.pm.
For example:

  pp_core_importList('()');
      
This would result in 

  use Core();

being generated in the output .pm file, so that no names are imported
from Core.pm. Similarly, calling

  pp_core_importList(' qw/ barf /');
       
would result in
 
  use Core qw/ barf/;
        
being generated in the output .pm file. This would result in just 'barf'
being imported from Core.pm.


=head2 Slice operation

The slice operation section of this manual is provided using
dataflow and lazy evaluation: when you need it, ask Tjl to write it.
A delivery in a week from when I receive the email is 95% probable and
two week delivery is 99% probable.

And anyway, the slice operations require a much more intimate knowledge
of PDL internals than the data operations. Furthermore, the complexity
of the issues involved is considerably higher than that in the average
data operation. If you would like to convince yourself of this fact
take a look at the C<Basic/Slices/slices.pd> file in the PDL
distribution :-). Nevertheless,
functions generated using the slice operations are at the heart of the
index manipulation and dataflow capabilities of PDL.

Also, there are a lot of dirty issues with virtual piddles and
vaffines which we shall entirely skip here.

=head2 Makefiles for PP files

If you are going to generate a package from your PP file (typical file
extensions are C<.pd> or C<.pp> for the files containing PP code) it
is easiest and safest to leave generation of the appropriate commands
to the Makefile. In the following we will outline the typical format
of a perl Makefile to automatically build and install your package
from a description in a PP file. Most of the rules to build the xs, pm
and other required files from the PP file are already predefined in
the PDL::Core::Dev package. We just have to tell MakeMaker to use
it.

In most cases you can define your Makefile.PL like this:

  # Makefile.PL for a package defined by PP code.

  use PDL::Core::Dev;            # Pick up development utilities
  use ExtUtils::MakeMaker;

  $package = ["mylib.pd",Mylib,PDL::Lib::Mylib];
  %hash = pdlpp_stdargs($package);
  $hash{OBJECT} .= ' additional_Ccode$(OBJ_EXT) ';
  $hash{clean}->{FILES} .= ' todelete_Ccode$(OBJ_EXT) ';
  $hash{'VERSION_FROM'} = 'mylib.pd';
  WriteMakefile(%hash);

  sub MY::postamble { pdlpp_postamble($package); }

Here, the list in $package contains, in order, the PP source file name,
the prefix for the produced files and finally the whole package name.
You can modify the hash in whatever way you like but it would be reasonable
to stay within some limits so that your package will continue to work
with later versions of PDL.

If you don't want to use prepackaged arguments,
here is a generic Makefile.PL that you can adapt for your own
needs:

  # Makefile.PL for a package defined by PP code.

  use PDL::Core::Dev;            # Pick up development utilities
  use ExtUtils::MakeMaker;


  WriteMakefile(
   'NAME'  	=> 'PDL::Lib::Mylib',
   'VERSION_FROM'	=> 'mylib.pd',
   'TYPEMAPS'     => [&PDL_TYPEMAP()],
   'OBJECT'       => 'mylib$(OBJ_EXT) additional_Ccode$(OBJ_EXT)',
   'PM'		=> { 'Mylib.pm'            => '$(INST_LIBDIR)/Mylib.pm'},
   'INC'          => &PDL_INCLUDE(), # add include dirs as required by your lib
   'LIBS'         => [''],   # add link directives as necessary
   'clean'        => {'FILES'  =>
			  'Mylib.pm Mylib.xs Mylib$(OBJ_EXT)
			  additional_Ccode$(OBJ_EXT)'},
  );

  # Add genpp rule; this will invoke PDL::PP on our PP file
  # the argument is an array reference where the array has three string elements:
  #   arg1: name of the source file that contains the PP code
  #   arg2: basename of the xs and pm files to be generated
  #   arg3: name of the package that is to be generated
  sub MY::postamble { pdlpp_postamble(["mylib.pd",Mylib,PDL::Lib::Mylib]); }

To make life even easier PDL::Core::Dev defines the function C<pdlpp_stdargs>
that returns a hash with default values that can be passed (either
directly or after appropriate modification) to a call to WriteMakefile.
Currently, C<pdlpp_stdargs> returns a hash where the keys are filled in
as follows:

	(
	 'NAME'  	=> $mod,
	 'TYPEMAPS'     => [&PDL_TYPEMAP()],
	 'OBJECT'       => "$pref\$(OBJ_EXT)",
	 PM 	=> {"$pref.pm" => "\$(INST_LIBDIR)/$pref.pm"},
	 MAN3PODS => {"$src" => "\$(INST_MAN3DIR)/$mod.\$(MAN3EXT)"},
	 'INC'          => &PDL_INCLUDE(),
	 'LIBS'         => [''],
	 'clean'        => {'FILES'  => "$pref.xs $pref.pm $pref\$(OBJ_EXT)"},
	)

Here, C<$src> is the name of the source file with PP code, C<$pref> the
prefix for the generated .pm and .xs files and C<$mod> the name of the
extension module to generate.



=head2 INTERNALS

The internals of the current version consist of a large
table which gives the rules according to which things are translated
and the subs which implement these rules.

Later on, it would be good to make the table modifiable by the user
so that different things may be tried.

[Meta comment: here will hopefully be more in the future; currently,
your best bet will be to read the source code :-( or ask on the list
(try the latter first) ]

=head1 Appendix A: Some keys recognised by PDL::PP

Unless otherwise specified, the arguments are strings.

=over 4

=item Pars

define the signature of your function

=item OtherPars

arguments which are not pdls. Default: nothing.

=item Code

the actual code that implements the functionality; several PP macros and
PP functions are recognised in the string value

=item GenericTypes

An array reference. The array may contain any subset of the strings
`B', `S', `U', `L', `F' and `D', which specify which types
your operation will accept.
This is very useful (and important!) when interfacing an external library.
Default: [qw/B S U L F D/]

=item Doc

Used to specify a documentation string in Pod format. See PDL::Doc
for information on PDL documentation conventions. Note: in
the special case where the PP 'Doc' string is one line this is
implicitly used for the quick reference AND the documentation!

If the Doc field is omitted PP will generate default documentation
(after all it knows about the Signature).

If you really want the function NOT to be documented in any way at this point
(e.g. for an internal routine, or because you are doing it elsewhere in the
code) explicitly specify C<Doc =E<gt> undef>.

=back

=head1 Appendix B: PP macros and functions

=head2 Macros

=over 7

=item $I<variablename_from_sig>()

access a pdl (by its name) that was specified in the signature

=item $COMP(x)

access a value in the private data structure of this transformation (mainly
used to access an argument that is specified in the C<OtherPars> section)

=item $SIZE(n)

replaced at runtime by the actual size of a I<named> dimension (as specified
in the I<signature>)

=item $GENERIC()

replaced by the C type that is equal to the runtime type of the operation

=item $P(a)

a pointer access to the PDL named C<a> in the signature. Useful for
interfacing to C functions

=item $PP(a)

a physical pointer access to pdl C<a>; mainly for internal use

=item $TXXX(Alternative,Alternative)

expansion alternatives according to runtime type of operation,
where XXX is some string that is matched by /[BSULFD+]/.

=item $PDL(a)

return a pointer to the pdl data structure (pdl *) of piddle C<a>

=back

=head2 functions

=over 3

=item C<loop(DIMS) %{ ... %}>

loop over named dimensions; limits are generated automatically by PP

=item C<threadloop %{ ... %}>

enclose following code in a threadloop

=item C<types(TYPES) %{ ... %}>

execute following code if type of operation is any of C<TYPES>

=back

=head1 SEE ALSO

I<PDL>

For the concepts of threading and slicing check L<PDL::Indexing>.

L<PDL::Internals>

I<perlxs>, I<perlxstut>

=head1 CURRENTLY UNDOCUMENTED

RedoDimsCode, $RESIZE()

=head1 BUGS

PDL::PP is still, even in its rewritten form, too complicated.
It needs to be rethought a little as well as deconvoluted and
modularized some more (e.g. all the NS things).

After the rewrite, this can happen little by little, though.

=head2 Undocumented functions

The following functions have been added since this manual
was written and are as yet undocumented

=over 5

=item pp_export_nothing

=item pp_core_importList

=item pp_beginwrap

=item pp_setversion

=item pp_addbegin

=back

=head1 AUTHOR

Copyright(C) 1997 Tuomas J. Lukka (lukka@fas.harvard.edu), Karl
Glazebrook (kgb@aaocbn1.aao.GOV.AU) and Christian Soeller
(c.soeller@auckland.ac.nz). All rights reserved. Although destined for
release as a man page with the standard PDL distribution, it is not
public domain. Permission is granted to freely distribute verbatim
copies of this document provided that no modifications outside of
formatting be made, and that this notice remain intact.  You are
permitted and encouraged to use its code and derivatives thereof in
your own source code for fun or for profit as you see fit.