File: dirfile-format.5

libgetdata 0.11.0-17
.\" dirfile-format.5.  The dirfile format specification man page.
.\"
.\" Copyright (C) 2005, 2006, 2008, 2009, 2010, 2012, 2013, 2016, 2017
.\"               D. V. Wiebe
.\"
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.\"
.\" This file is part of the GetData project.
.\"
.\" Permission is granted to copy, distribute and/or modify this document
.\" under the terms of the GNU Free Documentation License, Version 1.2 or
.\" any later version published by the Free Software Foundation; with no
.\" Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
.\" Texts.  A copy of the license is included in the `COPYING.DOC' file
.\" as part of this distribution.
.\"
.hw name-space
.TH dirfile\-format 5 "19 January 2017" "Standards Version 10" "DATA FORMATS"
.SH NAME
dirfile\-format \(em the dirfile database format specification file
.SH DESCRIPTION
The
.I dirfile format specification
fully specifies the raw and derived time streams and auxiliary information
for a
.BR dirfile (5)
database.

The format specification is contained in one or more case-sensitive text files
located in the dirfile tree.  Each file is known as a
.IR fragment .
The primary fragment is the file called
.B format
located in the base dirfile directory.  This file may contain only part of
the format specification, and may reference other fragments (using the
.B /INCLUDE
directive) containing further format specification.  This inclusion mechanism
may be nested arbitrarily deep.

The explicit text encoding of these files is not specified by these Standards,
but it must be 7\-bit ASCII compatible. Examples of acceptable character
encodings include all the ISO\~8859 character sets
.RI ( i.e.
Latin\-1 through Latin\-10, among others), as well as the UTF\-8 encoding of
Unicode and UCS.

This document primarily describes the latest version of the Standards (Version
10); differences with previous versions are noted where relevant.  A complete
list of changes between versions is given in the
.B HISTORY
section below.

.SH SYNTAX
The format specification is composed of field specification lines and directive lines,
optionally separated by blank lines or lines containing only whitespace.
Lines are separated by the line-feed character (0x0A).  Unless escaped (see
below), the hash mark
.RB ( # )
is the comment delimiter; the comment delimiter, and any text following it to
the end of the line, is ignored.

.SS Tokens
Both field specification lines and directive lines consist of several tokens
separated by whitespace.  Whitespace consists of one or more whitespace
characters.  These are: space (0x20), horizontal tab (0x09), vertical tab
(0x0B), form-feed (0x0C), and carriage return (0x0D).  The first token of a
directive line is always a
.IR "reserved word" .
The first token of a field specification line is never a reserved word.  Any
amount of whitespace may precede the first token on a line.

Since tokens are separated by whitespace, to include a whitespace character in
a token, it must either be escaped by preceding it with a backslash character
.RB ( \e ),
or be replaced by a
.I character escape sequence
(see below), or else the token must be enclosed in quotation marks
.RB ( """" ).
The quotation marks themselves are stripped from the token. The
.I null-token
(that is, the token consisting of zero characters) may be specified by a pair
of quotation marks with nothing between them
.RB ( """""" ).
To include a literal quotation mark in a token, it must be escaped
.RB ( \e" ).
Similarly, a hash mark may be included in a token by including it in a quoted
token or else by escaping it
.RB ( \e# ),
otherwise the hash mark is understood as the comment delimiter.

It is a syntax error to have a line which contains unmatched quotation marks, or
in which the last character is the backslash character.

Several characters when escaped by a preceding backslash character are
interpreted as special characters in tokens.  The character escape sequences
are:
.RS
.TP
.B \ea
an alert (bell) character (ASCII 0x07 / U+0007)
.TP
.B \eb
a backspace character (ASCII 0x08 / U+0008)
.TP
.B \ee
an escape character (ASCII 0x1B / U+001B)
.TP
.B \ef
a form-feed character (ASCII 0x0C / U+000C)
.TP
.B \en
a line-feed character (ASCII 0x0A / U+000A)
.TP
.B \er
a carriage return character (ASCII 0x0D / U+000D)
.TP
.B \et
a horizontal tab character (ASCII 0x09 / U+0009)
.TP
.B \ev
a vertical tab character (ASCII 0x0B / U+000B)
.TP
.B \e\e
a backslash character (ASCII 0x5C / U+005C)
.TP
.BI \e ooo
the single byte given by the octal number
.I ooo 
(1 to 3 octal digits).
.TP
.BI \ex hh
the single byte given by the hexadecimal number
.I hh
(1 or 2 hexadecimal digits).
.TP
.BI \eu hhhhhhh
the UTF-8 byte sequence encoding the Unicode code point given by the hexadecimal
number
.I hhhhhhh
(1 to 7 hexadecimal digits).
.RE

Any other character which is escaped is interpreted as the character itself.
.RI ( i.e.
.B \ec
is interpreted as
.BR c ;
also, as pointed out above,
.B \e"
and
.B \e#
are interpreted as simply
.B """"
and
.BR # ,
without their special meanings).

No token may contain the NULL character (ASCII 0x00 / U+0000).  Furthermore,
although support is present to create UTF-8 byte sequences, tokens are not
required to be valid UTF-8 sequences.  Any byte sequence not containing the NULL
character forms a valid token.  However, there may be further restrictions on
allowed characters for a token in a particular situation (for example, when
used as a field name).

Standards Version 5 and earlier do not recognise the character escape sequences,
nor allow quoting of tokens. As a result, they prohibit both whitespace and the
comment delimiter from being used in tokens. 
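.PP
As an informal illustration (not part of the Standards), the tokenisation rules
above for Standards Version 6 and later might be sketched in Python as follows.
The name
.B tokenize
and the constants are hypothetical.  The numeric escape sequences (octal,
hexadecimal and Unicode) are deliberately not handled, and a comment delimiter
inside quotation marks is assumed to be literal.
.PP
.nf
```python
# Sketch of dirfile token splitting (Standards Version 6 and later).
# Numeric escapes are intentionally left out; all names are illustrative.
BACKSLASH, QUOTE, HASH = 0x5C, 0x22, 0x23
WHITESPACE = bytes([0x20, 0x09, 0x0B, 0x0C, 0x0D])
# Single-letter escapes: a, b, e, f, n, r, t, v
SIMPLE_ESCAPES = {
    ord("a"): 0x07, ord("b"): 0x08, ord("e"): 0x1B, ord("f"): 0x0C,
    ord("n"): 0x0A, ord("r"): 0x0D, ord("t"): 0x09, ord("v"): 0x0B,
}


def tokenize(line: bytes) -> list:
    """Split one line (without its line-feed) into a list of byte tokens."""
    tokens = []
    current = bytearray()
    in_token = False   # True once the current token has started
    quoted = False     # True while inside quotation marks
    i = 0
    while i < len(line):
        byte = line[i]
        if byte == BACKSLASH:
            if i + 1 == len(line):
                raise ValueError("line ends with a backslash")
            i += 1
            escaped = line[i]
            # Known escapes map to control characters; anything else
            # stands for itself (including space, quote and hash).
            current.append(SIMPLE_ESCAPES.get(escaped, escaped))
            in_token = True
        elif byte == QUOTE:
            quoted = not quoted
            in_token = True          # a pair of quotes yields the null-token
        elif quoted:
            current.append(byte)
        elif byte == HASH:
            break                    # comment: ignore the rest of the line
        elif byte in WHITESPACE:
            if in_token:
                tokens.append(bytes(current))
                current.clear()
                in_token = False
        else:
            current.append(byte)
            in_token = True
        i += 1
    if quoted:
        raise ValueError("unmatched quotation mark")
    if in_token:
        tokens.append(bytes(current))
    return tokens
```
.fi
.PP
Tokens are kept as byte strings because, as noted above, a token need not be
valid UTF-8.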

.SH DIRECTIVES

There are eleven
.IR directives ,
each specified by a different
.IR "reserved word" ,
which cannot be used as field names in the dirfile.  As of Standards Version 8,
all reserved words start with an initial forward slash
.RB ( / ),
to distinguish them from field names.  Standards Versions 5, 6, and 7 permitted
the omission of the initial forward slash, while in Standards Version 4 and
earlier, reserved words may not have an initial forward slash.  Like the rest of
the format specification, directives are case sensitive.

A number of the directives have
.IR "fragment scope" .
A directive with fragment scope only applies to the fragment in which it is
present, plus any sub-fragments indicated by the
.B /INCLUDE
directive, but only if those sub-fragments don't have their own corresponding
directive.  Directives which have fragment scope are:
.BR /ENCODING ", " /ENDIAN ", " /FRAMEOFFSET ", and " /PROTECT .
Because of these scoping rules, different portions of the dirfile may have
different encodings, endiannesses, frame offsets, or protection levels.

If a directive with fragment scope appears more than once in a fragment, only
the last such directive is honoured, with the exception that the effect of
a directive is not propagated to sub-fragments if the directive line
appears after the sub-fragment is included.  The scoping rules of the remaining
directives are discussed below.
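.PP
As an informal illustration (not part of the Standards), one reading of the
fragment scope rules above can be sketched in Python.  The function
.B effective_values
and the dictionary layout describing a fragment are hypothetical; a result of
None stands for "no directive in force", where behaviour is implementation
dependent.
.PP
.nf
```python
def effective_values(fragment, inherited=None, result=None):
    """Return {fragment name: value in force} for one fragment-scope directive.

    `fragment` is a hypothetical dict of the form
    {"name": str, "items": [("set", value) or ("include", sub_fragment), ...]}
    with the items listed in the order they appear in the fragment.
    """
    if result is None:
        result = {}
    value = inherited
    for kind, arg in fragment["items"]:
        if kind == "set":
            value = arg                           # a later directive replaces an earlier one
        elif kind == "include":
            effective_values(arg, value, result)  # child sees the value so far
    result[fragment["name"]] = value              # last directive wins for this fragment
    return result


child_a = {"name": "a", "items": []}
child_b = {"name": "b", "items": [("set", "little")]}
child_c = {"name": "c", "items": []}
root = {"name": "format", "items": [
    ("include", child_a),   # included before any directive: nothing to inherit
    ("set", "big"),
    ("include", child_b),   # has its own directive
    ("include", child_c),   # inherits "big"
    ("set", "little"),      # too late to affect the sub-fragments above
]}
values = effective_values(root)
```
.fi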

.TP
.B /ALIAS
The /ALIAS directive defines an alternate name for a field defined elsewhere in
the format specification (called the "target").  Aliases may not be used as the
parent field in a
.B /META
directive, but are in most other ways indistinguishable from the target's
original, canonical name.  Aliases may be chained (that is, the target name
appearing in an /ALIAS directive may itself be an alias).  In this case, the new
alias is another name for the target's own target.  Just as there is no
requirement that the input fields of a derived field exist,
it is not an error for the target of an alias to not exist.  Syntax is:
.RS
.IP
.B /ALIAS
.I <name> <target>
.RE
.IP
A metafield alias may be defined using the
.IR <parent-field> / <alias-name>
syntax for
.I name
in the /ALIAS directive.  No restriction is placed on
.IR target ;
specifically, a metafield alias may target a top-level field, or a metafield
with a different parent; conversely, a top-level alias may target a
metafield.
.IP
A metafield alias may never appear as the parent part of a metafield field code,
even if it refers to a top-level field.  That is, given the valid format:
.RS
.IP
aaaa \fBRAW UINT8\fR 1
.br
aaaa/bbbb \fBCONST FLOAT64\fR 0.0
.br
cccc \fBRAW UINT8\fR 1
.br
\fB/ALIAS\fR cccc/dddd aaaa
.RE
.IP
the metafield
.I aaaa/bbbb
may not be referred to as
.IR cccc/dddd/bbbb ,
even though
.I cccc/dddd
is a valid field code referring to
.IR aaaa .
.IP
This is not true of top-level aliases: if 
.I eeee
is an alias of
.IR ffff ,
then 
.IR ffff/gggg ,
a metafield of
.IR ffff ,
may be referred to as
.I eeee/gggg
as well.
.IP
The /ALIAS directive has no scope: it is processed immediately.  It appeared in
Standards Version 9.
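.IP
As an informal illustration (not part of the Standards), the alias rules above
might be sketched in Python as follows.  The function
.B resolve
and the two containers it takes are hypothetical, and the guard against
circular chains is an assumption of this sketch.
.IP
.nf
```python
def resolve(code, fields, aliases):
    """Return the canonical field code for `code`, or None if there is none.

    `fields` is a set of canonical field codes and `aliases` maps each
    alias name to its target; both containers are hypothetical.
    """
    seen = set()
    while code in aliases:              # follow a chain of aliases
        if code in seen:
            return None                 # guard against a circular chain
        seen.add(code)
        code = aliases[code]
    if code in fields:
        return code
    parent, slash, meta = code.partition("/")
    if slash and "/" not in meta:
        # Only a top-level name may stand in for the parent part.
        canonical = resolve(parent, fields, aliases)
        if canonical is not None and "/" not in canonical:
            candidate = canonical + "/" + meta
            if candidate in fields:
                return candidate
    return None


fields = {"aaaa", "aaaa/bbbb", "cccc", "ffff", "ffff/gggg"}
aliases = {"cccc/dddd": "aaaa", "eeee": "ffff", "hhhh": "eeee"}
```
.fi
.IP
With these tables, cccc/dddd and hhhh resolve to aaaa and ffff respectively,
eeee/gggg resolves to ffff/gggg, and cccc/dddd/bbbb resolves to nothing.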
.TP
.B /ENCODING
The /ENCODING directive specifies the encoding scheme used to encode binary
files in the dirfile.  The encoding scheme may be one of the predefined names
listed below, which are described in more detail in
.BR dirfile\-encoding (5),
or any other site-specific encoding scheme.  The predefined scheme names are:
.RS
.TP
.B none
The dirfile is unencoded.
.TP
.B bzip2
The dirfile is compressed using the bzip2 compression scheme.
.TP
.B flac
The dirfile is compressed using the flac compression scheme.
.TP
.B gzip
The dirfile is compressed using the gzip compression scheme.
.TP
.B lzma
The dirfile is compressed using the LZMA compression scheme.
.TP
.B slim
The dirfile is compressed using the slim compression scheme.
.TP
.B sie
The dirfile is sample-index encoded (a variant of run-length encoding).
.TP
.B text
The dirfile is text encoded.
.TP
.B zzip
The dirfile is compressed and encapsulated using the zzip compression scheme.
.TP
.B zzslim
The dirfile is compressed and encapsulated using a combination of the zzip
and slim compression schemes.
.PP
Implementations should fail gracefully when encountering an unknown encoding
scheme.  If no encoding scheme is specified, behaviour is implementation
dependent.  Syntax is:
.IP
.B /ENCODING \fI<scheme> \fR[\fI<enc-datum>\fR]
.PP
The
.I enc-datum
token provides additional data for certain encoding schemes; see
.BR dirfile\-encoding (5)
for details.  The form of enc-datum is not specified.
.PP
The /ENCODING directive has
.IR "fragment scope" .
It appeared in Standards Version 6.  The predefined schemes
.nh
.BR sie ", " zzip ", and " zzslim ,
.hy
and the optional
.I enc-datum
token, appeared in Standards Version 9; the predefined scheme
.B lzma
appeared in Standards Version 7; all other predefined schemes appeared in
Standards Version 6.
.RE
.TP
.B /ENDIAN
The /ENDIAN directive specifies the endianness of the raw data in the database.
The assumed endianness of raw data in dirfiles which omit this directive is
implementation dependent.  Syntax
is:
.RS
.IP
.B /ENDIAN
.RB "( " big " | " little " ) [ " arm " ]"
.PP
where the "arm" token should be included if double precision floating point data
are stored in the ARM middle-endian format.  The /ENDIAN directive has
.IR "fragment scope" .
It appeared in Standards Version 5.  The optional
.B arm
token appeared in Standards Version 8.
.RE
.TP
.B /FRAMEOFFSET
The /FRAMEOFFSET directive specifies the frame number of the first frame for
which data exists in binary files associated with
.B RAW
fields.  Syntax is:
.RS
.IP
.BI /FRAMEOFFSET\~ <integer>
.PP
The /FRAMEOFFSET directive has
.IR "fragment scope" .
It appeared in Standards Version 1.
.RE
.TP
.B /HIDDEN
The /HIDDEN directive indicates that the specified field name is
.IR hidden .
The difference (if any) between a field name which is
.I hidden
and one that is not is implementation dependent.  Hiddenness is not inherited
by metafields of the specified field.  Hiddenness applies to the name, not the
field itself; it does not hide all aliases of the field-name, and if field-name
is an alias, the alias is hidden, not its target.  Syntax is:
.RS
.IP
.BI /HIDDEN\~ <field-name>
.PP
A /HIDDEN directive must appear after the specification of
.IR field-name ,
(which occurs either in a field specification line, or an
.B /ALIAS
directive, or a
.B /META
directive) in the same fragment.
.PP
The /HIDDEN directive has no scope: it is processed immediately.  It appeared in
Standards Version 9.
.RE
.TP
.B /INCLUDE
The /INCLUDE directive specifies another file (called a
.IR "fragment" )
to parse for additional format specification for the dirfile.  The inclusion is
processed immediately, before the fragment containing the /INCLUDE directive
(the
.IR "parent fragment" )
is parsed further.  RAW fields specified in the included fragment are located in
the directory containing the fragment file, and not in the directory containing
the parent fragment, and the binary file encoding may be different for each
fragment.  The fragment may be specified either with an absolute path, or else a
path relative to the directory containing the parent fragment.
.IP
The /INCLUDE directive may optionally specify a
.I prefix
and/or
.I suffix
to apply to field names defined in the included fragment.  If present, affixes
are applied to all field-names (including aliases) defined in the included
fragment and any fragments it further includes.  Affixes nest, with the affixes
of the deepest inclusion innermost.  Affixes are not applied to the names of
binary files associated with
.B RAW
fields.  Syntax is:
.RS
.IP
\fB/INCLUDE \fI<file> \fR[\fI<namespace>\fB.\fR][\fI<prefix>\fR]
[\fI<suffix>\fR]
.PP
To specify only
.IR suffix ,
the null-token
.RB ( """""" )
may be used as
.IR prefix .
.PP
A
.I namespace
may also be specified in an /INCLUDE directive by prepending it to
.IR prefix .
The namespace and prefix are separated by a dot
.RB ( . ).
The dot is required whenever a namespace is specified: if the prefix is empty,
the third token should be just the namespace followed by a trailing dot.  If a
namespace is specified, that namespace, relative to the including fragment's
root namespace, becomes the root namespace of the included fragment.  If no
namespace is specified in the /INCLUDE directive, then the current namespace
(specified by a previous /NAMESPACE directive) is used as the root namespace
of the included fragment.  That is, if the current namespace is
.IR current_space ,
then the statement:
.IP
.B /INCLUDE \fIfile newspace\fB.
.PP
is equivalent to
.IP
.B /NAMESPACE \fInewspace
.br
.B /INCLUDE \fIfile
.br
.B /NAMESPACE \fIcurrent_space
.PP
As a result, if no namespace is provided, and there
has been no previous /NAMESPACE directive, the included fragment will have the
same root namespace as the including fragment.

The /INCLUDE directive has no scope: it is processed immediately.  It appeared
in Standards Version 3.  The optional
.I prefix
and
.I suffix
appeared in Standards Version 9.  The optional
.I namespace
appeared in Standards Version 10.
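.PP
As an informal illustration (not part of the Standards), the handling of the
optional /INCLUDE tokens might be sketched in Python as follows.  Both function
names are hypothetical.  Splitting the third token at its last dot is an
assumption of this sketch, based on the dot being reserved in field names.
.PP
.nf
```python
def split_namespace_prefix(token):
    """Split the third /INCLUDE token into (namespace, prefix)."""
    namespace, dot, prefix = token.rpartition(".")
    return (namespace, prefix) if dot else ("", token)


def apply_affixes(name, affixes):
    """Apply nested (prefix, suffix) pairs, listed outermost inclusion first."""
    for prefix, suffix in reversed(affixes):   # deepest inclusion is innermost
        name = prefix + name + suffix
    return name


# A field "x" defined two inclusions deep: the outer /INCLUDE uses
# prefix "A_" and suffix "_Z", the inner one prefix "B_" and suffix "_Y".
nested_name = apply_affixes("x", [("A_", "_Z"), ("B_", "_Y")])
```
.fi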
.RE
.TP
.B /META
The /META directive specifies a metafield attached to a particular parent
field.  The field metadata may be of any allowed type except
.BR RAW .
Metafields are retrieved in exactly the same way as regular field data, but the
.I field code
specified consists of the parent and metafield names joined with a forward
slash:
.RS
.IP
.IB <parent-field> / <meta-field>
.PP
Metafields may not be specified before their parent field has been specified.  Syntax is:
.IP
.B /META
.I <parent-field>
{field specification line}
.PP
The
.I <parent-field>
code may not be an alias.  As an illustration of this concept,
.IP
.B /META 
pfield meta
.B CONST FLOAT64
3.291882
.PP
provides a scalar metadatum called
.I meta
with value 3.291882 attached to the field
.IR pfield .
This particular metafield may be referred to by the
.I field code
"pfield/meta".  Note that different parent fields may have metafields with
the same name, since all references to metafields must include the parent
field name.  Metafields may not themselves have further sub-metafields.
.PP
As an alternative to the /META directive, starting with Standards Version 7,
a metafield may be specified by a standard field specification line, using
.IP
.IB <parent-field> / <meta-field>
.PP
as the field name.  That is, the above example metafield could have also been
specified as:
.IP
pfield/meta
.B CONST FLOAT64
3.291882
.PP
The /META directive has no scope: it is processed immediately.  It appeared in
Standards Version 6.
.RE
.TP
.B /NAMESPACE
The /NAMESPACE directive changes the
.I current namespace
for subsequent field specification lines.
Syntax is:
.RS
.IP
.BI /NAMESPACE\~ <subspace>
.PP
The
.I subspace
specified is relative to the current fragment's root namespace.  If
.I subspace
is the null-token
.RB ( """""" )
the current namespace will be set back to the root namespace.  Otherwise, the
current namespace will be changed to the concatenation of the root namespace
with subspace, with the two parts separated by a dot:
.IP
.IB rootspace . subspace
.PP
If
.I rootspace
is empty, the intervening dot is omitted, and the current namespace is simply
.IR subspace .
.PP
By default, all field codes, both field names for newly specified fields, and
field codes used as inputs to fields or targets for aliases, are placed in the
current namespace, unless they start with an initial dot, in which case the
current namespace is ignored, and they're placed instead in the
fragment's root namespace.  See the
.B Namespaces
section for further details.
.PP
The /NAMESPACE directive has no scope: it is processed immediately.  For the
effects of changing the current namespace on included fragments, see the
/INCLUDE directive above.  The effects of a /NAMESPACE directive never propagate
upwards to parent fragments.  It appeared in Standards Version 10.
.RE
.TP
.B /PROTECT
The /PROTECT directive specifies the advisory protection level of the current
fragment and of the
.B RAW
fields defined therein.  The protection level indicates whether writing to the
fragment, or to the binary data on disk, is permitted.  Syntax is:
.RS
.IP
.BI /PROTECT\~ <level>
.PP
Four advisory protection levels are defined:
.TP
.I none
No protection at all: data and metadata may be freely changed.  This is the
default, if no /PROTECT directive is present.
.TP
.I format
The dirfile metadata is protected from change, but
.B RAW
data on disk may be modified.
.TP
.I data
The
.B RAW
data on disk is protected from change, but metadata may be modified.
.TP
.I all
Both metadata and data on disk are protected from change.
.PP
The /PROTECT directive has
.IR "fragment scope" .
It appeared in Standards Version 6.
.RE
.TP
.B /REFERENCE
The /REFERENCE directive specifies the name of the field to use as the dirfile's
reference field (see
.BR dirfile (5)).
If no /REFERENCE directive is specified, the first
.B RAW
field encountered is used as the reference field.  The /REFERENCE directive must
specify a
.B RAW
field.  Syntax is:
.RS
.IP
.BI /REFERENCE\~ <field-code>
.PP
The /REFERENCE directive has
.IR "global scope" :
if multiple /REFERENCE directives appear in the dirfile metadata, only the last
such is honoured.  It appeared in Standards Version 6.
.RE
.TP
.B /VERSION
The /VERSION directive specifies the particular version of the Dirfile Standards
to which the dirfile format specification conforms.  This directive should
occur before any version dependent syntax is encountered.  As of Standards
Version 6, no such syntax exists, and this directive is provided primarily to
ease forward compatibility.  Syntax is:
.RS
.IP
.BI /VERSION\~ <integer>
.PP
The /VERSION directive has
.IR "immediate scope" :
its effect is immediate, and it applies only to metadata below it, including
and propagating downwards to sub-fragments after the directive.
.PP
In Standards Version 8 and earlier, its effect also propagates upwards back to
the parent fragment, and affects subsequent metadata.  Starting with Standards
Version 9, this no longer happens.  As a result, a /VERSION directive which
indicates a version of 9 or later never propagates upwards; additionally,
/VERSION directives found in subfragments included in a Version 9 or later
fragment aren't propagated upwards into that fragment, regardless of the
Version of the subfragments.  The /VERSION directive appeared in Standards
Version 5.
.RE

.SH FIELD SPECIFICATION LINES

Any line which does not start with a
.I reserved word
is assumed to be a field specification line.  A field specification line
consists of at least two tokens.  The first token is the
.IR "field name" .
The second token is the
.IR "field type" .
Subsequent tokens are field parameters.  The meaning and number of these
parameters depend on the field type specified.

.SS Field Names
The first token in a field specification line is the
.IR "field name" .
The field name consists of one or more
characters, excluding both ASCII control characters (the bytes 0x01 through
0x1F), and the characters
.IP
.B &\t/\t;\t<\t>\t|\t.
.PP
which are reserved (but see below for the use of
.B /
to specify metafields).
The dot
.RB ( . )
is allowed in Standards Version 5 and earlier.  The ampersand, semicolon,
less-than sign, greater-than sign, and vertical line
.RB ( "& ; < > |" )
are allowed in Standards Version 4 and earlier.  Furthermore, due to the lack
of an escape or quoting mechanism (see 
.B Tokens
above), Standards Version 5 and earlier also prohibit whitespace and the
comment delimiter
.RB ( # )
in field names.
.PP
The field name may not be
.IR INDEX ,
which is a special, implicit field which contains the integer frame index.
Standards Version 5 and earlier also prohibit
.IR FILEFRAM ,
which was an alias for
.IR INDEX .
Field names are case sensitive.  Standards Versions 3 and 4 restrict field names
to 50 characters. Standards Version 2 and earlier restrict field names to 16
characters. Additionally, the filesystem may put restrictions on the length 
and acceptable characters of a
.B RAW
field name, regardless of Standards Version. 
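.PP
As an informal illustration (not part of the Standards), the rules above for a
bare field name in the latest Standards Version (that is, a name without a
namespace tag or metafield parent) might be checked in Python as follows.  The
function name
.B is_valid_field_name
is hypothetical, and filesystem restrictions on
.B RAW
field names are not considered.
.PP
.nf
```python
RESERVED = set("&/;<>|.")   # characters reserved in field names


def is_valid_field_name(name):
    """Check a bare field name against the latest Standards Version rules."""
    if not name or name == "INDEX":
        return False
    # Reject NUL, the ASCII control characters and the reserved characters.
    return all(ord(ch) > 0x1F and ch not in RESERVED for ch in name)
```
.fi
.PP
Whitespace and the comment delimiter are accepted by this check, since they can
be quoted or escaped in Standards Version 6 and later.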

Starting in Standards Version 7, if the field name beginning a field
specification line contains exactly one forward slash character
.RB ( / ),
the line is assumed to specify a metafield.  See the
.B /META
directive above for further details.  A field name may not contain more than one
forward slash.

Starting in Standards Version 10, any field name may be preceded by a
.IR "namespace tag" .
The namespace tag and the field name are separated by a dot
.RB ( . ).
See the
.B Namespaces
section, following, for details.

.SS Namespaces
Beginning with Standards Version 10, every field in a Dirfile is contained in a
namespace.  Every namespace is identified by a
.I namespace tag
which consists of the same restricted set of characters used for field names.
Namespaces nest arbitrarily deep.  Subnamespaces are identified by concatenating
all namespace tags, separating tags by dots
.RB ( . ),
with the outermost namespace leftmost:
.RS
.IP
.IB topspace . subspace . subsubspace
.RE
.PP
Each fragment has an immutable
.IR "root namespace" .
The root namespace of the primary format file is the null namespace, identified
by the null-token
.RB ( """""" ).
The root namespace of other fragments is specified when they are introduced
(see the /INCLUDE directive).  Each fragment also has a
.I current namespace
which may be changed as often as needed using the /NAMESPACE directive, and
defaults to the root namespace.  The current namespace is always either the root
namespace or else a subspace under the root namespace.

If a field name or field code starts with a leading dot, then that name or code
is taken to be relative to the fragment's root space.  If it does not start with
a dot, it is taken to be relative to the current namespace.

For example, if both the root namespace and current namespace of a fragment
start off as
.IR rootspace ,
then:
.IP
.IB aaaa\~ "RAW UINT8 1"
.br
.BI . bbbb\~ "RAW UINT8 1"
.br
.IB cccc . dddd\~ "RAW UINT8 1"
.br
.BI . eeee . ffff\~ "RAW UINT8 1"
.br
.BI /NAMESPACE\~ newspace
.br
.IB gggg\~ "RAW UINT8 1"
.br
.BI . hhhh\~ "RAW UINT8 1"
.br
.IB iiii . jjjj\~ "RAW UINT8 1"
.br
.BI . kkkk . llll\~ "RAW UINT8 1"
.PP
specifies, respectively, the fields:
.IP
.IB rootspace . aaaa\fR,
.br
.IB rootspace . bbbb\fR,
.br
.IB rootspace . cccc . dddd\fR,
.br
.IB rootspace . eeee . ffff\fR,
.br
.IB rootspace . newspace . gggg\fR,
.br
.IB rootspace . hhhh\fR,
.br
.IB rootspace . newspace . iiii . jjjj\fR,
and
.br
.IB rootspace . kkkk . llll\fR.
.PP
Note that a field may specify deeper subspaces under either the root namespace
or the current namespace (meaning it is never necessary to use the /NAMESPACE
directive). Note also that there is no way for metadata in a given fragment to
refer to fields outside the fragment's root space.

There is one exception to this namespace scoping rule: the implicit
.I INDEX
vector is always in the null (top-level) namespace, and namespace tags specified
with it, either explicitly or implicitly, even a fragment root namespace, are
ignored.  So, in a fragment with root namespace
.IR rootspace ,
and current namespace
.IR rootspace\fB.\fIsubspace ,
.IP
.IR INDEX ,
.br
.BI . INDEX\fR,
.br
.IB namespace . INDEX\fR,
and
.br
.BI . namespace . INDEX\fR,
.PP
all refer to the same
.I INDEX
field.
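.PP
The scoping rules above can be sketched in a few lines of Python.  This is an
illustration only, not part of the Standards: the function name resolve and
its arguments are hypothetical, the current namespace is assumed to be passed
in fully qualified, and representation suffixes and metafields are ignored.
.nf
```python
def resolve(name, root, current):
    """Return the fully qualified field name for a name used in a fragment."""
    # INDEX always lives in the null namespace; any namespace tags are ignored.
    if name.split(".")[-1] == "INDEX":
        return "INDEX"
    if name.startswith("."):
        # A leading dot makes the name relative to the fragment's root space.
        base, rest = root, name[1:]
    else:
        # Otherwise the name is relative to the current namespace.
        base, rest = current, name
    # The null namespace is represented here by the empty string.
    return base + "." + rest if base else rest
```
.fi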

.SS Field Types
There are eighteen field types.  Of these, fourteen are of vector type
.RB ( BIT ", " DIVIDE ", " INDIR ", " LINCOM ", " LINTERP ", " MPLEX ,
.BR MULTIPLY ", " PHASE ", " POLYNOM ", " RAW ", " RECIP ", " SBIT ,
.BR SINDIR ", and " WINDOW )
and four are of scalar type
.RB ( CARRAY ", " CONST ", " SARRAY ", and " STRING ).
The thirteen vector field types other than
.B RAW
fields are also called
.IR "derived fields" ,
since they derive their value from one or more input vector fields.  Any other 
vector field may be used as an input vector, including the implicit
.I INDEX
field, but excluding
.B SINDIR
string vectors.
.PP
Five of these derived fields
.RB ( DIVIDE ", " LINCOM ", " MPLEX ", " MULTIPLY ", and " WINDOW )
have more than one vector input field.  In situations where these input fields
have differing sample rates, the sample rate of the derived field is the same
as the sample rate of the first (left-most) input field specified.  Furthermore,
the input fields are synchronised by aligning them on frame boundaries, assuming
equally-spaced sampling throughout a frame, and using the last sample of each
input field which did not occur after the sample of the derived field being
computed.  That is, if the first and second input fields have sample rates
.I s1
and
.IR s2 ,
the derived field also has sample rate
.I s1
and, for every sample of the derived field,
.IR n ,
the
.IR n 'th
sample of the first field is used (since they have the same sample rate by
definition), and the sample number used of the second field,
.IR m ,
is computed as:
.IP
\fIm\fR = \fBfloor\fR((\fIn\fR * \fIs2\fR) / \fIs1\fR).
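.PP
As a small worked illustration of this formula (the function name is
hypothetical and the sketch assumes non-negative integer sample numbers and
positive integer sample rates), the index into the second input can be
computed with integer floor division:
.nf
```python
def second_input_sample(n, s1, s2):
    """Sample number m of the second input used for derived sample n."""
    # Integer floor division evaluates floor((n * s2) / s1) without rounding error.
    return (n * s2) // s1
```
.fi
With s1 = 4 and s2 = 2, derived samples 0 through 7 use samples 0, 0, 1, 1,
2, 2, 3, 3 of the second input.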
.PP
Starting in Standards Version 6, certain scalar field parameters in the field
specifications may be specified using
.B CONST
fields or, since Standards Version 8,
.B CARRAY
fields, instead of literal values.  A list of parameters for which this is
allowed is given below in the
.B Field Parameters
section.
.PP
The possible field types are:
.TP
.B BIT
The BIT vector field type extracts one or more bits out of an input vector
field as an unsigned number.  Syntax is:
.RS
.IP
.I <fieldname>
.B BIT
.I <input> <first-bit> \fR[\fI<num-bits>\fR]
.PP
which specifies
.I fieldname
to be
.I num-bits
bits extracted from the input vector field
.I input
starting with bit number
.I first-bit
(counting from the least-significant bit, which is numbered zero), after
.I input
has been converted from its native type to an (endianness corrected) unsigned
64-bit integer.  If
.I num-bits
is omitted, it is assumed to be one.

The extracted bits are interpreted as an unsigned integer; the
.B SBIT
field type is a signed version of this field type.  The optional
.I num-bits
parameter appeared in Standards Version 1.
.RE
.TP
.B CARRAY
The CARRAY scalar field type is a list of constants fully specified in the
format specification metadata.  Syntax is:
.RS
.IP
.I <fieldname>
.B CARRAY
.I <type> <value0> <value1> <value2> \fR...
.PP
where
.I type
may be any supported native data type (see the description of the
.B RAW
field type below), and
.IR value0 ", " value1 ,
&c. are the values of successive elements in the scalar list interpreted as
indicated by
.IR type .
No limit is placed on the number of elements in a
.BR CARRAY .
(Note: despite being multivalued, this is not considered a vector field since
the elements of the
.B CARRAY
are not indexed by frames.)  CARRAY appeared in Standards Version 8.
.RE
.TP
.B CONST
The CONST scalar field type is a constant fully specified in the format
specification metadata.  Syntax is:
.RS
.IP
.I <fieldname>
.B CONST
.I <type> <value>
.PP
where
.I type
may be any supported native data type (see the description of the
.B RAW
field type below), and
.I value
is the numerical value of the constant interpreted as indicated by
.IR type .
CONST appeared in Standards Version 6.
.RE
.TP
.B DIVIDE
The DIVIDE vector field type is the quotient of two vector fields.  Syntax is:
.RS
.IP
.I <fieldname>
.B DIVIDE
.I <field1> <field2>
.PP
The derived field is computed as:
.IP
fieldname = field1 / field2.
.PP
DIVIDE appeared in Standards Version 8.
.RE
.TP
.B INDIR
The INDIR vector field type performs an indirect translation of a CARRAY scalar
field to a derived vector field based on a vector index field.  Syntax is:
.RS
.IP
.I <fieldname>
.B INDIR
.I <index> <array>
.PP
where
.I index
is the vector field, which is converted to an integer type, if necessary, and
.I array
is the CARRAY field.  The
.IR n th
sample of the INDIR field is the value of the
.IR m th
element of
.IR array
(counting from zero), where
.I m
is the value of the
.IR n th
sample of
.IR index .
When
.I index
is not a valid element number of
.IR array ,
the corresponding value of the INDIR is implementation dependent.  INDIR
appeared in Standards Version 10.
.RE
.TP
.B LINCOM
The LINCOM vector field type is the linear combination of one, two or three
input vector fields.  Syntax is:
.RS
.IP
.I <fieldname>
.B LINCOM
.RI [ <n> "] " "<field1> <a1> <b1> " [ "<field2> <a2> <b2> " [ "<field3> <a3>"
.IR <b3> ]]
.PP
where
.IR n ,
if present, indicates the number of input vector fields (1, 2, or 3).  The
derived field is computed as:
.IP
fieldname = (a1 * field1 + b1) + (a2 * field2 + b2) + (a3 * field3 + b3)
.PP
with the
.I field2
and
.I field3
terms included only if specified.

If
.I n
is not specified, the number of fields is determined by looking at the supplied
parameters.  Since it is possible to create a field code which is identical to
a literal number, the third token on the line is assumed to be
.I n
if the entire token can be parsed as a literal number using the rules
outlined in
.BR strtod (3).
That is, if the field code specifying
.I field1
could be mistaken for a literal number,
.I n
must be specified to prevent ambiguity.  In Standards Version 6 and earlier,
.I n
is mandatory.
.RE
.TP
.B LINTERP
The LINTERP vector field type specifies a table look up based on another vector
field.  Syntax is:
.RS
.IP
.I <fieldname>
.B LINTERP
.I <input> <table>
.PP
where
.I input
is the input vector field for the table lookup, and
.I table
is the path to the lookup table file for the field.  If this path is relative,
it is assumed to be relative to the directory containing the fragment defining
this field.  The lookup table file is an ASCII text file with two whitespace
separated columns of
.I x
and
.I y
values.  Values are linearly interpolated between the points specified in the
lookup table.
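.PP
A minimal Python sketch of the interpolation step follows.  It assumes the
table has already been read into a list of (x, y) pairs sorted by strictly
increasing x with at least two entries; the function name is hypothetical, and
the handling of inputs outside the table (linear extension of the end segment)
is an assumption of this sketch, since the text above does not specify it.
.nf
```python
from bisect import bisect_right

def linterp(x, table):
    """Linearly interpolate x in a sorted list of (x, y) pairs."""
    xs = [point[0] for point in table]
    # Pick the segment containing x.  Clamping the segment number to the
    # table ends is this sketch's own choice for out-of-range inputs.
    i = min(max(bisect_right(xs, x), 1), len(xs) - 1)
    x0, y0 = table[i - 1]
    x1, y1 = table[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
```
.fi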
.RE
.TP
.B MPLEX
The MPLEX vector field type permits the multiplexing of several low sample rate
fields into a single data field of higher sample rate.  Syntax is:
.RS
.IP
.I <fieldname>
.B MPLEX
.I <input> <index> <count> \fR[\fI<period>\fR]
.PP
where
.I input
is the input vector containing the multiplexed fields,
.I index
is the vector containing the multiplex index,
.I count
is the value of the multiplex index when the computed field is stored in
.IR input ,
and
.IR period ,
if present and non-zero, is the number of samples between successive occurrences
of the value
.I count
in the index vector.  A
.I period 
of zero (or, equivalently, its omission) indicates that either the value
.I count
is not equally spaced in the index vector, or else that the spacing is unknown. 
Both
.I count
and
.I period
are integers, and
.I period
may not be negative.
.PP
At every sample
.IR n ,
the derived field is computed as:
.IP
fieldname[n] = (index[n] == count) ? input[n] : fieldname[n - 1]
.PP
The
.I index
vector is converted to an integer type for comparison.  The value of the
derived field before the first sample where
.I index
equals
.I count
is implementation dependent.
.PP
The values of
.I count
and
.I period
place no restrictions on values contained in
.IR index .
Specifically, particular values of
.I index
(including
.IR count )
need not be equally spaced (neither by
.I period
nor any other spacing);
.I index
need not ever take on the value
.I count
(in which case the value of the entirety of the derived field is
implementation dependent).  Different MPLEX field definitions which use the
same index vector may specify different
.IR period s.
MPLEX appeared in Standards Version 9.
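.PP
The sample-by-sample rule can be sketched in Python as below.  The function
name and the initial argument are hypothetical; the sketch assumes
.I input
and
.I index
have the same sample rate, and it uses a caller-supplied placeholder for the
implementation-dependent samples before the first match.  The
.I period
parameter is only a hint and plays no part in the computation.
.nf
```python
def mplex(input_vec, index_vec, count, initial=None):
    """Demultiplex input_vec, holding the last value seen where index == count."""
    out = []
    last = initial  # placeholder: the value before the first match is unspecified
    for value, idx in zip(input_vec, index_vec):
        if int(idx) == count:  # the index is compared as an integer
            last = value
        out.append(last)
    return out
```
.fi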

.RE
.TP
.B MULTIPLY
The MULTIPLY vector field type is the product of two vector fields.  Syntax is:
.RS
.IP
.I <fieldname>
.B MULTIPLY
.I <field1> <field2>
.PP
The derived field is computed as:
.IP
fieldname = field1 * field2.
.PP
MULTIPLY appeared in Standards Version 2.
.RE
.TP
.B PHASE
The PHASE vector field type shifts an input vector field by the specified number
of samples.  Syntax is:
.RS
.IP
.I <fieldname>
.B PHASE
.I <input> <shift>
.PP
which specifies
.I fieldname
to be the input vector field,
.IR input ,
shifted by
.I shift
samples.  A positive
.I shift
indicates a forward shift, towards the end-of-field.  Results of shifting past
the beginning- or end-of-field are implementation dependent.  PHASE appeared in
Standards Version 4.
.RE
.TP
.B POLYNOM
The POLYNOM vector field type specifies a polynomial function of a single input
vector field.  Syntax is:
.RS
.IP
.I <fieldname>
.B POLYNOM
.I <input> <a0> <a1>
.RI [ <a2> " [" <a3> " [" <a4> " [" <a5> ]]]]
.PP
where
.I <input>
is the input field code, and the order of the computed polynomial is determined
by how many co-efficients are present in the specification.  The derived field
is computed as:
.IP
fieldname = a0 + a1 * input + a2 * input**2 + a3 * input**3 + a4 * input**4
+ a5 * input**5
.PP
where
.I **
is the element-wise exponentiation operator, and the higher order terms are
computed only if the corresponding co-efficients
.RI a i
are specified.  POLYNOM appeared in Standards Version 7.
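.PP
For illustration, the polynomial can be evaluated for a single input sample as
in the following Python sketch.  The function name is hypothetical, and
Horner's rule is used here merely as a convenient evaluation order; the text
above does not prescribe any particular method.
.nf
```python
def polynom(x, coeffs):
    """Evaluate a0 + a1*x + a2*x**2 + ... for coeffs = [a0, a1, ...]."""
    if not 2 <= len(coeffs) <= 6:
        raise ValueError("POLYNOM takes between two and six coefficients")
    result = 0
    # Horner's rule: fold in the coefficients from the highest order down.
    for a in reversed(coeffs):
        result = result * x + a
    return result
```
.fi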
.RE
.TP
.B RAW
The RAW vector field type specifies raw time streams on disk.  In this case, the
field name should correspond to the name of the file containing the time stream.
Syntax is:
.RS
.IP
.I <fieldname>
.B RAW
.I <type> <sample-rate>
.PP
where
.I sample-rate
is the number of samples per dirfile frame for the time stream and
.I type
is a token specifying the native data type:
.RS
.TP
.I UINT8
unsigned 8-bit integer
.TP
.I INT8
two's complement signed 8-bit integer
.TP
.I UINT16
unsigned 16-bit integer
.TP
.I INT16
two's complement signed 16-bit integer
.TP
.I UINT32
unsigned 32-bit integer
.TP
.I INT32
two's complement signed 32-bit integer
.TP
.I UINT64
unsigned 64-bit integer
.TP
.I INT64
two's complement signed 64-bit integer
.TP
.I FLOAT32
IEEE-754 standard 32-bit single precision floating point number
.TP
.I FLOAT64
IEEE-754 standard 64-bit double precision floating point number
.TP
.I COMPLEX64
a 64-bit complex number consisting of two IEEE-754 standard 32-bit single
precision floating point numbers representing the real and imaginary parts of
the complex number (Standards Version 7 and later)
.TP
.I COMPLEX128
a 128-bit complex number consisting of two IEEE-754 standard 64-bit double
precision floating point numbers representing the real and imaginary parts of
the complex number (Standards Version 7 and later).
.RE

For more information on the storage of complex valued data, see
.BR dirfile (5).
Two additional type names exist:
.I FLOAT
is equivalent to
.IR FLOAT32 ,
and
.I DOUBLE
is equivalent to
.IR FLOAT64 .
Standards Version 9 deprecates these two aliases, but still allows them.

All these type names (except those for complex data, which came later) were
introduced in Standards Version 5.  Earlier Standards Versions specified data
types with single-character type aliases:

.RS
.TP
.I c
UINT8
.TP
.I u
UINT16
.TP
.I s
INT16
.TP
.I U
UINT32
.TP
.IR i ", " S
INT32
.TP
.I f
FLOAT32
.TP
.I d
FLOAT64
.RE

Types
.IR INT8 ", " UINT64 ", " INT64 ", " COMPLEX64 ,
and
.I COMPLEX128
are not supported before Standards Version 5, so no single-character type
aliases exist for these types.  These single-character type aliases were
deprecated in Standards Version 5 and removed in Standards Version 8.
.RE
.TP
.B RECIP
The RECIP vector field type computes the reciprocal of a single input vector
field.  Syntax is:
.RS
.IP
.I <fieldname>
.B RECIP
.I <input> <dividend>
.PP
where
.I <input>
is the input field code and
.I <dividend>
is a scalar quantity.  The derived field is computed as:
.IP
fieldname = dividend / input.
.PP
RECIP appeared in Standards Version 8.
.RE
.TP
.B SARRAY
The SARRAY scalar field type is a list of strings fully specified in the format
file metadata.  Syntax is:
.RS
.IP
.I <fieldname>
.B SARRAY
.I <string0> <string1> <string2> \fR...
.PP
Each
.I string
is a single token.  To include whitespace in a string, enclose it in quotation
marks
.RB ( """" ),
or else escape the whitespace with the backslash character
.RB ( \e ).
No limit is placed on the number of elements in a 
.BR SARRAY .
SARRAY appeared in Standards Version 10.
.RE
.TP
.B SBIT
The SBIT vector field type extracts one or more bits out of an input vector
field as a (two's-complement) signed number.  Syntax is:
.RS
.IP
.I <fieldname>
.B SBIT
.I <input> <first-bit> \fR[\fI<num-bits>\fR]
.PP
which specifies
.I fieldname
to be
.I num-bits
bits extracted from the input vector field
.I input
starting with bit number
.I first-bit
(counting from the least-significant bit, which is numbered zero), after
.I input
has been converted from its native type to an (endianness corrected) two's
complement signed 64-bit integer.  If
.I num-bits
is omitted, it is assumed to be one.

The extracted bits are interpreted as a two's complement signed integer of the
specified width. (So,
if
.I num-bits
is, for example, one, then the field can take on the value zero or negative
one.)  The
.B BIT
field type is an unsigned version of this field type.  SBIT appeared in
Standards Version 7.
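.PP
The relationship between BIT and SBIT can be sketched in Python as follows.
The function names are hypothetical, and the sketch assumes the input sample
is an integer that has already been converted from its native type; masking
with 64 one-bits stands in for the 64-bit conversion described above.
.nf
```python
MASK64 = (1 << 64) - 1

def bit(value, first_bit, num_bits=1):
    """BIT: extract num_bits bits starting at first_bit as an unsigned number."""
    return ((value & MASK64) >> first_bit) & ((1 << num_bits) - 1)

def sbit(value, first_bit, num_bits=1):
    """SBIT: the same bits, read as a two's complement signed number."""
    raw = bit(value, first_bit, num_bits)
    sign = 1 << (num_bits - 1)
    # If the top extracted bit is set, the value is negative.
    return raw - (1 << num_bits) if raw & sign else raw
```
.fi
With a width of one, sbit returns either zero or negative one, as noted above.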
.RE
.TP
.B SINDIR
The SINDIR vector field type performs an indirect translation of a SARRAY
scalar field to a derived vector field of strings based on a vector index field.
Syntax is:
.RS
.IP
.I <fieldname>
.B SINDIR
.I <index> <array>
.PP
where
.I index
is the vector field, which is converted to an integer type, if necessary, and
.I array
is the SARRAY field.  The
.IR n th
sample of the SINDIR field is the string value of the
.IR m th
element of
.IR array
(counting from zero), where
.I m
is the value of the
.IR n th
sample of
.IR index .
When
.I index
is not a valid element number of
.IR array ,
the corresponding value of the SINDIR is implementation dependent.  SINDIR
appeared in Standards Version 10.
.RE
.TP
.B STRING
The STRING scalar field type is a character string fully specified in the format
file metadata.  Syntax is:
.RS
.IP
.I <fieldname>
.B STRING
.I <string>
.PP
where
.I string
is the string value of the field.  Note that
.I string
is a single token.  To include whitespace in the string, enclose
.I string
in quotation marks
.RB ( """" ),
or else escape the whitespace with the backslash character
.RB ( \e ).
STRING appeared in Standards Version 6.
.RE
.TP
.B WINDOW
The WINDOW vector field type isolates a portion of an input vector based on a 
comparison.  Syntax is:
.RS
.IP
.I <fieldname>
.B WINDOW
.I <input> <check> <op> <threshold>
.PP
where
.I input
is the vector containing the data to extract,
.I check
is the vector on which to test the comparison,
.I threshold
is the value against which
.I check
is compared, and
.I op
is one of the following tokens indicating the particular comparison performed:
.RS
.TP
.I EQ
data are extracted where
.IR check ,
converted to a 64-bit signed integer, equals
.IR threshold ,
.TP
.I GE
data are extracted where
.IR check ,
converted to a 64-bit floating-point number, is greater than or equal to
.IR threshold ,
.TP
.I GT
data are extracted where
.IR check ,
converted to a 64-bit floating-point number, is strictly greater than
.IR threshold ,
.TP
.I LE
data are extracted where
.IR check ,
converted to a 64-bit floating-point number, is less than or equal to
.IR threshold ,
.TP
.I LT
data are extracted where
.IR check ,
converted to a 64-bit floating-point number, is strictly less than
.IR threshold ,
.TP
.I NE
data are extracted where
.IR check ,
converted to a 64-bit signed integer, is not equal to
.IR threshold ,
.TP
.I SET
data are extracted where at least one bit set in
.IR threshold
is also set in
.IR check ,
when converted to a 64-bit unsigned integer,
.TP
.I CLR
data are extracted where at least one bit set in
.IR threshold
is not set in
.IR check ,
when converted to a 64-bit unsigned integer.
.RE
.PP
The storage type of
.I threshold
depends on the operator, and follows the interpretation of
.IR check .
It may never be complex valued.
.PP
Outside the region extracted, the value of the derived field is implementation
dependent.
.PP
Note: with the
.B EQ
operator, this derived field type is very similar to the MPLEX field type above.
The primary difference is that MPLEX mandates the value of the derived field
outside the extracted region, while WINDOW does not.  WINDOW appeared in
Standards Version 9.
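.PP
The comparison operators can be sketched in Python as below.  The names
WINDOW_OPS and window, and the fill argument, are hypothetical; the sketch
assumes
.I input
and
.I check
have the same sample rate and uses a caller-supplied placeholder for the
implementation-dependent samples outside the extracted region.
.nf
```python
MASK64 = (1 << 64) - 1

# Each operator maps (check, threshold) to True where data are extracted.
WINDOW_OPS = {
    "EQ": lambda c, t: int(c) == int(t),
    "NE": lambda c, t: int(c) != int(t),
    "GE": lambda c, t: float(c) >= float(t),
    "GT": lambda c, t: float(c) > float(t),
    "LE": lambda c, t: float(c) <= float(t),
    "LT": lambda c, t: float(c) < float(t),
    # SET: at least one bit set in threshold is also set in check.
    "SET": lambda c, t: (int(c) & int(t) & MASK64) != 0,
    # CLR: at least one bit set in threshold is not set in check.
    "CLR": lambda c, t: (~int(c) & int(t) & MASK64) != 0,
}

def window(input_vec, check_vec, op, threshold, fill=None):
    """Keep input samples where the comparison holds; use fill elsewhere."""
    test = WINDOW_OPS[op]
    return [v if test(c, threshold) else fill
            for v, c in zip(input_vec, check_vec)]
```
.fi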
.RE

.SS Field Parameters
All input vector field parameters should be
.I field codes
(see below).  Additionally, the scalar field parameters listed may be either
literal numbers or else the
.I field code
of a
.B CONST
field containing the value, or the
.I field code
of a
.B CARRAY
followed by a left angle bracket
.RI ( < ),
then a non-negative integer used as the
.B CARRAY
element index, then a right angle bracket
.RI ( > ),
that is:
.IP
.IB fieldcode < n >
.PP
If the angle
brackets and element index are omitted from a
.B CARRAY
field code used as a parameter, the first element in the field (index zero) is
assumed.
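.PP
A parser might separate the element index from the field code roughly as in
this Python sketch.  The function name is hypothetical, and the sketch assumes
the token has already been determined not to be a literal number.
.nf
```python
def split_scalar_code(token):
    """Split a token of the form fieldcode<n> into (fieldcode, n)."""
    if token.endswith(">") and "<" in token:
        code, _, index = token[:-1].rpartition("<")
        if index.isascii() and index.isdigit():
            return code, int(index)
    # No angle brackets: the first element (index zero) is assumed.
    return token, 0
```
.fi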
.PP
Field parameters which may be specified using a scalar field code are:
.RS
.TP
.BR BIT ", " SBIT
.IR first-bit ", " num-bits
.TP
.B LINCOM
any of the
.IR a "i or " b i
.TP
.B MPLEX
.IR count ", " period
.TP
.B PHASE
.I shift
.TP
.B POLYNOM
any of the
.IR a i
.TP
.B RAW
.I sample-rate
.TP
.B RECIP
.I dividend
.TP
.B WINDOW
.I threshold
.RE
.PP
Since it is possible to create a field code which is identical to a literal
number, a parameter is assumed to be the field code of a scalar field only if
the entire token cannot be parsed as a literal number using the rules outlined
in
.BR strtod (3).
For example, a
.B CONST
field whose field code consists solely of digits can never be used as a
parameter in a field specification line.

Starting in Standards Version 7, a literal complex number is specified as two
real (floating point) numbers separated by a semicolon
.RB ( ; )
with no intervening whitespace.  So, for example, the tokens
.IP
1;0 \t 0;1 \t 4;0 \t 0;5 \t 9.313e2;74.1
.PP
represent, respectively, the real unit, the imaginary unit, the real number
four, the imaginary number
.RI 5 i ,
and the complex number
.RI "931.3 + 74.1" i .
Because the semicolon character cannot be used in field names, a complex valued
literal can never be mistaken for a field code.  This allows, among other
things, the composition of complex valued fields from purely real input fields.
For example, a complex valued field,
.IR z ,
may be created from a real valued field
.IR re ,
representing the real part of the complex number, and the real valued field
.IR im ,
representing the imaginary part of the complex number, with the following
.B LINCOM
specification:
.IP
.I z
.B LINCOM
.I re
1 0
.I im
0;1 0
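.PP
As an illustration, the semicolon notation can be parsed as in the Python
sketch below.  The function name is hypothetical, and Python's float is used
only as a stand-in for the
.BR strtod (3)
rules: the two do not accept exactly the same set of tokens.
.nf
```python
def parse_literal(token):
    """Parse re;im as a complex number, or a plain token as a real number."""
    real, sep, imag = token.partition(";")
    if sep:
        return complex(float(real), float(imag))
    return float(token)

# The LINCOM example above: z = (1 * re + 0) + (0;1 * im + 0)
re_part, im_part = 3.0, 4.0
z = (parse_literal("1") * re_part + parse_literal("0")
     + parse_literal("0;1") * im_part + parse_literal("0"))
```
.fi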
.PP
Starting in Standards Version 9, in addition to decimal notation, literal
integer parameters may be specified as hexadecimal numbers, by prefixing the
number (after an optional
.RB ' + '
or
.RB ' - '
sign) with
.B 0x
or
.BR 0X ,
or as octal numbers, by prefixing the number with
.BR 0 ,
as described in
.BR strtol (3).
Similarly, floating point literal numbers (both purely real ones and
components of complex literals) may be specified in hexadecimal by prefixing
them with
.B 0x
or
.BR 0X ,
and using
.B p
or
.B P
as the binary exponent prefix, as described in the C99 standard.  Both uppercase
and lowercase hexadecimal digits may be used.  In cases where a literal
floating point number may appear, the tokens
.B INF
or
.BR INFINITY ,
optionally preceded by a
.RB ' + '
or
.RB ' - '
sign, and
.BR NAN ,
optionally immediately followed by
.RB ' ( ',
then a sequence of characters, then
.RB ' ) ',
and all disregarding case, will be interpreted as the special floating point
values explained in
.BR strtod (3).

.SS Field Codes
When specifying the input to a field, either as a scalar parameter, or as an
input vector field to a
.RB non- RAW
vector field,
.I field codes
are used.  A
.I field code
consists of, in order:
.IP \(bu 4
(since Standards Version 10:) optionally, a leading dot
.RB ( . ),
indicating this field code is relative to the fragment's root namespace.
Without the leading dot, the field code is taken to be relative to the current
namespace.  (See the discussion in the 
.B Namespaces
section above for details.)
.IP \(bu 4
(since Standards Version 10:) optionally, a non-null
.I subnamespace
followed by a dot
.RB ( . )
indicating a subspace under the current or root namespace.  The subnamespace may
be made up of any number of namespace tags separated by dots, to nest deeper in
the namespace tree.
.IP \(bu 4
(since Standards Version 6:) if the field in question is a metafield
(see the
.B /META
directive above), the field name of the metafield's parent (which may be an
alias) followed by a forward slash
.RB ( / ).

.IP \(bu 4
a simple field name, possibly an alias, indicating a vector or scalar field
.IP \(bu 4
(since Standards Version 7:) optionally, a dot
.RB ( . )
followed by a
.IR "representation suffix" .
.PP
A 
.IR "representation suffix"
may be used to extract a real number from a complex value.  The available
suffixes (listed here with their preceding dot) and their meanings are:
.TP
.B .a
the argument of the input, that is, the angle (in radians) between the positive
real axis and the input.  The argument is in the range [-pi, pi], and a branch
cut exists along the negative real axis.  At the branch cut, -pi is returned if
the imaginary part is -0, and pi is returned if the imaginary part is +0.  If
the input is zero, zero is returned.
.TP
.B .i
the imaginary part of the input
.RI ( i.e. \~the
projection of the input onto the imaginary axis)
.TP
.B .m
the modulus of the input
.RI ( i.e. \~its
absolute value).
.TP 
.B .r
the real part of the input
.RI ( i.e. \~the
projection of the input onto the real axis)
.TP
.B .z
(since Standards Version 10:) the identity representation: it returns the full
complex value, equivalent to simply omitting the suffix completely.  It is only
needed in certain cases to force the correct interpretation of a field code in
the presence of a namespace tag.  To wit, the field code
.RS
.IP
name.r
.PP
may be interpreted as the real-part (via the
.B .r
representation suffix)
of the field called
.I name
(if such a field exists).  To refer to a field called
.I r
in the
.I name
namespace, the field code must be written:
.IP
name.r.z
.PP
NB: The first interpretation only occurs with valid representation suffixes; the
field code:
.IP
name.q
.PP
is interpreted as the field
.I q
in the
.I name
namespace because
.B .q
is not a valid representation suffix.  Furthermore, ambiguity arises only if
both fields "name" and "name.r" are defined.  If the field "name" does
not exist, but the field "name.r" does, then the original field code is not
ambiguous.  This is the only representation suffix allowed on
.BR SARRAY ,
.BR SINDIR ,
and
.BR STRING
field codes.
.RE
.PP
If the specified field is purely real, representations are calculated as
if the imaginary part were equal to +0.
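.PP
The numeric representations can be sketched in Python with the standard cmath
module.  The function name is hypothetical and the suffix is passed without
its leading dot; the explicit test for a zero input is this sketch's way of
returning zero for the argument of zero, as described above.
.nf
```python
import cmath

def represent(value, suffix="z"):
    """Apply a representation suffix to a real or complex sample."""
    z = complex(value)  # a purely real input gets an imaginary part of +0
    if suffix == "r":
        return z.real
    if suffix == "i":
        return z.imag
    if suffix == "m":
        return abs(z)
    if suffix == "a":
        # cmath.phase returns a value in [-pi, pi], cut along the negative real axis.
        return 0.0 if z == 0 else cmath.phase(z)
    if suffix == "z":
        return z
    raise ValueError("unknown representation suffix: " + suffix)
```
.fi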

.SH HISTORY

This document describes Versions 10 and earlier of the Dirfile Standards.

Version 10 of the Standards (January 2017) added the
.BR INDIR ", " SARRAY ,
and
.B SINDIR
field types, namespaces, the
.B /NAMESPACE
directive, the
.B flac
encoding scheme, and the
.I .z
representation suffix.

Version 9 of the Standards (April 2012) added the
.B MPLEX
and
.B WINDOW
field types, the
.B /ALIAS
and
.B /HIDDEN
directives, the affixes to
.BR /INCLUDE ,
the 
.BR sie ", " zzip ,
and
.B zzslim
encoding schemes, along with the optional
.I enc_datum
token to
.BR /ENCODING .
It permitted specification of integer literals in octal and hexadecimal.
Finally, it deprecated the type aliases
.I FLOAT
and
.IR DOUBLE .

Version 8 of the Standards (November 2010) added the
.BR DIVIDE ", " RECIP ,
and
.B CARRAY
field types, made the forward slash on reserved words mandatory, and prohibited
using the single-character type aliases in the specification of
.B RAW
fields.  It also introduced the optional second
.RI ( arm )
token to the
.B /ENDIAN
directive.

Version 7 of the Standards (October 2009) added the
.B SBIT
and
.B POLYNOM
field types, and the directive-less method of specifying metafields.  It also
introduced the data types
.I COMPLEX128
and
.IR COMPLEX64 ,
along with the notion of
.IR representations ,
and the
.B lzma
encoding scheme.  Finally, it made the number of fields parameter for
.B LINCOM
optional.

Version 6 of the Standards (October 2008) added the
.BR /ENCODING ", " /META ", " /PROTECT ", and " /REFERENCE
directives, and the
.B CONST
and
.B STRING
field types.  It permitted whitespace in tokens and introduced the character
escape sequences. It allowed
.B CONST
fields to be used as parameters in field specification lines.  It also removed
.I FILEFRAM
as an alias for
.IR INDEX ,
and prohibited
.BR .
but allowed
.B #
and
.B \e
in field names.

Version 5 of the Standards (August 2008) added
.B VERSION
and
.BR ENDIAN ,
slash demarcation of reserved words, and removed the restriction on field
name length.  It introduced the data types
.IR INT8 ", " INT64 ,
and
.IR UINT64 ,
the new-style type specifiers, and increased the range of the
.B BIT
field type from 32 to 64 bits.  It also prohibited the characters
.B &;<>\e|
in field names.

Version 4 of the Standards (October 2006) added the
.B PHASE
field type.

Version 3 of the Standards (January 2006) added
.B INCLUDE 
and increased the allowed length of a field name from 16 to 50 characters.

Version 2 of the Standards (September 2005) added the
.B MULTIPLY
field type.

Version 1 of the Standards (November 2004) added
.B FRAMEOFFSET
and the optional fourth argument to the
.B BIT
field type.

Version 0 of the Standards (before March 2003) refers to the dirfile standards
supported by the
.BR getdata (3)
library originally introduced into the
.BR kst (1)
sources, which contained support for all other features covered by this
document.

.SH AUTHORS

The dirfile specification was developed by C. B. Netterfield
.nh
<netterfield@astro.utoronto.ca>.
.hy 1

Since Standards Version 3, the dirfile specification has been maintained by
D. V. Wiebe
.nh
<getdata@ketiltrout.net>.
.hy 1

.SH SEE ALSO
.BR dirfile (5),
.BR dirfile\-encoding (5)