.\" dirfile-format.5. The dirfile format specification man page.
.\"
.\" Copyright (C) 2005, 2006, 2008, 2009, 2010, 2012, 2013, 2016, 2017
.\" D. V. Wiebe
.\"
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.\"
.\" This file is part of the GetData project.
.\"
.\" Permission is granted to copy, distribute and/or modify this document
.\" under the terms of the GNU Free Documentation License, Version 1.2 or
.\" any later version published by the Free Software Foundation; with no
.\" Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
.\" Texts. A copy of the license is included in the `COPYING.DOC' file
.\" as part of this distribution.
.\"
.hw name-space
.TH dirfile\-format 5 "19 January 2017" "Standards Version 10" "DATA FORMATS"
.SH NAME
dirfile\-format \(em the dirfile database format specification file
.SH DESCRIPTION
The
.I dirfile format specification
fully specifies the raw and derived time streams and auxiliary information
for a
.BR dirfile (5)
database.
The format specification is contained in one or more case-sensitive text files
located in the dirfile tree. Each file is known as a
.IR fragment .
The primary fragment is the file called
.B format
located in the base dirfile directory. This file may contain only part of
the format specification, and may reference other fragments (using the
.B /INCLUDE
directive) containing further format specification. This inclusion mechanism
may be nested arbitrarily deep.
The explicit text encoding of these files is not specified by these Standards,
but it must be 7\-bit ASCII compatible. Examples of acceptable character
encodings include all the ISO\~8859 character sets
.RI ( i.e.
Latin\-1 through Latin\-10, among others), as well as the UTF\-8 encoding of
Unicode and UCS.
This document primarily describes the latest version of the Standards (Version
10); differences with previous versions are noted where relevant. A complete
list of changes between versions is given in the
.B HISTORY
section below.
.SH SYNTAX
The format specification is composed of field specification lines and directive lines,
optionally separated by blank lines or lines containing only whitespace.
Lines are separated by the line-feed character (0x0A). Unless escaped (see
below), the hash mark
.RB ( # )
is the comment delimiter; the comment delimiter, and any text following it to
the end of the line, is ignored.
.SS Tokens
Both field specification lines and directive lines consist of several tokens
separated by whitespace. Whitespace consists of one or more whitespace
characters. These are: space (0x20), horizontal tab (0x09), vertical tab
(0x0B), form-feed (0x0C), and carriage return (0x0D). The first token of a
directive line is always a
.IR "reserved word" .
The first token of a field specification line is never a reserved word. Any
amount of whitespace may precede the first token on a line.
Since tokens are separated by whitespace, to include a whitespace character in
a token, it must either be escaped by preceding it with a backslash character
.RB ( \e ),
or be replaced by a
.I character escape sequence
(see below), or else the token must be enclosed in quotation marks
.RB ( """" ).
The quotation marks themselves are stripped from the token. The
.I null-token
(that is, the token consisting of zero characters) may be specified by a pair
of quotation marks with nothing between them
.RB ( """""" ).
To include a literal quotation mark in a token, it must be escaped
.RB ( \e" ).
Similarly, a hash mark may be included in a token by including it in a quoted
token or else by escaping it
.RB ( \e# ),
otherwise the hash mark is understood as the comment delimiter.
It is a syntax error to have a line which contains unmatched quotation marks, or
in which the last character is the backslash character.
Several characters when escaped by a preceding backslash character are
interpreted as special characters in tokens. The character escape sequences
are:
.RS
.TP
.B \ea
an alert (bell) character (ASCII 0x07 / U+0007)
.TP
.B \eb
a backspace character (ASCII 0x08 / U+0008)
.TP
.B \ee
an escape character (ASCII 0x1B / U+001B)
.TP
.B \ef
a form-feed character (ASCII 0x0C / U+000C)
.TP
.B \en
a line-feed character (ASCII 0x0A / U+000A)
.TP
.B \er
a carriage return character (ASCII 0x0D / U+000D)
.TP
.B \et
a horizontal tab character (ASCII 0x09 / U+0009)
.TP
.B \ev
a vertical tab character (ASCII 0x0B / U+000B)
.TP
.B \e\e
a backslash character (ASCII 0x5C / U+005C)
.TP
.BI \e ooo
the single byte given by the octal number
.I ooo
(1 to 3 octal digits).
.TP
.BI \ex hh
the single byte given by the hexadecimal number
.I hh
(1 or 2 hexadecimal digits).
.TP
.BI \eu hhhhhhh
the UTF-8 byte sequence encoding the Unicode code point given by the hexadecimal
number
.I hhhhhhh
(1 to 7 hexadecimal digits).
.RE
Any other character which is escaped is interpreted as the character itself.
.RI ( i.e.
.B \ec
is interpreted as
.BR c ;
also, as pointed out above,
.B \e"
and
.B \e#
are interpreted as simply
.B """"
and
.BR # ,
without their special meanings).
No token may contain the NULL character (ASCII 0x00 / U+0000). Furthermore,
although support is present to create UTF-8 byte sequences, tokens are not
required to be valid UTF-8 sequences. Any byte sequence not containing the NULL
character forms a valid token. However, there may be further restrictions on
allowed characters for a token in a particular situation (for example, when
used as a field name).
Standards Version 5 and earlier do not recognise the character escape sequences,
nor allow quoting of tokens. As a result, they prohibit both whitespace and the
comment delimiter from being used in tokens.
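.PP
As an illustrative sketch of these rules (the field names are hypothetical),
in Standards Version 6 or later each of the following lines, taken on its own,
specifies a field whose name is the two words "field name" separated by a
single space:
.IP
"field name" \fBCONST FLOAT64\fR 0.0
.br
field\e name \fBCONST FLOAT64\fR 0.0
.br
field\ex20name \fBCONST FLOAT64\fR 0.0
.PP
Likewise, in the line
.IP
a\e#b \fBCONST FLOAT64\fR 0.0 # a comment
.PP
the escaped hash mark is part of the field name
.IR a#b ,
while the unescaped hash mark starts a comment.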
.SH DIRECTIVES
There are eleven
.IR directives ,
each specified by a different
.IR "reserved word" ,
which cannot be used as field names in the dirfile. As of Standards Version 8,
all reserved words start with an initial forward slash
.RB ( / ),
to distinguish them from field names. Standards Versions 5, 6, and 7 permitted
the omission of the initial forward slash, while in Standards Version 4 and
earlier, reserved words may not have an initial forward slash. Like the rest of
the format specification, directives are case sensitive.
A number of the directives have
.IR "fragment scope" .
A directive with fragment scope only applies to the fragment in which it is
present, plus any sub-fragments indicated by the
.B /INCLUDE
directive, but only if those sub-fragments don't have their own corresponding
directive. Directives which have fragment scope are:
.BR /ENCODING ", " /ENDIAN ", " /FRAMEOFFSET ", and " /PROTECT .
Because of these scoping rules, different portions of the dirfile may have
different encodings, endiannesses, frame offsets, or protection levels.
If a directive with fragment scope appears more than once in a fragment, only
the last such directive is honoured, with the exception that the effect of
a directive is not propagated to sub-fragments if the directive line
appears after the sub-fragment is included. The scoping rules of the remaining
directives are discussed below.
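.PP
As a minimal sketch of these scoping rules (the fragment names are
hypothetical), consider a parent fragment containing:
.IP
\fB/ENDIAN\fR little
.br
\fB/INCLUDE\fR first.format
.br
\fB/ENDIAN\fR big
.br
\fB/INCLUDE\fR second.format
.PP
Only the last /ENDIAN directive is honoured for the parent fragment itself,
which is therefore big-endian. If neither sub-fragment contains an /ENDIAN
directive of its own,
.I first.format
would be expected to be treated as little-endian, since the second /ENDIAN
directive appears after its inclusion, and
.I second.format
as big-endian.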
.TP
.B /ALIAS
The /ALIAS directive defines an alternate name for a field defined elsewhere in
the format specification (called the "target"). Aliases may not be used as the
parent field in a
.B /META
directive, but are in most other ways indistinguishable from the target's
original, canonical name. Aliases may be chained (that is, the target name
appearing in an /ALIAS directive may itself be an alias). In this case, the new
alias is another name for the target's own target. Just as there is no
requirement that the input fields of a derived field exist,
it is not an error for the target of an alias to not exist. Syntax is:
.RS
.IP
.B /ALIAS
.I <name> <target>
.RE
.IP
A metafield alias may be defined using the
.IR <parent-field> / <alias-name>
syntax for
.I name
in the /ALIAS directive. No restriction is placed on
.IR target ;
specifically, a metafield alias may target a top-level field, or a metafield
with a different parent; conversely, a top-level alias may target a
metafield.
.IP
A metafield alias may never appear as the parent part of a metafield field code,
even if it refers to a top-level field. That is, given the valid format:
.RS
.IP
aaaa \fBRAW UINT8\fR 1
.br
aaaa/bbbb \fBCONST FLOAT64\fR 0.0
.br
cccc \fBRAW UINT8\fR 1
.br
\fB/ALIAS\fR cccc/dddd aaaa
.RE
.IP
the metafield
.I aaaa/bbbb
may not be referred to as
.IR cccc/dddd/bbbb ,
even though
.I cccc/dddd
is a valid field code referring to
.IR aaaa .
.IP
This is not true of top-level aliases: if
.I eeee
is an alias of
.IR ffff ,
then
.IR ffff/gggg ,
a metafield of
.IR ffff ,
may be referred to as
.I eeee/gggg
as well.
.IP
The /ALIAS directive has no scope: it is processed immediately. It appeared in
Standards Version 9.
.TP
.B /ENCODING
The /ENCODING directive specifies the encoding scheme used to encode binary
files in the dirfile. The encoding scheme may be one of the predefined names
listed below, which are described in more detail in
.BR dirfile\-encoding (5),
or any other site-specific encoding scheme. The predefined scheme names are:
.RS
.TP
.B none
The dirfile is unencoded.
.TP
.B bzip2
The dirfile is compressed using the bzip2 compression scheme.
.TP
.B flac
The dirfile is compressed using the flac compression scheme.
.TP
.B gzip
The dirfile is compressed using the gzip compression scheme.
.TP
.B lzma
The dirfile is compressed using the LZMA compression scheme.
.TP
.B slim
The dirfile is compressed using the slim compression scheme.
.TP
.B sie
The dirfile is sample-index encoded (a variant of run-length encoding).
.TP
.B text
The dirfile is text encoded.
.TP
.B zzip
The dirfile is compressed and encapsulated using the zzip compression scheme.
.TP
.B zzslim
The dirfile is compressed and encapsulated using a combination of the zzip
and slim compression schemes.
.PP
Implementations should fail gracefully when encountering an unknown encoding
scheme. If no encoding scheme is specified, behaviour is implementation
dependent. Syntax is:
.IP
.B /ENCODING \fI<scheme> \fR[\fI<enc-datum>\fR]
.PP
The
.I enc-datum
token provides additional data for certain encoding schemes; see
.BR dirfile\-encoding (5)
for details. The form of enc-datum is not specified.
.PP
The /ENCODING directive has
.IR "fragment scope" .
It appeared in Standards Version 6. The predefined schemes
.nh
.BR sie ", " zzip ", and " zzslim ,
.hy
and the optional
.I enc-datum
token, appeared in Standards Version 9; the predefined scheme
.B lzma
appeared in Standards Version 7; all other predefined schemes appeared in
Standards Version 6.
.RE
.TP
.B /ENDIAN
The /ENDIAN directive specifies the endianness of the raw data in the database.
The assumed endianness of raw data in dirfiles which omit this directive is
implementation dependent. Syntax
is:
.RS
.IP
.B /ENDIAN
.RB "( " big " | " little " ) [ " arm " ]"
.PP
where the "arm" token should be included if double precision floating point data
are stored in the ARM middle-endian format. The /ENDIAN directive has
.IR "fragment scope" .
It appeared in Standards Version 5. The optional
.B arm
token appeared in Standards Version 8.
.RE
.TP
.B /FRAMEOFFSET
The /FRAMEOFFSET directive specifies the frame number of the first frame for
which data exists in binary files associated with
.B RAW
fields. Syntax is:
.RS
.IP
.BI /FRAMEOFFSET\~ <integer>
.PP
The /FRAMEOFFSET directive has
.IR "fragment scope" .
It appeared in Standards Version 1.
.RE
.TP
.B /HIDDEN
The /HIDDEN directive indicates that the specified field name is
.IR hidden .
The difference (if any) between a field name which is
.I hidden
and one that is not is implementation dependent. Hiddenness is not inherited
by metafields of the specified field. Hiddenness applies to the name, not the
field itself; it does not hide all aliases of the field-name, and if field-name
is an alias, the alias is hidden, not its target. Syntax is:
.RS
.IP
.BI /HIDDEN\~ <field-name>
.PP
A /HIDDEN directive must appear after the specification of
.I field-name
(which occurs either in a field specification line, or an
.B /ALIAS
directive, or a
.B /META
directive) in the same fragment.
.PP
The /HIDDEN directive has no scope: it is processed immediately. It appeared in
Standards Version 9.
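.PP
As a minimal sketch (the field names here are hypothetical), the fragment
.IP
aaaa \fBRAW UINT8\fR 1
.br
\fB/ALIAS\fR bbbb aaaa
.br
\fB/HIDDEN\fR bbbb
.PP
hides only the alias
.IR bbbb ;
the canonical name
.I aaaa
is not hidden.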
.RE
.TP
.B /INCLUDE
The /INCLUDE directive specifies another file (called a
.IR "fragment" )
to parse for additional format specification for the dirfile. The inclusion is
processed immediately, before the fragment containing the /INCLUDE directive
(the
.IR "parent fragment" )
is parsed further. RAW fields specified in the included fragment are located in
the directory containing the fragment file, and not in the directory containing
the parent fragment, and the binary file encoding may be different for each
fragment. The fragment may be specified either with an absolute path, or else a
path relative to the directory containing the parent fragment.
.IP
The /INCLUDE directive may optionally specify a
.I prefix
and/or
.I suffix
to apply to field names defined in the included fragment. If present, affixes
are applied to all field-names (including aliases) defined in the included
fragment and any fragments it further includes. Affixes nest, with the affixes
of the deepest inclusion innermost. Affixes are not applied to the names of
binary files associated with
.B RAW
fields. Syntax is:
.RS
.IP
\fB/INCLUDE \fI<file> \fR[\fI<namespace>\fB.\fR][\fI<prefix>\fR]
[\fI<suffix>\fR]
.PP
To specify only
.IR suffix ,
the null-token
.RB ( """""" )
may be used as
.IR prefix .
.PP
A
.I namespace
may also be specified in an /INCLUDE directive by prepending it to
.IR prefix .
The namespace and prefix are separated by a dot
.RB ( . ).
The dot is required whenever a namespace is specified: if the prefix is empty,
the third token should be just the namespace followed by a trailing dot. If a
namespace is specified, that namespace, relative to the including fragment's
root namespace, becomes the root namespace of the included fragment. If no
namespace is specified in the /INCLUDE directive, then the current namespace
(specified by a previous /NAMESPACE directive) is used as the root namespace
of the included fragment. That is, if the current namespace is
.IR current_space ,
then the statement:
.IP
.B /INCLUDE \fIfile newspace\fB.
.PP
is equivalent to
.IP
.B /NAMESPACE \fInewspace
.br
.B /INCLUDE \fIfile
.br
.B /NAMESPACE \fIcurrent_space
.PP
As a result, if no namespace is provided, and there
has been no previous /NAMESPACE directive, the included fragment will have the
same root namespace as the including fragment.
The /INCLUDE directive has no scope: it is processed immediately. It appeared
in Standards Version 3. The optional
.I prefix
and
.I suffix
appeared in Standards Version 9. The optional
.I namespace
appeared in Standards Version 10.
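.PP
As an illustrative sketch (the path, affixes, and namespace here are
hypothetical), each of the following lines, taken on its own, includes the
fragment
.IR sub/format :
.IP
\fB/INCLUDE\fR sub/format A_ _Z
.br
\fB/INCLUDE\fR sub/format "" _Z
.br
\fB/INCLUDE\fR sub/format ns.
.br
\fB/INCLUDE\fR sub/format ns.A_ _Z
.PP
Assuming the root and current namespaces of the including fragment are both
the null namespace, a field named
.I data
in the included fragment would be expected to appear in the dirfile as
.IR A_data_Z ,
.IR data_Z ,
.IR ns.data ,
and
.IR ns.A_data_Z ,
respectively.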
.RE
.TP
.B /META
The /META directive specifies a metafield attached to a particular parent
field. The field metadata may be of any allowed type except
.BR RAW .
Metafields are retrieved in exactly the same way as regular field data, but the
.I field code
specified consists of the parent and metafield names joined with a forward
slash:
.RS
.IP
.IB <parent-field> / <meta-field>
.PP
META fields may not be specified before their parent field has been. Syntax is:
.IP
.B /META
.I <parent-field>
{field specification line}
.PP
The
.I <parent-field>
code may not be an alias. As an illustration of this concept,
.IP
.B /META
pfield meta
.B CONST FLOAT64
3.291882
.PP
provides a scalar metadatum called
.I meta
with value 3.291882 attached to the field
.IR pfield .
This particular metafield may be referred to by the
.I field code
"pfield/meta". Note that different parent fields may have metafields with
the same name, since all references to metafields must include the parent
field name. Metafields may not themselves have further sub-metafields.
.PP
As an alternative to the /META directive, starting with Standards Version 7,
a metafield may be specified by a standard field specification line, using
.IP
.IB <parent-field> / <meta-field>
.PP
as the field name. That is, the above example metafield could have also been
specified as:
.IP
pfield/meta
.B CONST FLOAT64
3.291882
.PP
The /META directive has no scope: it is processed immediately. It appeared in
Standards Version 6.
.RE
.TP
.B /NAMESPACE
The /NAMESPACE directive changes the
.I current namespace
for subsequent field specification lines.
Syntax is:
.RS
.IP
.BI /NAMESPACE\~ <subspace>
.PP
The
.I subspace
specified is relative to the current fragment's root namespace. If
.I subspace
is the null-token
.RB ( """""" )
the current namespace will be set back to the root namespace. Otherwise, the
current namespace will be changed to the concatenation of the root namespace
with subspace, with the two parts separated by a dot:
.IP
.IB rootspace . subspace
.PP
If
.I rootspace
is empty, the intervening dot is omitted, and the current namespace is simply
.IR subspace .
.PP
By default, all field codes, both field names for newly specified fields, and
field codes used as inputs to fields or targets for aliases, are placed in the
current namespace, unless they start with an initial dot, in which case the
current namespace is ignored, and they're placed instead in the
fragment's root namespace. See the
.B Namespaces
section for further details.
.PP
The /NAMESPACE directive has no scope: it is processed immediately. For the
effects of changing the current namespace on included fragments, see the
/INCLUDE directive above. The effects of a /NAMESPACE directive never propagate
upwards to parent fragments. It appeared in Standards Version 10.
.RE
.TP
.B /PROTECT
The /PROTECT directive specifies the advisory protection level of the current
fragment and of the
.B RAW
fields defined therein. The protection level indicates whether writing to the
fragment, or to the binary data on disk, is permitted. Syntax is:
.RS
.IP
.BI /PROTECT\~ <level>
.PP
Four advisory protection levels are defined:
.TP
.I none
No protection at all: data and metadata may be freely changed. This is the
default, if no /PROTECT directive is present.
.TP
.I format
The dirfile metadata is protected from change, but
.B RAW
data on disk may be modified.
.TP
.I data
The
.B RAW
data on disk is protected from change, but metadata may be modified.
.TP
.I all
Both metadata and data on disk are protected from change.
.PP
The /PROTECT directive has
.IR "fragment scope" .
It appeared in Standards Version 6.
.RE
.TP
.B /REFERENCE
The /REFERENCE directive specifies the name of the field to use as the dirfile's
reference field (see
.BR dirfile (5)).
If no /REFERENCE directive is specified, the first
.B RAW
field encountered is used as the reference field. The /REFERENCE directive must
specify a
.B RAW
field. Syntax is:
.RS
.IP
.BI /REFERENCE\~ <field-code>
.PP
The /REFERENCE directive has
.IR "global scope" :
if multiple /REFERENCE directives appear in the dirfile metadata, only the last
such is honoured. It appeared in Standards Version 6.
.RE
.TP
.B /VERSION
The /VERSION directive specifies the particular version of the Dirfile Standards
to which the dirfile format specification conforms. This directive should
occur before any version dependent syntax is encountered. As of Standards
Version 6, no such syntax exists, and this directive is provided primarily to
ease forward compatibility. Syntax is:
.RS
.IP
.BI /VERSION\~ <integer>
.PP
The /VERSION directive has
.IR "immediate scope" :
its effect is immediate, and it applies only to metadata below it, including
and propagating downwards to sub-fragments after the directive.
.PP
In Standards Version 8 and earlier, its effect also propagates upwards back to
the parent fragment, and affects subsequent metadata. Starting with Standards
Version 9, this no longer happens. As a result, a /VERSION directive which
indicates a version of 9 or later never propagates upwards; additionally,
/VERSION directives found in subfragments included in a Version 9 or later
fragment aren't propagated upwards into that fragment, regardless of the
Version of the subfragments. The /VERSION directive appeared in Standards
Version 5.
.RE
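.PP
As a minimal sketch combining several of these directives (the field name is
hypothetical, and the values are chosen only for illustration), a primary
format file might begin:
.IP
\fB/VERSION\fR 10
.br
\fB/ENDIAN\fR little
.br
\fB/ENCODING\fR none
.br
\fB/PROTECT\fR none
.br
\fB/FRAMEOFFSET\fR 0
.br
aaaa \fBRAW UINT8\fR 1
.br
\fB/REFERENCE\fR aaaa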
.SH FIELD SPECIFICATION LINES
Any line which does not start with a
.I reserved word
is assumed to be a field specification line. A field specification line
consists of at least two tokens. The first token is the
.IR "field name" .
The second token is the
.IR "field type" .
Subsequent tokens are field parameters. The meaning and number of these
parameters depend on the field type specified.
.SS Field Names
The first token in a field specification line is the
.IR "field name" .
The field name consists of one or more
characters, excluding both ASCII control characters (the bytes 0x01 through
0x1F), and the characters
.IP
.B &\t/\t;\t<\t>\t|\t.
.PP
which are reserved (but see below for the use of
.B /
to specify metafields).
The dot
.RB ( . )
is allowed in Standards Version 5 and earlier. The ampersand, semicolon,
less-than sign, greater-than sign, and vertical line
.RB ( "& ; < > |" )
are allowed in Standards Version 4 and earlier. Furthermore, due to the lack
of an escape or quoting mechanism (see
.B Tokens
above), Standards Version 5 and earlier also prohibit whitespace and the
comment delimiter
.RB ( # )
in field names.
.PP
The field name may not be
.IR INDEX ,
which is a special, implicit field which contains the integer frame index.
Standards Version 5 and earlier also prohibit
.IR FILEFRAM ,
which was an alias for
.IR INDEX .
Field names are case sensitive. Standards Versions 3 and 4 restrict field names
to 50 characters. Standards Version 2 and earlier restrict field names to 16
characters. Additionally, the filesystem may put restrictions on the length
and acceptable characters of a
.B RAW
field name, regardless of Standards Version.
Starting in Standards Version 7, if the field name beginning a field
specification line contains exactly one forward slash character
.RB ( / ),
the line is assumed to specify a metafield. See the
.B /META
directive above for further details. A field name may not contain more than one
forward slash.
Starting in Standards Version 10, any field name may be preceded by a
.IR "namespace tag" .
The namespace tag and the field name are separated by a dot
.RB ( . ).
See the
.B Namespaces
section, following, for details.
.SS Namespaces
Beginning with Standards Version 10, every field in a Dirfile is contained in a
namespace. Every namespace is identified by a
.I namespace tag
which consists of the same restricted set of characters used for field names.
Namespaces nest arbitrarily deep. Subnamespaces are identified by concatenating
all namespace tags, separating tags by dots
.RB ( . ),
with the outermost namespace leftmost:
.RS
.IP
.IB topspace . subspace . subsubspace
.RE
.PP
Each fragment has an immutable
.IR "root namespace" .
The root namespace of the primary format file is the null namespace, identified
by the null-token
.RB ( """""" ).
The root namespace of other fragments is specified when they are introduced
(see the /INCLUDE directive). Each fragment also has a
.I current namespace
which may be changed as often as needed using the /NAMESPACE directive, and
defaults to the root namespace. The current namespace is always either the root
namespace or else a subspace under the root namespace.
If a field name or field code starts with a leading dot, then that name or code
is taken to be relative to the fragment's root space. If it does not start with
a dot, it is taken to be relative to the current namespace.
For example, if both the root namespace and current namespace of a fragment
start off as
.IR rootspace ,
then:
.IP
.IB aaaa\~ "RAW UINT8 1"
.br
.BI . bbbb\~ "RAW UINT8 1"
.br
.IB cccc . dddd\~ "RAW UINT8 1"
.br
.BI . eeee . ffff\~ "RAW UINT8 1"
.br
.BI /NAMESPACE\~ newspace
.br
.IB gggg\~ "RAW UINT8 1"
.br
.BI . hhhh\~ "RAW UINT8 1"
.br
.IB iiii . jjjj\~ "RAW UINT8 1"
.br
.BI . kkkk . llll\~ "RAW UINT8 1"
.PP
specifies, respectively, the fields:
.IP
.IB rootspace . aaaa\fR,
.br
.IB rootspace . bbbb\fR,
.br
.IB rootspace . cccc . dddd\fR,
.br
.IB rootspace . eeee . ffff\fR,
.br
.IB rootspace . newspace . gggg\fR,
.br
.IB rootspace . hhhh\fR,
.br
.IB rootspace . newspace . iiii . jjjj\fR,
and
.br
.IB rootspace . kkkk . llll\fR.
.PP
Note that a field may specify deeper subspaces under either the root namespace
or the current namespace (meaning it is never necessary to use the /NAMESPACE
directive). Note also that there is no way for metadata in a given fragment to
refer to fields outside the fragment's root space.
There is one exception to this namespace scoping rule: the implicit
.I INDEX
vector is always in the null (top-level) namespace, and namespace tags specified
with it, either explicitly or implicitly, even a fragment root namespace, are
ignored. So, in a fragment with root namespace
.IR rootspace ,
and current namespace
.IR rootspace\fB.\fIsubspace ,
.IP
.IR INDEX ,
.br
.BI . INDEX\fR,
.br
.IB namespace . INDEX\fR,
and
.br
.BI . namespace . INDEX\fR,
.PP
all refer to the same
.I INDEX
field.
.SS Field Types
There are eighteen field types. Of these, fourteen are of vector type
.RB ( BIT ", " DIVIDE ", " INDIR ", " LINCOM ", " LINTERP ", " MPLEX ,
.BR MULTIPLY ", " PHASE ", " POLYNOM ", " RAW ", " RECIP ", " SBIT ,
.BR SINDIR ", and " WINDOW )
and four are of scalar type
.RB ( CARRAY ", " CONST ", " SARRAY ", and " STRING ).
The thirteen vector field types other than
.B RAW
fields are also called
.IR "derived fields" ,
since they derive their value from one or more input vector fields. Any other
vector field may be used as an input vector, including the implicit
.I INDEX
field, but excluding
.B SINDIR
string vectors.
.PP
Five of these derived fields
.RB ( DIVIDE ", " LINCOM ", " MPLEX ", " MULTIPLY ", and " WINDOW )
have more than one vector input field. In situations where these input fields
have differing sample rates, the sample rate of the derived field is the same
as the sample rate of the first (left-most) input field specified. Furthermore,
the input fields are synchronised by aligning them on frame boundaries, assuming
equally-spaced sampling throughout a frame, and using the last sample of each
input field which did not occur after the sample of the derived field being
computed. That is, if the first and second input fields have sample rates
.I s1
and
.IR s2 ,
the derived field also has sample rate
.I s1
and, for every sample of the derived field,
.IR n ,
the
.IR n 'th
sample of the first field is used (since they have the same sample rate by
definition), and the sample number used of the second field,
.IR m ,
is computed as:
.IP
\fIm\fR = \fBfloor\fR((\fIn\fR * \fIs2\fR) / \fIs1\fR).
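.PP
As an illustration with assumed sample rates, take
.I s1
= 20 and
.I s2
= 5. Sample
.I n
= 13 of the derived field then uses, from the second field, sample:
.IP
\fIm\fR = \fBfloor\fR((13 * 5) / 20) = \fBfloor\fR(3.25) = 3.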
.PP
Starting in Standards Version 6, certain scalar field parameters in the field
specifications may be specified using
.B CONST
or
.B CARRAY
fields, instead of literal values. A list of parameters for which this is
allowed is given below in the
.B Field Parameters
section.
.PP
The possible field types are:
.TP
.B BIT
The BIT vector field type extracts one or more bits out of an input vector
field as an unsigned number. Syntax is:
.RS
.IP
.I <fieldname>
.B BIT
.I <input> <first-bit> \fR[\fI<num-bits>\fR]
.PP
which specifies
.I fieldname
to be
.I num-bits
bits extracted from the input vector field
.I input
starting with bit number
.I first-bit
(counting from the least-significant bit, which is numbered zero), after
.I input
has been converted from its native type to an (endianness corrected) unsigned
64-bit integer. If
.I num-bits
is omitted, it is assumed to be one.
The extracted bits are interpreted as an unsigned integer; the
.B SBIT
field type is a signed version of this field type. The optional
.I num-bits
parameter appeared in Standards Version 1.
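.PP
As a minimal sketch, assuming a hypothetical 16-bit
.B RAW
field named
.I status
(all field names here are illustrative only):
.IP
.nf
status RAW UINT16 1
fault  BIT status 3
mode   BIT status 4 2
.fi
.PP
Here
.I fault
is bit 3 of
.IR status ,
taking the value zero or one, and
.I mode
is the two-bit unsigned integer formed from bits 4 and 5, taking values from
zero to three.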
.RE
.TP
.B CARRAY
The CARRAY scalar field type is a list of constants fully specified in the
format specification metadata. Syntax is:
.RS
.IP
.I <fieldname>
.B CARRAY
.I <type> <value0> <value1> <value2> \fR...
.PP
where
.I type
may be any supported native data type (see the description of the
.B RAW
field type below), and
.IR value0 ", " value1 ,
etc. are the values of successive elements in the scalar list interpreted as
indicated by
.IR type .
No limit is placed on the number of elements in a
.BR CARRAY .
(Note: despite being multivalued, this is not considered a vector field since
the elements of the
.B CARRAY
are not indexed by frames.) CARRAY appeared in Standards Version 8.
.RE
.TP
.B CONST
The CONST scalar field type is a constant fully specified in the format
specification metadata. Syntax is:
.RS
.IP
.I <fieldname>
.B CONST
.I <type> <value>
.PP
where
.I type
may be any supported native data type (see the description of the
.B RAW
field type below), and
.I value
is the numerical value of the constant interpreted as indicated by
.IR type .
CONST appeared in Standards Version 6.
.RE
.TP
.B DIVIDE
The DIVIDE vector field type is the quotient of two vector fields. Syntax is:
.RS
.IP
.I <fieldname>
.B DIVIDE
.I <field1> <field2>
.PP
The derived field is computed as:
.IP
fieldname = field1 / field2.
.PP
DIVIDE appeared in Standards Version 8.
.RE
.TP
.B INDIR
The INDIR vector field type performs an indirect translation of a CARRAY scalar
field to a derived vector field based on a vector index field. Syntax is:
.RS
.IP
.I <fieldname>
.B INDIR
.I <index> <array>
.PP
where
.I index
is the vector field, which is converted to an integer type, if necessary, and
.I array
is the CARRAY field. The
.IR n th
sample of the INDIR field is the value of the
.IR m th
element of
.IR array
(counting from zero), where
.I m
is the value of the
.IR n th
sample of
.IR index .
When
.I index
is not a valid element number of
.IR array ,
the corresponding value of the INDIR is implementation dependent. INDIR
appeared in Standards Version 10.
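.PP
A short sketch of the lookup, using hypothetical field names:
.IP
.nf
gain_table CARRAY FLOAT64 1.0 2.5 10.0
gain_sel   RAW UINT8 1
gain       INDIR gain_sel gain_table
.fi
.PP
With these definitions, a sample of
.I gain_sel
equal to 1 yields 2.5 (element one of
.IR gain_table )
for the corresponding sample of
.IR gain .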
.RE
.TP
.B LINCOM
The LINCOM vector field type is the linear combination of one, two or three
input vector fields. Syntax is:
.RS
.IP
.I <fieldname>
.B LINCOM
.RI [ <n> "] " "<field1> <a1> <b1> " [ "<field2> <a2> <b2> " [ "<field3> <a3>"
.IR <b3> ]]
.PP
where
.IR n ,
if present, indicates the number of input vector fields (1, 2, or 3). The
derived field is computed as:
.IP
fieldname = (a1 * field1 + b1) + (a2 * field2 + b2) + (a3 * field3 + b3)
.PP
with the
.I field2
and
.I field3
terms included only if specified.
If
.I n
is not specified, the number of fields is determined by looking at the supplied
parameters. Since it is possible to create a field code which is identical to
a literal number, the third token on the line is assumed to be
.I n
if the entire token can be parsed as a literal number using the rules
outlined in
.BR strtod (3).
That is, if the field code specifying
.I field1
could be mistaken for a literal number,
.I n
must be specified to prevent ambiguity. In Standards Version 6 and earlier,
.I n
is mandatory.
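.PP
For illustration, assuming hypothetical input fields
.IR temp_raw ,
.IR x ,
and
.IR y :
.IP
.nf
temp_K LINCOM 1 temp_raw 0.01 273.15
total  LINCOM x 1 0 y 1 0
.fi
.PP
The first line gives
.I n
explicitly and computes temp_K = 0.01 * temp_raw + 273.15. The second omits
.I n
(permitted since Standards Version 7) and computes total = x + y; the number of
input fields is inferred from the six parameters supplied.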
.RE
.TP
.B LINTERP
The LINTERP vector field type specifies a table look up based on another vector
field. Syntax is:
.RS
.IP
.I <fieldname>
.B LINTERP
.I <input> <table>
.PP
where
.I input
is the input vector field for the table lookup, and
.I table
is the path to the lookup table file for the field. If this path is relative,
it is assumed to be relative to the directory containing the fragment defining
this field. The lookup table file is an ASCII text file with two whitespace
separated columns of
.I x
and
.I y
values. Values are linearly interpolated between the points specified in the
lookup table.
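.PP
As a sketch, assuming a hypothetical input field
.I pressure_raw
and a hypothetical table file
.IR calib/pressure.lut :
.IP
.nf
pressure LINTERP pressure_raw calib/pressure.lut
.fi
.PP
where the table file might contain:
.IP
.nf
0     0.0
1000  2.5
2000  10.0
.fi
.PP
With this table, an input value of 500 lies halfway between the first two
points and so interpolates to 1.25.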
.RE
.TP
.B MPLEX
The MPLEX vector field type permits the multiplexing of several low sample rate
fields into a single data field of higher sample rate. Syntax is:
.RS
.IP
.I <fieldname>
.B MPLEX
.I <input> <index> <count> \fR[\fI<period>\fR]
.PP
where
.I input
is the input vector containing the multiplexed fields,
.I index
is the vector containing the multiplex index,
.I count
is the value of the multiplex index when the computed field is stored in
.IR input ,
and
.IR period ,
if present and non-zero, is the number of samples between successive occurrences
of the value
.I count
in the index vector. A
.I period
of zero (or, equivalently, its omission) indicates that either the value
.I count
is not equally spaced in the index vector, or else that the spacing is unknown.
Both
.I count
and
.I period
are integers, and
.I period
may not be negative.
.PP
At every sample
.IR n ,
the derived field is computed as:
.IP
fieldname[n] = (index[n] == count) ? input[n] : fieldname[n - 1]
.PP
The
.I index
vector is converted to an integer type for comparison. The value of the
derived field before the first sample where
.I index
equals
.I count
is implementation dependent.
.PP
The values of
.I count
and
.I period
place no restrictions on values contained in
.IR index .
Specifically, particular values of
.I index
(including
.IR count )
need not be equally spaced (neither by
.I period
nor any other spacing);
.I index
need not ever take on the value
.I count
(in which case the value of the entirety of the derived field is
implementation dependent). Different MPLEX field definitions which use the
same index vector may specify different
.IR period s.
MPLEX appeared in Standards Version 9.
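.PP
A minimal sketch, with hypothetical field names and sample rates:
.IP
.nf
mux_data  RAW FLOAT32 20
mux_index RAW UINT8 20
heater    MPLEX mux_data mux_index 3 8
.fi
.PP
Here
.I heater
takes the value of
.I mux_data
at each sample where
.I mux_index
equals 3, and otherwise repeats its previous value. The
.I period
of 8 indicates that the value 3 is expected to recur in
.I mux_index
every eight samples.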
.RE
.TP
.B MULTIPLY
The MULTIPLY vector field type is the product of two vector fields. Syntax is:
.RS
.IP
.I <fieldname>
.B MULTIPLY
.I <field1> <field2>
.PP
The derived field is computed as:
.IP
fieldname = field1 * field2.
.PP
MULTIPLY appeared in Standards Version 2.
.RE
.TP
.B PHASE
The PHASE vector field type shifts an input vector field by the specified number
of samples. Syntax is:
.RS
.IP
.I <fieldname>
.B PHASE
.I <input> <shift>
.PP
which specifies
.I fieldname
to be the input vector field,
.IR input ,
shifted by
.I shift
samples. A positive
.I shift
indicates a forward shift, towards the end-of-field. Results of shifting past
the beginning- or end-of-field are implementation dependent. PHASE appeared in
Standards Version 4.
.RE
.TP
.B POLYNOM
The POLYNOM vector field type specifies a polynomial function of a single input
vector field. Syntax is:
.RS
.IP
.I <fieldname>
.B POLYNOM
.I <input> <a0> <a1>
.RI [ <a2> " [" <a3> " [" <a4> " [" <a5> ]]]]
.PP
where
.I <input>
is the input field code, and the order of the computed polynomial is determined
by how many co-efficients are present in the specification. The derived field
is computed as:
.IP
fieldname = a0 + a1 * input + a2 * input**2 + a3 * input**3 + a4 * input**4
+ a5 * input**5
.PP
where
.I **
is the element-wise exponentiation operator, and the higher order terms are
computed only if the corresponding co-efficients
.RI a i
are specified. POLYNOM appeared in Standards Version 7.
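.PP
For example, assuming a hypothetical input field
.I adc
and illustrative coefficients:
.IP
.nf
volts POLYNOM adc 0.0 1.5e-3 2.0e-9
.fi
.PP
Three coefficients are given, so this specifies the second-order polynomial
volts = 0.0 + 1.5e-3 * adc + 2.0e-9 * adc**2.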
.RE
.TP
.B RAW
The RAW vector field type specifies raw time streams on disk. In this case, the
field name should correspond to the name of the file containing the time stream.
Syntax is:
.RS
.IP
.I <fieldname>
.B RAW
.I <type> <sample-rate>
.PP
where
.I sample-rate
is the number of samples per dirfile frame for the time stream and
.I type
is a token specifying the native data type:
.RS
.TP
.I UINT8
unsigned 8-bit integer
.TP
.I INT8
two's complement signed 8-bit integer
.TP
.I UINT16
unsigned 16-bit integer
.TP
.I INT16
two's complement signed 16-bit integer
.TP
.I UINT32
unsigned 32-bit integer
.TP
.I INT32
two's complement signed 32-bit integer
.TP
.I UINT64
unsigned 64-bit integer
.TP
.I INT64
two's complement signed 64-bit integer
.TP
.I FLOAT32
IEEE-754 standard 32-bit single precision floating point number
.TP
.I FLOAT64
IEEE-754 standard 64-bit double precision floating point number
.TP
.I COMPLEX64
a 64-bit complex number consisting of two IEEE-754 standard 32-bit single
precision floating point numbers representing the real and imaginary parts of
the complex number (Standards Version 7 and later)
.TP
.I COMPLEX128
a 128-bit complex number consisting of two IEEE-754 standard 64-bit double
precision floating point numbers representing the real and imaginary parts of
the complex number (Standards Version 7 and later).
.RE
For more information on the storage of complex valued data, see
.BR dirfile (5).
Two additional type names exist:
.I FLOAT
is equivalent to
.IR FLOAT32 ,
and
.I DOUBLE
is equivalent to
.IR FLOAT64 .
Standards Version 9 deprecates these two aliases, but still allows them.
All these type names (except those for complex data, which came later) were
introduced in Standards Version 5. Earlier Standards Versions specified data
types with single-character type aliases:
.RS
.TP
.I c
UINT8
.TP
.I u
UINT16
.TP
.I s
INT16
.TP
.I U
UINT32
.TP
.IR i ", " S
INT32
.TP
.I f
FLOAT32
.TP
.I d
FLOAT64
.RE
Types
.IR INT8 ", " UINT64 ", " INT64 ", " COMPLEX64 ,
and
.I COMPLEX128
are not supported before Standards Version 5, so no single-character type
aliases exist for these types. These single-character type aliases were
deprecated in Standards Version 5 and removed in Standards Version 8.
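.PP
As a sketch using the new-style type names, with hypothetical field names and
sample rates:
.IP
.nf
timestamp RAW FLOAT64 1
accel_x   RAW INT16 100
.fi
.PP
This declares one double precision sample per frame for
.I timestamp
and one hundred signed 16-bit samples per frame for
.IR accel_x .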
.RE
.TP
.B RECIP
The RECIP vector field type computes the reciprocal of a single input vector
field. Syntax is:
.RS
.IP
.I <fieldname>
.B RECIP
.I <input> <dividend>
.PP
where
.I <input>
is the input field code and
.I <dividend>
is a scalar quantity. The derived field is computed as:
.IP
fieldname = dividend / input.
.PP
RECIP appeared in Standards Version 8.
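.PP
For example, assuming a hypothetical input field
.IR frequency :
.IP
.nf
period RECIP frequency 1
.fi
.PP
which computes period = 1 / frequency.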
.RE
.TP
.B SARRAY
The SARRAY scalar field type is a list of strings fully specified in the format
file metadata. Syntax is:
.RS
.IP
.I <fieldname>
.B SARRAY
.I <string0> <string1> <string2> \fR...
.PP
Each
.I string
is a single token. To include whitespace in a string, enclose it in quotation
marks
.RB ( """" ),
or else escape the whitespace with the backslash character
.RB ( \e ).
No limit is placed on the number of elements in a
.BR SARRAY .
SARRAY appeared in Standards Version 10.
.RE
.TP
.B SBIT
The SBIT vector field type extracts one or more bits out of an input vector
field as a (two's-complement) signed number. Syntax is:
.RS
.IP
.I <fieldname>
.B SBIT
.I <input> <first-bit> \fR[\fI<num-bits>\fR]
.PP
which specifies
.I fieldname
to be
.I num-bits
bits extracted from the input vector field
.I input
starting with bit number
.I first-bit
(counting from the least-significant bit, which is numbered zero), after
.I input
has been converted from its native type to an (endianness corrected) two's
complement signed 64-bit integer. If
.I num-bits
is omitted, it is assumed to be one.
The extracted bits are interpreted as a two's complement signed integer of the
specified width. (So,
if
.I num-bits
is, for example, one, then the field can take on the value zero or negative
one.) The
.B BIT
field type is an unsigned version of this field type. SBIT appeared in
Standards Version 7.
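.PP
A short sketch, assuming a hypothetical 16-bit
.B RAW
field named
.IR status :
.IP
.nf
status RAW UINT16 1
trim   SBIT status 8 4
.fi
.PP
Here
.I trim
is the four-bit two's complement integer formed from bits 8 through 11 of
.IR status ,
and so takes values from \-8 to 7.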
.RE
.TP
.B SINDIR
The SINDIR vector field type performs an indirect translation of a SARRAY
scalar field to a derived vector field of strings based on a vector index field.
Syntax is:
.RS
.IP
.I <fieldname>
.B SINDIR
.I <index> <array>
.PP
where
.I index
is the vector field, which is converted to an integer type, if necessary, and
.I array
is the SARRAY field. The
.IR n th
sample of the SINDIR field is the string value of the
.IR m th
element of
.IR array
(counting from zero), where
.I m
is the value of the
.IR n th
sample of
.IR index .
When
.I index
is not a valid element number of
.IR array ,
the corresponding value of the SINDIR is implementation dependent. SINDIR
appeared in Standards Version 10.
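.PP
A minimal sketch, using hypothetical field names:
.IP
.nf
state_names SARRAY idle "warming up" running fault
state       RAW UINT8 1
state_name  SINDIR state state_names
.fi
.PP
With these definitions, a sample of
.I state
equal to 1 yields the string "warming up" for the corresponding sample of
.IR state_name .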
.RE
.TP
.B STRING
The STRING scalar field type is a character string fully specified in the format
file metadata. Syntax is:
.RS
.IP
.I <fieldname>
.B STRING
.I <string>
.PP
where
.I string
is the string value of the field. Note that
.I string
is a single token. To include whitespace in the string, enclose
.I string
in quotation marks
.RB ( """" ),
or else escape the whitespace with the backslash character
.RB ( \e ).
STRING appeared in Standards Version 6.
.RE
.TP
.B WINDOW
The WINDOW vector field type isolates a portion of an input vector based on a
comparison. Syntax is:
.RS
.IP
.I <fieldname>
.B WINDOW
.I <input> <check> <op> <threshold>
.PP
where
.I input
is the vector containing the data to extract,
.I check
is the vector on which to test the comparison,
.I threshold
is the value against which
.I check
is compared, and
.I op
is one of the following tokens indicating the particular comparison performed:
.RS
.TP
.I EQ
data are extracted where
.IR check ,
converted to a 64-bit signed integer, equals
.IR threshold ,
.TP
.I GE
data are extracted where
.IR check ,
converted to a 64-bit floating-point number, is greater than or equal to
.IR threshold ,
.TP
.I GT
data are extracted where
.IR check ,
converted to a 64-bit floating-point number, is strictly greater than
.IR threshold ,
.TP
.I LE
data are extracted where
.IR check ,
converted to a 64-bit floating-point number, is less than or equal to
.IR threshold ,
.TP
.I LT
data are extracted where
.IR check ,
converted to a 64-bit floating-point number, is strictly less than
.IR threshold ,
.TP
.I NE
data are extracted where
.IR check ,
converted to a 64-bit signed integer, is not equal to
.IR threshold ,
.TP
.I SET
data are extracted where at least one bit set in
.IR threshold
is also set in
.IR check ,
when converted to a 64-bit unsigned integer,
.TP
.I CLR
data are extracted where at least one bit set in
.IR threshold
is not set in
.IR check ,
when converted to a 64-bit unsigned integer.
.RE
.PP
The storage type of
.I threshold
depends on the operator, and follows the interpretation of
.IR check .
It may never be complex valued.
.PP
Outside the region extracted, the value of the derived field is implementation
dependent.
.PP
Note: with the
.B EQ
operator, this derived field type is very similar to the MPLEX field type above.
The primary difference is that MPLEX mandates the value of the derived field
outside the extracted region, while WINDOW does not. WINDOW appeared in
Standards Version 9.
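.PP
As a sketch, with hypothetical field names and thresholds:
.IP
.nf
valid_temp WINDOW temperature status_word SET 0x4
hot_temp   WINDOW temperature heater_current GT 1.5
.fi
.PP
The first line extracts
.I temperature
wherever bit 2 of
.I status_word
is set; the second extracts it wherever
.I heater_current
is strictly greater than 1.5.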
.RE
.SS Field Parameters
All input vector field parameters should be
.I field codes
(see below). Additionally, the scalar field parameters listed may be either
literal numbers or else the
.I field code
of a
.B CONST
field containing the value, or the
.I field code
of a
.B CARRAY
followed by a left angle bracket
.RI ( < ),
then a non-negative integer used as the
.B CARRAY
element index, then a right angle bracket
.RI ( > ),
that is:
.IP
.IB fieldcode < n >
.PP
If the angle
brackets and element index are omitted from a
.B CARRAY
field code used as a parameter, the first element in the field (index zero) is
assumed.
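.PP
For illustration, assuming hypothetical fields
.IR offset ,
.IR scales ,
and
.IR temp_raw :
.IP
.nf
offset CONST FLOAT64 273.15
scales CARRAY FLOAT64 0.01 0.02
temp_K LINCOM 1 temp_raw scales<1> offset
.fi
.PP
Here the scale factor is element one of
.I scales
(the value 0.02) and the offset is the value of the
.B CONST
field, so that temp_K = 0.02 * temp_raw + 273.15.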
.PP
Field parameters which may be specified using a scalar field code are:
.RS
.TP
.BR BIT ", " SBIT
.IR first-bit ", " num-bits
.TP
.B LINCOM
any of the
.IR a "i, or " b i
.TP
.B MPLEX
.IR count ", " period
.TP
.B PHASE
.I shift
.TP
.B POLYNOM
any of the
.IR a i
.TP
.B RAW
.I sample-rate
.TP
.B RECIP
.I dividend
.TP
.B WINDOW
.I threshold
.RE
.PP
Since it is possible to create a field code which is identical to a literal
number, a parameter is assumed to be the field code of a scalar field only if
the entire token cannot be parsed as a literal number using the rules outlined
in
.BR strtod (3).
For example, a
.B CONST
field whose field code consists solely of digits can never be used as a
parameter in a field specification line.
Starting in Standards Version 7, a literal complex number is specified as two
real (floating point) numbers separated by a semicolon
.RB ( ; )
with no intervening whitespace. So, for example, the tokens
.IP
1;0 \t 0;1 \t 4;0 \t 0;5 \t 9.313e2;74.1
.PP
represent, respectively, the real unit, the imaginary unit, the real number
four, the imaginary number
.RI 5 i ,
and the complex number
.RI "931.3 + 74.1" i .
Because the semicolon character cannot be used in field names, a complex valued
literal can never be mistaken for a field code. This allows, among other
things, the composition of complex valued fields from purely real input fields.
For example, a complex valued field,
.IR z ,
may be created from a real valued field
.IR re ,
representing the real part of the complex number, and the real valued field
.IR im ,
representing the imaginary part of the complex number, with the following
.B LINCOM
specification:
.IP
.I z
.B LINCOM
.I re
1 0
.I im
0;1 0
.PP
Starting in Standards Version 9, in addition to decimal notation, literal
integer parameters may be specified as hexadecimal numbers, by prefixing the
number (after an optional
.RB ' + '
or
.RB ' - '
sign) with
.B 0x
or
.BR 0X ,
or as octal numbers, by prefixing the number with
.BR 0 ,
as described in
.BR strtol (3).
Similarly, floating point literal numbers (both purely real ones and
components of complex literals) may be specified in hexadecimal by prefixing
them with
.B 0x
or
.BR 0X ,
and using
.B p
or
.B P
as the binary exponent prefix, as described in the C99 standard. Both uppercase
and lowercase hexadecimal digits may be used. In cases where a literal
floating point number may appear, the tokens
.B INF
or
.BR INFINITY ,
optionally preceded by a
.RB ' + '
or
.RB ' - '
sign, and
.BR NAN ,
optionally immediately followed by
.RB ' ( ',
then a sequence of characters, then
.RB ' ) ',
and all disregarding case, will be interpreted as the special floating point
values explained in
.BR strtod (3).
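.PP
For illustration, under these rules the integer literals
.BR 0x1F " and " 017
denote 31 and 15 respectively, and the floating point literal
.B 0x1.8p1
denotes 3.0 (that is, 1.5 times two to the first power).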
.SS Field Codes
When specifying the input to a field, either as a scalar parameter, or as an
input vector field to a
.RB non- RAW
vector field,
.I field codes
are used. A
.I field code
consists of, in order:
.IP \(bu 4
(since Standards Version 10:) optionally, a leading dot
.RB ( . ),
indicating this field code is relative to the fragment's root namespace.
Without the leading dot, the field code is taken to be relative to the current
namespace. (See the discussion in the
.B Namespaces
section above for details.)
.IP \(bu 4
(since Standards Version 10:) optionally, a non-null
.I subnamespace
followed by a dot
.RB ( . )
indicating a subspace under the current or root namespace. The subnamespace may
be made up of any number of namespace tags separated by dots, to nest deeper in
the namespace tree.
.IP \(bu 4
(since Standards Version 6:) if the field in question is a metafield
(see the
.B /META
directive above), the field name of the metafield's parent (which may be an
alias) followed by a forward slash
.RB ( / ).
.IP \(bu 4
a simple field name, possibly an alias, indicating a vector or scalar field
.IP \(bu 4
(since Standards Version 7:) optionally, a dot
.RB ( . )
followed by a
.IR "representation suffix" .
.PP
A
.IR "representation suffix"
may be used to extract a real number from a complex value. The available
suffixes (listed here with their preceding dot) and their meanings are:
.TP
.B .a
the argument of the input, that is, the angle (in radians) between the positive
real axis and the input. The argument is in the range [-pi, pi], and a branch
cut exists along the negative real axis. At the branch cut, -pi is returned if
the imaginary part is -0, and pi is returned if the imaginary part is +0. If
the input is zero, zero is returned.
.TP
.B .i
the imaginary part of the input
.RI ( i.e. \~the
projection of the input onto the imaginary axis)
.TP
.B .m
the modulus of the input
.RI ( i.e. \~its
absolute value).
.TP
.B .r
the real part of the input
.RI ( i.e. \~the
projection of the input onto the real axis)
.TP
.B .z
(since Standards Version 10:) the identity representation: it returns the full
complex value, equivalent to simply omitting the suffix completely. It is only
needed in certain cases to force the correct interpretation of a field code in
the presence of a namespace tag. To wit, the field code
.RS
.IP
name.r
.PP
may be interpreted as the real-part (via the
.B .r
representation suffix)
of the field called
.I name
(if such a field exists). To refer to a field called
.I r
in the
.I name
namespace, the field code must be written:
.IP
name.r.z
.PP
NB: The first interpretation only occurs with valid representation suffixes; the
field code:
.IP
name.q
.PP
is interpreted as the field
.I q
in the
.I name
namespace because
.B .q
is not a valid representation suffix. Furthermore, ambiguity arises only if
both fields "name" and "name.r" are defined. If the field "name" does
not exist, but the field "name.r" does, then the original field code is not
ambiguous. This is the only representation suffix allowed on
.BR SARRAY ,
.BR SINDIR ,
and
.BR STRING
field codes.
.RE
.PP
If the specified field is purely real, representations are calculated as
if the imaginary part were equal to +0.
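.PP
As a sketch of representation suffixes in use, assuming a hypothetical complex
valued field
.I z
and no namespace called
.IR z :
.IP
.nf
z     RAW COMPLEX128 1
amp   LINCOM 1 z.m 1 0
phase LINCOM 1 z.a 1 0
.fi
.PP
Here
.I amp
is the modulus of
.I z
and
.I phase
is its argument in radians.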
.SH HISTORY
This document describes Versions 10 and earlier of the Dirfile Standards.
Version 10 of the Standards (January 2017) added the
.BR INDIR ", " SARRAY ,
and
.B SINDIR
field types, namespaces, the
.B /NAMESPACE
directive, the
.B flac
encoding scheme, and the
.I .z
representation suffix.
Version 9 of the Standards (April 2012) added the
.B MPLEX
and
.B WINDOW
field types, the
.B /ALIAS
and
.B /HIDDEN
directives, the affixes to
.BR /INCLUDE ,
the
.BR sie ", " zzip ,
and
.B zzslim
encoding schemes, along with the optional
.I enc_datum
token to
.BR /ENCODING .
It permitted specification of integer literals in octal and hexadecimal.
Finally, it deprecated the type aliases
.I FLOAT
and
.IR DOUBLE .
Version 8 of the Standards (November 2010) added the
.BR DIVIDE ", " RECIP ,
and
.B CARRAY
field types, made the forward slash on reserved words mandatory, and prohibited
using the single-character type aliases in the specification of
.B RAW
fields. It also introduced the optional second
.RI ( arm )
token to the
.B /ENDIAN
directive.
Version 7 of the Standards (October 2009) added the
.B SBIT
and
.B POLYNOM
field types, and the directive-less method of specifying metafields. It also
introduced the data types
.I COMPLEX128
and
.IR COMPLEX64 ,
along with the notion of
.IR representations ,
and the
.B lzma
encoding scheme. Finally, it made the number of fields parameter for
.I LINCOM
optional.
Version 6 of the Standards (October 2008) added the
.BR /ENCODING ", " /META ", " /PROTECT ", and " /REFERENCE
directives, and the
.B CONST
and
.B STRING
field types. It permitted whitespace in tokens and introduced the character
escape sequences. It allowed
.B CONST
fields to be used as parameters in field specification lines. It also removed
.I FILEFRAM
as an alias for
.IR INDEX ,
and prohibited
.BR .
but allowed
.B #
and
.B \e
in field names.
Version 5 of the Standards (August 2008) added
.B VERSION
and
.BR ENDIAN ,
slash demarcation of reserved words, and removed the restriction on field
name length. It introduced the data types
.IR INT8 ", " INT64 ,
and
.IR UINT64 ,
the new-style type specifiers, and increased the range of the
.B BIT
field type from 32 to 64 bits. It also prohibited the characters
.B &;<>\e|
in field names.
Version 4 of the Standards (October 2006) added the
.B PHASE
field type.
Version 3 of the Standards (January 2006) added
.B INCLUDE
and increased the allowed length of a field name from 16 to 50 characters.
Version 2 of the Standards (September 2005) added the
.B MULTIPLY
field type.
Version 1 of the Standards (November 2004) added
.B FRAMEOFFSET
and the optional fourth argument to the
.B BIT
field type.
Version 0 of the Standards (before March 2003) refers to the dirfile standards
supported by the
.BR getdata (3)
library originally introduced into the
.BR kst (1)
sources, which contained support for all other features covered by this
document.
.SH AUTHORS
The dirfile specification was developed by C. B. Netterfield
.nh
<netterfield@astro.utoronto.ca>.
.hy 1
Since Standards Version 3, the dirfile specification has been maintained by
D. V. Wiebe
.nh
<getdata@ketiltrout.net>.
.hy 1
.SH SEE ALSO
.BR dirfile (5),
.BR dirfile\-encoding (5)