File: codestyle.tex


This chapter describes Easel from a developer's perspective. It shows
how a module's source code is organized, written, tested, and
documented. It should help you with implementing new Easel code, and
also with understanding the structure of existing Easel code.

We expect Easel to constantly evolve, both in code and in style.
Talking about our code style does not mean we enforce foolish
consistency. Rather, the goal is aspirational; one way we try to
manage the complexity of our growing codebase is to continuously
cajole Easel code toward a clean and consistent presentation. We try
to organize code modules in similar ways, use certain naming
conventions, and channel similar functions towards common
\esldef{interfaces} that provide common calling conventions and
behaviors.

But because it evolves, not all Easel code obeys the code style
described in this chapter. Easel code style is like a local building
ordinance. Any new construction should comply. Older construction is
grandfathered in and does not have to immediately conform to the
current rules. When it comes time to renovate, it's also time to bring
the old work up to the current standards.

For a concrete example we will focus primarily on one Easel module,
the \eslmod{buffer} module. We'll take a top-down approach, starting
from the overall organization of the module and working down into
details. If you're a starting developer, you might have preferred a
bottom-up description; you might just want to know how to write or
improve a single Easel function, for example. In that case, skim
ahead.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Table: Easel naming conventions
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{table}
\begin{minipage}{\textwidth}
\begin{tabular}{l>{\raggedright}p{3.5in}l}
\textbf{What}        & \textbf{Explanation}              & \textbf{Example} \\ \hline
Easel module
  &
    Module names should be 10 characters or less.\footnote{sqc assumes
    this in output formatting, for example.}
    Many modules are organized around a single Easel object
    that they implement. The name of the module matches the
    name of the object. For example, \ccode{esl\_buffer.c} implements \ccode{ESL\_BUFFER}.
  & \eslmod{buffer} \\ \\

tag name
  & Names in the module are constructed either using the module's full
    name or sometimes with a shorter abbreviation, usually 3
    characters (sometimes 2 or 4).
  & \ccode{buf} \\ \\

source file
  & Each module has one source file, named \ccode{esl\_}\itcode{modulename}\ccode{.c}.
  & \ccode{esl\_buffer.c} \\ \\

header file
  & Each module has one header file, named \ccode{esl\_}\itcode{modulename}\ccode{.h}.
  & \ccode{esl\_buffer.h} \\  \\

documentation 
  & Each module has one documentation chapter, named \ccode{esl\_}\itcode{modulename}\ccode{.tex}.
  & \ccode{esl\_buffer.tex} \\ \\

Easel object          
  & Easel ``objects'' are typedef'ed C structures (usually) or
    types (rarely\footnote{\ccode{ESL\_DSQ} is a \ccode{uint8\_t}, for example.}).
  & \ccode{ESL\_BUFFER} \\ \\  

external function 
  & All exposed functions have tripartite names \ccode{esl\_}\itcode{module}\ccode{\_specificname}().
    The specific part of a function name often adheres to a standardized API
    ``interface'' nomenclature. (All \ccode{\_Open()} functions must follow the same standardized
    behavior guidelines, for example.) Functions in the base \ccode{easel.c} module
    have a bipartite name, omitting the module name. The specific 
    name part generally uses mixed case capitalization.
  & \ccode{esl\_buffer\_OpenFile()} \\ \\

static function 
  & Internal functions (static within a module file) drop the
    \ccode{esl\_} prefix, and are 
    named \itcode{modulename}\ccode{\_function}.
  & \ccode{buffer\_refill()} \\ \\

macro 
  & Macros follow the same naming convention as external functions,
    except they are all upper case.
  & \ccode{ESL\_ALLOC()} \\ \\ 

defined constant
  & Defined constants in Easel modules are named
    \ccode{esl}\itcode{MODULENAME}\ccode{\_FOO}. Constants defined
    in the base \ccode{easel.h} module are named just 
    \ccode{eslFOO}.
   & \ccode{eslBUFFER\_SLURPSIZE}\\ \\

return codes
  & Return codes are constants defined in \ccode{easel.h}, so 
    they obey the rules of other defined constants in the base module (\ccode{eslOK},
    \ccode{eslFAIL}). Additionally, error codes start with
    \ccode{E}, as in \ccode{eslE}\itcode{ERRTYPE}.
  & \ccode{eslENOTFOUND} \\ \\

config constant
  & Constants that don't start with \ccode{esl} are almost always 
    configuration (compile-time) constants determined by the autoconf
    \ccode{./configure} script and defined in \ccode{esl\_config.h}.
  & \ccode{HAVE\_STDINT\_H} \\ \\
\end{tabular}
\end{minipage}
\caption{\textbf{Easel naming conventions.} }
\end{table}
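
As a compact illustration of the conventions in the table, here is a
minimal sketch of a hypothetical \eslmod{foo} module. Every name
containing \ccode{foo} or \ccode{FOO} is invented for this example and
is not part of Easel. The sketch is self-contained and deliberately
does not include any Easel headers.

```c
#include <stdlib.h>

/* Defined constant: esl + MODULENAME + _FOO (hypothetical). */
#define eslFOO_INITALLOC 16

/* Easel object: a typedef'ed structure, upper case with an ESL_ prefix. */
typedef struct {
  int *data;    /* storage for up to nalloc integers   */
  int  n;       /* number of integers currently stored */
  int  nalloc;  /* current allocation size of data[]   */
} ESL_FOO;

/* Macro: named like an external function, but all upper case. */
#define ESL_FOO_COUNT(foo) ((foo)->n)

/* Static function: drops the esl_ prefix. Declared up top, defined below. */
static void foo_reset(ESL_FOO *foo);

/* External function: esl_ + module + _ + mixed-case specific name. */
ESL_FOO *
esl_foo_Create(void)
{
  ESL_FOO *foo = malloc(sizeof(ESL_FOO));
  if (foo == NULL) return NULL;

  foo->data = malloc(sizeof(int) * eslFOO_INITALLOC);
  if (foo->data == NULL) { free(foo); return NULL; }
  foo->nalloc = eslFOO_INITALLOC;
  foo_reset(foo);
  return foo;
}

void
esl_foo_Destroy(ESL_FOO *foo)
{
  if (foo == NULL) return;
  free(foo->data);
  free(foo);
}

static void
foo_reset(ESL_FOO *foo)
{
  foo->n = 0;
}
```

A real module would split this between \ccode{esl\_foo.h} (the
typedef, constant, macro, and external declarations) and
\ccode{esl\_foo.c} (the definitions).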



%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{An Easel module}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Each module consists of three files: a .c C code file, a .h header
file, and a .tex documentation file. These filenames are constructed
from the module name. For example, the \eslmod{buffer} module is
implemented in \ccode{esl\_buffer.c}, \ccode{esl\_buffer.h}, and
\ccode{esl\_buffer.tex}.

%%%%%%%%%%%%%%%%
\subsection{The .c file}
%%%%%%%%%%%%%%%%

Easel \ccode{.c} files are larger than most coding styles would
advocate. Easel module code is designed to be \emph{read}, to be
\emph{self-documenting}, to contain its own \emph{testing methods},
and to provide useful \emph{working examples}.  Thus the size of the
files is a little deceptive, compared to C code that's solely
implementing some functions. In general, only about a quarter of
an Easel module's \ccode{.c} file is the actual module implementation.
Typically, around half of an Easel \ccode{.c} file is documentation,
and much of this gets automatically parsed into the PDF userguide. The
rest consists of drivers for unit testing and examples.

Module files are organized into a somewhat stereotypical set of
sections, to facilitate navigating the code, as follows.

The \ccode{.c} file starts with a comment that contains the {\bfseries
  table of contents}. The table of contents helps us navigate a long
Easel source file. This initial comment also includes a short
description of the module's purpose. It may also contain miscellaneous
notes.

For example, from the \eslmod{buffer} module:

\input{cexcerpts/header_example}

None of this is parsed automatically. Its structure is just
convention.

The short description lines in the table of contents match section
headings in comments later in the file. A search forward with the text
of a heading will move you to that section of the code.

Next come the {\bfseries includes} and any {\bfseries definitions}. Of the
include files, the \ccode{esl\_config.h} header must always be
included first. It contains platform-specific configuration code
that may affect even the standard library header files. Standard
headers like \ccode{stdio.h} come next, then Easel's main header
\ccode{easel.h}; then headers of any other Easel modules this module
depends on, then the module's own header. For example, the
\ccode{\#include}'s in the \eslmod{buffer} module look like:

\input{cexcerpts/include_example}

Next come the {\bfseries private function declarations}.  We declare
all private functions at the top of the file, where they can be seen
easily by a developer who's casually reading the source. Their
definitions are buried deeper, in one or more sections following the
implementation of the exposed API.

\input{cexcerpts/statics_example}

The rest of the file is the {\bfseries code}. It is split into
sections. Each section is numbered and given a one-line title that
appears in the table of contents.  Each section starts with a section
header, a comment block in front of each code section in the
\ccode{.c} file.  These section headers match comments in front of
that section's declarations in the \ccode{.h} file. Because of the
numbering and titling, a particular section of code can be located by
searching on the number or title.  A common section structure includes
the following, in this order:


\begin{description}
\item[\textbf{The \ccode{FOOBAR} object.}]
  The first section of the file provides the API for creating and
  destroying the object that this module implements.

\item[\textbf{The rest of the API.}]
  Everything else that is part of the API for this module.
  This might be split across multiple sections.

\item[\textbf{Debugging/dev code.}]
  Most objects can be validated or dumped to an output stream
  for inspection.

\item[\textbf{Private functions.}]
  Easel isn't rigorous about where private (non-exposed) functions go,
  but they often go in a separate section in about the middle of the
  \ccode{.c} file, after the API and before the drivers.

\item[\textbf{Optional drivers.}] Stats, benchmark, and regression
  drivers, if any. 

\item [\textbf{Unit tests.}]
  The unit tests are internal controls that test that the module's API
  works as advertised.

\item [\textbf{Test driver.}]
  All modules have an automated test driver: a \ccode{main()} that
  runs the unit tests.
 
\item [\textbf{Examples.}]
  All modules have at least one \ccode{main()} showing an example of
  how to use the main features of the module.

\end{description}
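
The section layout above can be sketched as a skeleton for a
hypothetical \ccode{esl\_foo.c}. The module, its toy counter object,
and all function names are invented for illustration. The typedef is
placed inline only to keep the sketch self-contained, and the
convention that the validation function returns 0 on success is an
assumption.

```c
/* esl_foo.c (hypothetical)
 * A toy counter object, used only to illustrate file layout.
 *
 * Contents:
 *   1. The ESL_FOO object
 *   2. The rest of the API
 *   3. Debugging/dev code
 *   4. Private functions
 *   5. Unit tests
 *   6. Test driver
 *   7. Example
 */
#include <stdio.h>
#include <stdlib.h>

/* In a real module this typedef would live in esl_foo.h. */
typedef struct { int count; } ESL_FOO;

static int foo_is_sane(const ESL_FOO *foo);   /* private function declaration */

/*****************************************************************
 *# 1. The ESL_FOO object
 *****************************************************************/
ESL_FOO *
esl_foo_Create(void)
{
  ESL_FOO *foo = malloc(sizeof(ESL_FOO));
  if (foo != NULL) foo->count = 0;
  return foo;
}

void
esl_foo_Destroy(ESL_FOO *foo)
{
  free(foo);
}

/*****************************************************************
 *# 2. The rest of the API
 *****************************************************************/
void
esl_foo_Increment(ESL_FOO *foo)
{
  foo->count++;
}

/*****************************************************************
 *# 3. Debugging/dev code
 *****************************************************************/
/* Returns 0 if <foo> looks valid, nonzero otherwise (assumed convention). */
int
esl_foo_Validate(const ESL_FOO *foo)
{
  return foo_is_sane(foo) ? 0 : 1;
}

void
esl_foo_Dump(FILE *fp, const ESL_FOO *foo)
{
  fprintf(fp, "count = %d\n", foo->count);
}

/*****************************************************************
 * 4. Private functions
 *****************************************************************/
static int
foo_is_sane(const ESL_FOO *foo)
{
  return (foo != NULL && foo->count >= 0);
}

/*****************************************************************
 * 5. Unit tests, 6. Test driver, 7. Example
 *****************************************************************/
/* Omitted in this sketch: each would be wrapped in its own #ifdef. */
```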

%%%%%%%%%%%%%%%%
\subsection{The .h file}
%%%%%%%%%%%%%%%%


%%%%%%%%%%%%%%%%
\subsection{Special syntax in Easel C comments}
%%%%%%%%%%%%%%%%

Easel comments sometimes include special syntax recognized by tools other
than the compiler.  Here are some quick explanations of the special
stuff a developer needs to be aware of. 

\begin{table}
\begin{tabular}{l>{\raggedright}p{3.5in}l}
\textbf{Special syntax}  & \textbf{Description}  & \textbf{Parsed by}\\ \hline

\ccode{/* Function: }\itcode{funcname} 
  & Function documentation that gets converted to \LaTeX\ and included
    in Easel's PDF documentation.
  & \emcode{autodoc} \\ \\

\ccode{ *\# }\itcode{x.\ secheading} 
  & Section heading corresponding to section number x in a \ccode{.c}
    file's table of contents. This is automatically extracted as part
    of creating a summary table in the PDF documentation.
  & \emcode{autodoc -t} \\ \\

\ccode{/*::cexcerpt::} ...
  & Comments that mark the beginning/end of code that is extracted
    verbatim into the documentation.
  & \emcode{cexcerpt} \\ \\

\hline
\end{tabular}
\caption{{\bfseries Summary of special syntax in Easel C comments.}}
\end{table}

%%%%
\subsubsection{function documentation}
%%%%

Any comment that starts with
\begin{cchunk}
/* Function:  ...
\end{cchunk}
will be recognized and parsed by our \prog{autodoc} program, 
which assumes it is looking at a structured function documentation
header.

See section XX for details on how these headers work.

We want all external functions in the Easel API to be documented
automatically by \prog{autodoc}. We don't want internal functions to
appear in the documentation, but we do want them documented in the
code.  To keep \prog{autodoc} from recognizing the function header of
an internal (static) function, we just leave off the \ccode{Function:}
tag in the comment block.   
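
The contrast might look like the following sketch. Both functions are
hypothetical. Only the presence or absence of the \ccode{Function:}
tag is taken from the text above; the other field labels
(\ccode{Synopsis:}, \ccode{Purpose:}) are assumptions about the header
layout and are shown only to give the comment some shape.

```c
static int foo_mul(int a, int b);

/* Function:  esl_foo_Square()
 * Synopsis:  Square an integer.
 *
 * Purpose:   Returns <x> squared. Because this comment starts with
 *            the "Function:" tag, a tool that keys on that tag
 *            would pick it up.
 */
int
esl_foo_Square(int x)
{
  return foo_mul(x, x);
}

/* foo_mul()
 *
 * Internal helper: returns <a> times <b>. The comment documents the
 * function for readers of the code, but it has no "Function:" tag,
 * so it would not be recognized as a documentation header.
 */
static int
foo_mul(int a, int b)
{
  return a * b;
}
```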

%%%%
\subsubsection{section headings}
%%%%

The automatically generated \LaTeX\ code for a module's documentation
includes a table summarizing the functions in the exposed API. This
table is constructed automatically from the source code by
\prog{autodoc -t}. The list of functions in this table is extracted
from the function documentation (above). The table is broken into
sections, just as the module code is, using section headings. The
comment block marking the start of a section heading for exposed API
code has an extra \ccode{\#}:

\begin{cchunk}
/*****************************************************************
 *# 1. ESL_BUFFER object: opening/closing.
 *****************************************************************/
\end{cchunk}

Section headings for internal functions omit the \ccode{\#}, and
\prog{autodoc} ignores them:

\begin{cchunk}
/*****************************************************************
 * 10. Unit tests
 *****************************************************************/
\end{cchunk}

%%%%
\subsubsection{excerpting}
%%%%

This book includes many examples of C code extracted verbatim from
Easel source.  These {\bfseries excerpts} are marked with specially
formatted comments in the C file:

\begin{cchunk}
/*::cexcerpt::my_example::begin::*/
   while (esl_sq_Read(sqfp, sq) == eslOK)
     { n++; }
/*::cexcerpt::my_example::end::*/
\end{cchunk}

When we build the Easel documentation from its source, our
\prog{cexcerpt} program extracts all marked excerpts from \ccode{.c}
and \ccode{.h} files, and places them in individual files in a
temporary \ccode{cexcerpts/} directory, from where they are included
in the main \LaTeX\ documentation.
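
To make the marker convention concrete, here is a rough sketch of how
the text between a pair of markers could be pulled out of a source
string. This is not the \prog{cexcerpt} program itself, whose
implementation is not shown here. The function name
\ccode{excerpt\_extract()} and its return convention are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Copy the text between the begin/end markers for <tag> from <src>
 * into <out>, as a NUL-terminated string.
 *
 * Returns the number of bytes copied (excluding the NUL), or -1 if a
 * marker is missing, the tag is too long, or <out> is too small.
 */
int
excerpt_extract(const char *src, const char *tag, char *out, size_t outsize)
{
  char        begin[128];
  char        end[128];
  const char *b;
  const char *e;
  size_t      len;
  int         r;

  r = snprintf(begin, sizeof(begin), "/*::cexcerpt::%s::begin::*/", tag);
  if (r < 0 || r >= (int) sizeof(begin)) return -1;
  r = snprintf(end, sizeof(end), "/*::cexcerpt::%s::end::*/", tag);
  if (r < 0 || r >= (int) sizeof(end)) return -1;

  if ((b = strstr(src, begin)) == NULL) return -1;
  b += strlen(begin);
  if (*b == '\n') b++;                  /* skip the newline ending the marker line */
  if ((e = strstr(b, end)) == NULL) return -1;

  len = (size_t) (e - b);
  if (len + 1 > outsize) return -1;     /* need room for the trailing NUL */
  memcpy(out, b, len);
  out[len] = '\0';
  return (int) len;
}
```

A real extractor would also need to read files, handle several
excerpts per file, and write each one to its own output file.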



%%%%%%%%%%%%%%%%
\subsection{Driver programs}
%%%%%%%%%%%%%%%%

An unusual (innovative?) thing about Easel modules is how we embed
{\bfseries driver programs} directly in the module's \ccode{.c}
file. Driver programs include our unit tests, benchmarks, and working
examples. These small programs are enclosed in standardized
\ccode{\#ifdef}'s that enable them to be conditionally compiled.

None of these programs are installed by \ccode{make install}.  Test
drivers are compiled as part of \ccode{make check}.  A \ccode{make
  dev} compiles all driver programs.

There are six main types of drivers used in Easel:

\begin{description} 

\item[\textbf{Unit test driver(s).}] (Mandatory.) Each module has one (and only one)
  \ccode{main()} that runs the unit tests and any other automated tests for
  the module. The test driver is compiled and run by the testsuite in
  \ccode{testsuite/testsuite.sqc} when one does a \ccode{make check}
  on the package. It is also run by several of the automated tools
  used in development, including the coverage (\ccode{gcov}) and
  memory (\ccode{valgrind}) tests. A test driver takes no arguments
  (it must generate any input files it needs). If it succeeds, it
  returns 0, with no output. If it fails, it calls
  \ccode{esl\_fatal()} to issue a short error message on
  \ccode{stderr} and exits with nonzero status. Our test harness,
  \emcode{sqc}, depends on these
  output and exit status conventions. Optionally, it may accept a flag
  (usually \ccode{-v}, for verbose) to show more useful output when
  it's run interactively.
  The test driver is enclosed by
  \ccode{\#ifdef esl}\itcode{MODULE}\ccode{\_TESTDRIVE} for
  conditional compilation.

\item[\textbf{Regression/comparison test(s).}] (Optional.) These tests
  link to one or more libraries that provide identical or comparable
  functionality, such as previous versions of Easel, the old
  \prog{SQUID} library, \prog{LAPACK} or the GNU Scientific Library.
  They test that Easel's functionality performs at least as well as it
  used to, or as well as the `competition'. These tests are run on demand,
  and not included in automated testing, because the other libraries
  may only be present on a subset of our development machines. They
  are enclosed by \ccode{\#ifdef
    esl}\itcode{MODULE}\ccode{\_REGRESSION} for conditional
  compilation.

\item[\textbf{Benchmark(s).}] (Optional.) These tests run a
  standardized performance benchmark and collect time and/or memory
  statistics. They may generate output suitable for graphing. They are
  run on demand, not by automated tools. They typically use 
  \eslmod{stopwatch} for timing. They are enclosed by
  \ccode{\#ifdef esl}\itcode{MODULE}\ccode{\_BENCHMARK}  for
  conditional compilation.

\item[\textbf{Statistics generator(s).}] (Optional.) These tests collect
  statistics used to characterize the module's scientific performance,
  such as its accuracy at some task. They may generate graphing
  output. They are run on demand, not by automated tools. They are
  enclosed by \ccode{\#ifdef esl}\itcode{MODULE}\ccode{\_STATS}
  for conditional compilation.

\item[\textbf{Experiment(s).}] (Optional.) These are other reproducible
  experiments we've done on the module code, essentially the same as
  statistics generators. They are
  enclosed by \ccode{\#ifdef esl}\itcode{MODULE}\ccode{\_EXPERIMENT}
  for conditional compilation.

\item[\textbf{Example(s).}] (Mandatory.) Every module has at least one example
  \ccode{main()} that provides a ``hello world'' level example of
  using the module's API. Examples are enclosed in \ccode{cexcerpt}
  tags for extraction and verbatim inclusion in the documentation.
  They are enclosed by \ccode{\#ifdef esl}\itcode{MODULE}\ccode{\_EXAMPLE} 
  for conditional compilation.
\end{description}  

All modules have at least one test driver and one example. Other tests
and examples are optional. When there is more than one \ccode{main()}
of a given type, the additional tags are numbered starting from 2: for
example, a module with three example \ccode{main()'s} would have three
tags for conditional compilation, \ccode{eslFOO\_EXAMPLE},
\ccode{eslFOO\_EXAMPLE2}, and \ccode{eslFOO\_EXAMPLE3}.

The format of the conditional compilation tags for all the drivers
(including test and example drivers) must be obeyed. Some test scripts
scan the \ccode{.c} files and identify these tags
automatically. For instance, the driver compilation test identifies any
tag named
\ccode{esl}\itcode{MODULENAME}\ccode{\_\{TESTDRIVE,EXAMPLE,REGRESSION,BENCHMARK,STATS\}*}
and attempts to compile the code with that tag defined.

Which driver is compiled (if any) is controlled by conditional
compilation of the module's \ccode{.c} file with the appropriate
tag. For example, to compile and run the \eslmod{sqio} test driver as
a standalone module:

\begin{cchunk}
   %  gcc -g -Wall -I. -o esl_sqio_utest -DeslSQIO_TESTDRIVE esl_sqio.c easel.c -lm
   %  ./esl_sqio_utest
\end{cchunk}

or to compile and run it in full library configuration:

\begin{cchunk}
   %  gcc -g -Wall -I. -L. -o esl_sqio_utest -DeslSQIO_TESTDRIVE esl_sqio.c -leasel -lm
   %  ./esl_sqio_utest
\end{cchunk}
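
The following sketch shows how a test driver and an example driver
might sit side by side in one \ccode{.c} file behind their tags. The
\eslmod{foo} module and \ccode{esl\_foo\_Add()} are hypothetical, and
the plain \ccode{fprintf()} plus \ccode{exit()} is only a stand-in for
\ccode{esl\_fatal()}, so that the sketch compiles without Easel.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* A one-function "module", so the drivers have something to call. */
int
esl_foo_Add(int a, int b)
{
  return a + b;
}

#ifdef eslFOO_TESTDRIVE
/* Unit test: silent on success, short message and nonzero exit on failure. */
static void
utest_Add(void)
{
  if (esl_foo_Add(2, 3) != 5) {
    fprintf(stderr, "esl_foo_Add() unit test failed\n");  /* stand-in for esl_fatal() */
    exit(1);
  }
}

/* Test driver: runs every unit test; prints only if -v is given. */
int
main(int argc, char **argv)
{
  int be_verbose = (argc > 1 && strcmp(argv[1], "-v") == 0);

  utest_Add();
  if (be_verbose) printf("all tests passed\n");
  return 0;
}
#endif /*eslFOO_TESTDRIVE*/

#ifdef eslFOO_EXAMPLE
/* Example driver: a "hello world" level use of the API. */
int
main(void)
{
  printf("2 + 3 = %d\n", esl_foo_Add(2, 3));
  return 0;
}
#endif /*eslFOO_EXAMPLE*/
```

With neither tag defined, the file compiles as plain library code with
no \ccode{main()}. Defining exactly one tag at compile time, in the
same way as \ccode{-DeslSQIO\_TESTDRIVE} above, selects which driver
is built.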


\begin{table}
\begin{tabular}{llll}
\textbf{Driver type}     &  \textbf{Compilation flag}                       & \textbf{Driver program name}                     & \textbf{Notes}\\ \hline
Unit test                &  \ccode{esl}\itcode{MODULE}\ccode{\_TESTDRIVE}   & \ccode{esl\_}\itcode{module}\ccode{\_utest}      & output and exit status standardized for \emcode{sqc}\\
Regression test          &  \ccode{esl}\itcode{MODULE}\ccode{\_REGRESSION}  & \ccode{esl\_}\itcode{module}\ccode{\_regression} & may require other libraries installed\\
Benchmark                &  \ccode{esl}\itcode{MODULE}\ccode{\_BENCHMARK}   & \ccode{esl\_}\itcode{module}\ccode{\_benchmark}  & \\
Statistics collection    &  \ccode{esl}\itcode{MODULE}\ccode{\_STATS}       & \ccode{esl\_}\itcode{module}\ccode{\_stats}      & \\
Experiment               &  \ccode{esl}\itcode{MODULE}\ccode{\_EXPERIMENT}  & \ccode{esl\_}\itcode{module}\ccode{\_experiment} & \\
Example                  &  \ccode{esl}\itcode{MODULE}\ccode{\_EXAMPLE}     & \ccode{esl\_}\itcode{module}\ccode{\_example}    & \\
\end{tabular}
\caption{{\bfseries Summary of types of driver programs in Easel.}}
\end{table}









%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Writing an Easel function}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


Documentation of functions, particularly in the structured comment
header that's parsed by the \emcode{autodoc} program, is described in
a different section of its own.

%%%%
\subsubsection{conventions for function names}
%%%%

Function names are tripartite, constructed as
\ccode{esl\_}\itcode{moduletag\_funcname}.  

The \itcode{moduletag} should generally be the module's full name;
sometimes (historically) it is an abbreviated tag name for the module
(such as \ccode{abc} for the \eslmod{alphabet} module); on occasion,
it is the name of an Easel object or datatype that has not yet budded
off into its own module. Long versus short \itcode{moduletag}'s are
sometimes used to indicate functions that operate directly on objects
via common interfaces, versus other functions in the exposed API. The
long form may indicate functions that obey a common interface, such as
\ccode{esl\_alphabet\_Create()}.\footnote{This is a clumsy C version
  of what C++ would do with namespaces, object methods, and
  constructors/destructors.} Miscellaneous exposed functions in the API
  of a module may be named by the three-letter short tag, such as
  \ccode{esl\_abc\_Digitize()}.

The function's \ccode{\{funcname\}} can be anything. Some names
are standard and indicate the use of a common {\bfseries interface}.
This part of the name is usually in mixed-case capitalization.

Only exposed (\ccode{extern}) functions must follow these rules. In
general, private (\ccode{static}) functions can have any
name. However, it's common in Easel for private functions to obey the
same naming conventions except without the \ccode{esl\_} prefix.

Sometimes essentially the same function must be provided for different
data types. In these cases one-letter prefixes are used to indicate
datatype:

\begin{tabular}{ll}
\ccode{C} & \ccode{char} type, or a standard C string \\
\ccode{X} & \ccode{ESL\_DSQ} type, or an Easel digitized sequence\\
\ccode{I} & \ccode{int} type \\
\ccode{F} & \ccode{float} type \\
\ccode{D} & \ccode{double} type \\
\end{tabular}

For example, \eslmod{vectorops} uses this convention heavily;
\ccode{esl\_vec\_FNorm()} normalizes a vector of floats and
\ccode{esl\_vec\_DNorm()} normalizes a vector of doubles.  A second
example is in \eslmod{randomseq}, which provides routines for shuffling
either text strings or digitized sequences, such as
\ccode{esl\_rsq\_CShuffle()} and \ccode{esl\_rsq\_XShuffle()}.
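
A minimal sketch of the one-letter prefix convention follows. The two
functions are hypothetical stand-ins, not the \eslmod{vectorops}
implementations. They assume that ``normalize'' means scaling the
elements so they sum to one, and they leave a vector unchanged if its
sum is zero.

```c
/* F prefix: operates on a vector of floats.
 * Scales <vec> so its <n> elements sum to 1; no-op if the sum is 0.
 */
void
my_vec_FNorm(float *vec, int n)
{
  float sum = 0.0f;
  int   i;

  for (i = 0; i < n; i++) sum += vec[i];
  if (sum == 0.0f) return;
  for (i = 0; i < n; i++) vec[i] /= sum;
}

/* D prefix: the same operation on a vector of doubles. */
void
my_vec_DNorm(double *vec, int n)
{
  double sum = 0.0;
  int    i;

  for (i = 0; i < n; i++) sum += vec[i];
  if (sum == 0.0) return;
  for (i = 0; i < n; i++) vec[i] /= sum;
}
```

Because C has no function overloading, the prefix is what lets the
caller pick the variant that matches the element type.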

%%%%
\subsubsection{conventions for argument names}
%%%%

When using pointers in C, it can be hard to tell which arguments are
for input data (which are provided by the caller and will not be
modified), output data (which are created and returned by the
function), and modified data (which are both input and output).  

For output consisting of pointers to nonscalar types such as objects
or arrays, it also can be hard to distinguish when the caller is
supposed to provide pre-allocated storage for the result, versus the
storage being newly allocated by the function.\footnote{A common
strategy in C library design is to strive for \emph{no} allocation in
the library, so the caller is always responsible for explicit
alloc/free pairs. I feel this puts a tedious burden of allocation code
on an application.}

When functions return more than one kind of result, it is convenient
to make all the individual results optional, so the caller doesn't
have to deal with managing storage for results it isn't interested in.
In Easel, an optional result pointer is passed as \ccode{NULL} to
indicate a possible result is not wanted (and is not allocated, if
returning that result required new allocation).

Easel uses a prefix convention on pointer argument names to indicate
these situations:

\begin{table}[h]
\begin{center}
{\small
\begin{tabular}{cp{2.5in}p{3in}}
 \textbf{prefix} &  \textbf{argument type}                  & \textbf{allocation (if any):}\\
none           & If qualified as \ccode{const}, a pointer
                 to input data, not modified by the call. 
                 If unqualified, a pointer to data modified
                 by the call (it's both input and output). & by caller\\ 
\ccode{ret\_}  & Pointer to result.                        & in the function \\
\ccode{opt\_}  & Pointer to optional result.               
                 If non-\ccode{NULL}, result is obtained. & in the function \\
\end{tabular}
}
\end{center}
\end{table}
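As a minimal sketch of how these prefixes read in a function
signature, consider the following stand-alone example. The function
\ccode{example\_Upper()} and the status codes \ccode{exOK} and
\ccode{exEMEM} are hypothetical stand-ins for illustration; they are
not part of Easel.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes standing in for eslOK and eslEMEM. */
enum { exOK = 0, exEMEM = 1 };

/* s         : const, no prefix     -> input only, not modified
 * ncalls    : non-const, no prefix -> modified in place (input and output)
 * ret_upper : ret_ prefix          -> required result, allocated here
 * opt_len   : opt_ prefix          -> optional result; NULL means "not wanted"
 */
int
example_Upper(const char *s, int *ncalls, char **ret_upper, int *opt_len)
{
  size_t n = strlen(s);
  size_t i;
  char  *upper;

  assert(ret_upper != NULL);               /* a ret_ argument is mandatory */

  if ((upper = malloc(n + 1)) == NULL)
    {
      *ret_upper = NULL;
      if (opt_len) *opt_len = 0;
      return exEMEM;
    }
  for (i = 0; i < n; i++) upper[i] = (char) toupper((unsigned char) s[i]);
  upper[n] = '\0';

  (*ncalls)++;
  *ret_upper = upper;                      /* caller now owns this memory */
  if (opt_len) *opt_len = (int) n;
  return exOK;
}
```

A caller that only wants the string would call
\ccode{example\_Upper(s, \&ncalls, \&upper, NULL)} and later
\ccode{free(upper)}.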



%%%%
\subsubsection{Return status}
%%%%

%%%%
\subsubsection{conventions for exception handling}
%%%%

Easel functions {\bfseries should never exit except through an Easel
  return code or through the Easel exception handler}. When you write
Easel code you must {\bfseries always} deal with the case when the
caller has registered a nonfatal exception handler, causing thrown
exceptions to return a nonzero code rather than exiting. The Easel
library is designed to be used in programs that can't just suddenly
crash out with an error message (such as a graphical user interface
environment), and programs that have specialized error handlers
because they don't even have access to a \ccode{stderr} stream on a
terminal (such as a UNIX daemon).

This means that Easel functions must clean up their memory and set
appropriate return status and return arguments, even in the case of
thrown exceptions.


%%%%
\subsubsection{Easel's idiomatic function structure}
%%%%

To deal with the above strictures of return status, returned
arguments, and exception handling and cleanup, most Easel functions
follow an idiomatic structure.  The following snippet illustrates the
key ideas:

\begin{cchunk}
1    int
2    esl_example_Hello(char **opt_hello, int *opt_len)
3    {
4      char *msg = NULL;
5      int   n;
6      int   status;

7      if ( (status = esl_strdup("hello world!\n", -1, &msg)) != eslOK) goto ERROR;
8      n = strlen(msg);

9      if (opt_hello) *opt_hello = msg; else free(msg);
10     if (opt_len)   *opt_len   = n;
11     return eslOK;

12  ERROR:
13     if (msg)        free(msg);
14     if (opt_hello) *opt_hello = NULL;
15     if (opt_len)   *opt_len   = 0;
16     return status;
17  }
\end{cchunk}

The stuff to notice here:

\begin{itemize}
\item[line 2:] The \ccode{opt\_hello} and \ccode{opt\_len} arguments
  are optional. The caller might want only one of them (or neither,
  but that would be weird). We're expecting calls like
  \ccode{esl\_example\_Hello(\&hello, \&n)},
  \ccode{esl\_example\_Hello(\&hello, NULL)}, or
  \ccode{esl\_example\_Hello(NULL, \&n)}.

\item[line 4:] Anything we allocate, we initialize its pointer to \ccode{NULL}. 
  Now, if an exception occurs and we have to break out of the function early,
  we can tell whether the allocation has already happened (and hence we need
  to clean up its memory), if the pointer has become non-\ccode{NULL}.

\item[line 6:] Most functions have an explicit \ccode{status} variable.
  Standard error-handling macros (\ccode{ESL\_XEXCEPTION()} for example) expect it to be present,
  as do standard allocation macros (\ccode{ESL\_ALLOC()} for example).
  If we have to handle an exception, we're going to make sure the status
  is set how we want it, then jump to a cleanup block.

\item[line 7:] When any Easel function calls another Easel function,
  it must check the return status for both normal errors and thrown
  exceptions. If an exception has already been thrown by a callee,
  usually the caller just relays the exception status up the call
  stack. The idiom is to set the return \ccode{status} and go
  immediately to the error cleanup block, \ccode{ERROR:}. We use a
  \ccode{goto} for this, Dijkstra notwithstanding.

\item[lines 9,10:] When we set optional arguments for a normal return,
  we first check whether a valid return pointer was provided. If the
  optional pointer is \ccode{NULL} the caller doesn't want the result,
  and we clean up any memory we need to (line 9).

\item[line 13:] In the error cleanup block, we first free any memory
  that got allocated before the failure point. The idiom of
  immediately initializing all allocated pointers to \ccode{NULL} 
  enables us to tell which things have been allocated or not.

\item[line 14:] When we return from a function with an unsuccessful 
  status, we also make sure that any returned arguments are in 
  a documented ground state, usually \ccode{NULL}'s and \ccode{0}'s.
\end{itemize}
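The same structure carries over when a function can fail at more than
one point. The sketch below is a stand-alone illustration, not Easel
code: \ccode{example\_ParseInts()} and the \ccode{ex*} status codes are
hypothetical, and it uses plain \ccode{malloc()} and \ccode{realloc()}
rather than Easel's allocation macros.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical status codes, standing in for Easel's return codes. */
enum { exOK = 0, exEMEM = 1, exEFORMAT = 2 };

/* Parse a list of integers separated by commas and/or spaces. */
int
example_ParseInts(const char *s, int **ret_arr, int *opt_n)
{
  int  *arr    = NULL;      /* anything we allocate starts out NULL */
  int   n      = 0;
  int   nalloc = 4;
  char *end;
  int   status;

  assert(ret_arr != NULL);

  if ((arr = malloc(sizeof(int) * nalloc)) == NULL) { status = exEMEM; goto ERROR; }

  while (*s != '\0')
    {
      long v = strtol(s, &end, 10);
      if (end == s) { status = exEFORMAT; goto ERROR; }   /* not a number */

      if (n == nalloc)      /* full: double, via a temporary so arr is never lost */
        {
          int *tmp = realloc(arr, sizeof(int) * nalloc * 2);
          if (tmp == NULL) { status = exEMEM; goto ERROR; }
          arr     = tmp;
          nalloc *= 2;
        }
      arr[n++] = (int) v;

      s = end;
      while (*s == ' ' || *s == ',') s++;                 /* skip separators */
    }

  *ret_arr = arr;
  if (opt_n) *opt_n = n;
  return exOK;

 ERROR:
  free(arr);                /* free(NULL) is a no-op */
  *ret_arr = NULL;          /* results go back in a ground state */
  if (opt_n) *opt_n = 0;
  return status;
}
```

Both failure points (allocation and bad input) leave through the same
cleanup block, so the caller sees \ccode{NULL} and \ccode{0} regardless
of where the function stopped.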

%%%%
\subsubsection{reentrancy: plan for threads}
%%%%

Easel code must expect to be called in multithreaded applications. All
functions must be reentrant. There should be no use of global or
static variables. 





%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Standard Easel function interfaces}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Some function names are shared and have common behaviors across
modules, like \ccode{\_Get*()} and \ccode{\_Set*()} functions.  These
special names are called \esldef{common interfaces}.

\begin{table}
\begin{minipage}{\textwidth}
\begin{tabular}{l>{\raggedright}p{3.0in}ll}
\textbf{Function name}        & \textbf{Description}              & \textbf{Returns} &  \textbf{Example} \\ \hline
 \multicolumn{4}{c}{\bfseries Creating and destroying new objects}\\
\ccode{\_Create}
  & Create a new object.
  & \ccode{ESL\_}\itcode{FOO}\ccode{ *}
  & \ccode{esl\_alphabet\_Create()} \\

\ccode{\_Destroy}
  & Free an object.
  & \ccode{void}
  & \ccode{esl\_alphabet\_Destroy()} \\

\ccode{\_Clone}
  & Duplicate an object, by creating and allocating a new one.
  & \ccode{ESL\_}\itcode{FOO}\ccode{ *}
  & \ccode{esl\_msa\_Clone()} \\

\ccode{\_Shadow}
  & Partially duplicate an object, creating a dependent shadow.
  & \ccode{ESL\_}\itcode{FOO}\ccode{ *}
  & \ccode{p7\_oprofile\_Shadow()} \\

\ccode{\_Copy}
  & Make a copy of an object, using an existing allocated object for space.
  & [standard]
  & \ccode{esl\_msa\_Copy()} \\

 \multicolumn{4}{c}{\bfseries Opening and closing input sources}\\
\ccode{\_Open} 
  & Open an input source, associating it with an Easel object. 
  & [standard]
  & \ccode{esl\_buffer\_Open()} \\

\ccode{\_Close}
  & Close an Easel object corresponding to an input source.
  & [standard]
  & \ccode{esl\_buffer\_Close()} \\

 \multicolumn{4}{c}{\bfseries Managing memory allocation}\\

\ccode{\_Grow}
  & Expand the allocation in an existing object, typically by doubling.
  & [standard]
  & \ccode{esl\_tree\_Grow()} \\

\ccode{\_GrowTo}
  & Reallocate object (if needed) for some new data size.
  & [standard]
  & \ccode{esl\_sq\_GrowTo()} \\

\ccode{\_Reuse}
  & Recycle an object, reinitializing it while reusing as much of its existing
    allocation(s) as possible.
  & [standard]
  & \ccode{esl\_keyhash\_Reuse()} \\

\ccode{size\_t \_Sizeof}
  & Return the allocation size of an object
  & size, in bytes
  & - \\



 \multicolumn{4}{c}{\bfseries Accessing information in objects}\\

\ccode{\_Is}
  & Return \ccode{TRUE} or \ccode{FALSE} for some query of the
    internal state of an object.
  & \ccode{TRUE | FALSE}
  & \ccode{esl\_opt\_IsOn()} \\

\ccode{\_Get}
  & Return a value for some query of the internal state of an object.
  & value
  & \ccode{esl\_buffer\_Get()} \\

\ccode{\_Read}
  & Get a value in the object and return it in a location provided (and possibly allocated) by the caller.
  & [standard]
  & \ccode{esl\_buffer\_Read()} \\

\ccode{\_Fetch}
  & Get a value in the object and return it in newly allocated space;
    the caller becomes responsible for the newly allocated space.
  & [standard]
  & \ccode{esl\_buffer\_FetchLine()} \\  

\ccode{\_Set}
  & Set a value in the object.
  & [standard]
  & \ccode{esl\_buffer\_Set()} \\

\ccode{\_Format}
  & Set a string in the object using \ccode{sprintf()}-like
    semantics.
  & [standard]
  & \ccode{esl\_msa\_FormatName()} \\



 \multicolumn{4}{c}{\bfseries Debugging}\\
\ccode{\_Validate}
  & Run validation tests on the internal state of an object.
  & [standard]
  & \ccode{esl\_tree\_Validate()} \\

\ccode{\_Compare}
  & Compare two objects to each other for equality (or close enough).
  & [standard]
  & \ccode{esl\_msa\_Compare()} \\

\ccode{\_Dump}
  & Dump a verbose, possibly ugly, but developer-readable output 
    of the internal state of an object.
  & [standard]
  & \ccode{esl\_keyhash\_Dump()} \\

\ccode{\_TestSample}
  & Sample a mostly syntactically correct object for test purposes
  & [standard]
  & \ccode{p7\_tophits\_TestSample()} \\



 \multicolumn{4}{c}{\bfseries Miscellaneous}\\

\ccode{\_Write}
  & Write something from an object to an output stream.
  & [standard]
  & \ccode{esl\_msa\_Write()} \\

\ccode{\_Encode}
  & Convert a user-readable string (such as ``fasta'') to an
    internal Easel code (such as \ccode{eslSQFILE\_FASTA}).
  & [standard]
  & \ccode{esl\_msa\_EncodeFormat()} \\

\ccode{\_Decode}
  & Convert an internal Easel code (such as \ccode{eslSQFILE\_FASTA}) 
    to a user-readable string (such as ``fasta'').
  & [standard]
  & \ccode{esl\_msa\_DecodeFormat()} \\
\end{tabular}
\end{minipage}
\caption{\textbf{Standard function ``interfaces''.} }
\end{table}


%%%%%%%%%%%%%%%%
\subsection{Creating and destroying new objects}
%%%%%%%%%%%%%%%%

Most Easel objects are allocated and free'd by a
\ccode{\_Create()/\_Destroy()} interface. Creating an object often
just means allocating space for it, so that some other routine can
fill data into it. It does not necessarily mean that the object
contains valid data.

\begin{sreapi}


\hypertarget{ifc:Create} 
{\item[\_Create(n)]}

A \ccode{\_Create()} interface takes any necessary initialization or
size information as arguments (there often aren't any), and it returns a
pointer to the newly allocated object. If an (optional) number of
elements \ccode{n} is provided, this specifies the number of elements
that the object is going to contain (for a fixed-size object) or the
initial allocation size (for a resizable object). In the event of an
allocation failure, a \ccode{\_Create} procedure throws \ccode{NULL}.

(If any error other than an allocation failure can happen, you should
use \ccode{\_Build()} instead. A caller is allowed to assume that a
\ccode{NULL} return from \ccode{\_Create()} is equivalent to
\ccode{eslEMEM}.)

The internals of some resizeable objects have an \ccode{nredline}
parameter that controls an additional memory management rule. These
objects are allowed to grow to arbitrary size (either by doubling with
\ccode{\_Grow} or by a specific allocation with \ccode{\_Reinit} or
\ccode{\_GrowTo}) -- but when the object is reused for new data, they
can be reallocated \emph{downward}, back to the redline
limit. Specifically, if the allocation size exceeds \ccode{nredline},
a \ccode{\_Reuse()} or \ccode{\_Reinit()} call will shrink the
allocation back to the \ccode{nredline} limit.  The idea is for a
frequently-reused object to be able to briefly handle a rare
exceptionally large problem, while not permanently committing the
resizeable object to an extreme allocation size.

At least one module (\ccode{esl\_tree}) allows for creating either a
fixed-size or a resizeable object; in this case, there is a
\ccode{\_CreateGrowable()} call for the resizeable version.




\hypertarget{ifc:Build} 
{\item[\_Build()]}

A \ccode{\_Build()} interface is the same as \ccode{\_Create()}, but
instead of returning a pointer to the new object, we return an Easel
error code, and the new object is returned through a \ccode{*ret\_obj}
argument.





\hypertarget{ifc:Destroy} 
{\item[\_Destroy(obj)]}
A \ccode{\_Destroy()} interface takes an object pointer as an
argument, and frees all the memory associated with it. A
\ccode{\_Destroy} procedure returns \ccode{void} (there is no useful
information to return about a failure; the only calls are to 
\ccode{free()} and if that fails, we're in trouble).
\end{sreapi}

For example:
\begin{cchunk}
   ESL_SQ *sq;
   sq = esl_sq_Create();
   esl_sq_Destroy(sq);
\end{cchunk}
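
To make the division of labor among the three interfaces concrete,
here is a stand-alone sketch for a hypothetical \ccode{EXAMPLE\_VEC}
object. The type, the function names, and the \ccode{ex*} status codes
are assumptions for illustration. Letting \ccode{\_Destroy()} accept a
\ccode{NULL} or partially built object is also an assumption of this
sketch; it keeps the failure path in \ccode{\_Create()} short.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical status codes, standing in for Easel's return codes. */
enum { exOK = 0, exEMEM = 1, exEINVAL = 2 };

typedef struct {
  double *x;
  int     n;
} EXAMPLE_VEC;

/* _Destroy(): returns void; tolerates NULL and half-built objects. */
void
example_vec_Destroy(EXAMPLE_VEC *v)
{
  if (v == NULL) return;
  free(v->x);
  free(v);
}

/* _Create(): the only possible failure is allocation, reported as NULL. */
EXAMPLE_VEC *
example_vec_Create(int n)
{
  EXAMPLE_VEC *v;

  assert(n > 0);
  if ((v = malloc(sizeof(EXAMPLE_VEC))) == NULL) return NULL;
  v->x = NULL;
  v->n = n;
  if ((v->x = malloc(sizeof(double) * (size_t) n)) == NULL)
    {
      example_vec_Destroy(v);
      return NULL;
    }
  return v;        /* allocated, but contents are not yet valid data */
}

/* _Build(): other failures are possible, so return a status code and
 * hand the object back through ret_v. */
int
example_vec_Build(int n, EXAMPLE_VEC **ret_v)
{
  *ret_v = NULL;
  if (n <= 0) return exEINVAL;
  if ((*ret_v = example_vec_Create(n)) == NULL) return exEMEM;
  return exOK;
}
```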




%%%%%%%%%%%%%%%%
  \subsubsection{opening and closing input streams}
%%%%%%%%%%%%%%%%

Some objects (such as \ccode{ESL\_SQFILE} and \ccode{ESL\_MSAFILE})
correspond to open input streams -- usually an open file, but possibly
reading from a pipe. Such objects are \ccode{\_Open()}'ed and
\ccode{\_Close()}'d, not created and destroyed.

Input stream objects have to be capable of handling normal failures,
because of bad user input. Input stream objects contain an
\ccode{errbuf[eslERRBUFSIZE]} field to capture informative parse error
messages. 

\begin{sreapi}
\hypertarget{ifc:Open} 
{\item[\_Open(file, formatcode, \&ret\_obj)]}

Opens \ccode{file} for reading, where \ccode{formatcode} indicates its
format, and returns the open input object in
\ccode{ret\_obj}. A \ccode{formatcode} of 0 typically means unknown,
in which case the \ccode{\_Open()} procedure attempts to autodetect
the format. If the \ccode{file} is \ccode{"-"}, the object is
configured to read from the \ccode{stdin} stream instead of opening a
file. If the \ccode{file} ends in a \ccode{.gz} suffix, the object is
configured to read from a pipe from \ccode{gzip -dc}. Returns
\ccode{eslENOTFOUND} if \ccode{file} cannot be opened, and
\ccode{eslEFORMAT} if autodetection is attempted but the format cannot
be determined. 

Newer \ccode{\_Open} procedures return a standard Easel error code,
and on a normal error they also return the allocated object, using the
object's error message buffer to report the reason for the failed
open.

\hypertarget{ifc:Close} 
{\item[\_Close(obj)]}

Closes the input stream \ccode{obj}. Should return a standard Easel
error code. There are cases where an error in an input stream is only
detected at closing time (inputs using \ccode{popen()}/\ccode{pclose()}
  are an example).
\end{sreapi}

For example:
\begin{cchunk}
    char        *seqfile = "foo.fa";
    ESL_SQFILE  *sqfp;

    esl_sqfile_Open(seqfile, eslSQFILE_FASTA, NULL, &sqfp);
    esl_sqfile_Close(sqfp);
\end{cchunk}


%%%%
  \subsubsection{making copies of objects}
%%%%

\begin{sreapi}

\hypertarget{ifc:Clone}
{\item[\_Clone(obj)]}

Creates and returns a pointer to a duplicate of \ccode{obj}.
Equivalent to (and is a shortcut for, and is generally implemented as)
\ccode{dest = \_Create(); \_Copy(src, dest)}. Caller is responsible
for free'ing the duplicate object, just as if it had been
\ccode{\_Create}'d. Throws \ccode{NULL} if allocation fails.


\hypertarget{ifc:Copy}
{\item[\_Copy(src, dest)]}

Copies \ccode{src} object into \ccode{dest}, where the caller has
already created an appropriately allocated and empty \ccode{dest}
object (or buffer, or whatever). Returns \ccode{eslOK} on success;
throws \ccode{eslEINCOMPAT} if the objects are not compatible (for
example, two matrices that are not the same size).

Note that the order of the arguments is always \ccode{src}
$\rightarrow$ \ccode{dest} (unlike the C library's \ccode{strcpy()}
convention, which is the opposite order).


\hypertarget{ifc:Shadow}
{\item[\_Shadow(obj)]}

Creates and returns a pointer to a partial, dependent copy of
\ccode{obj}. Shadow creation arises in multithreading, when threads
can share some but not all internal object data. A shadow keeps
constant data as pointers to the original object.  The object needs to
know whether it is a shadow or not, so that \ccode{\_Destroy()} works
properly on both the original and its shadows.

\end{sreapi}
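
The relationship between \ccode{\_Clone()} and \ccode{\_Copy()} can be
sketched as follows for a hypothetical \ccode{EXAMPLE\_VEC} object.
All names here are illustrative assumptions. For brevity the sketch
returns \ccode{exEINCOMPAT} as an ordinary status, where Easel would
throw the corresponding exception.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes, standing in for Easel's return codes. */
enum { exOK = 0, exEINCOMPAT = 1 };

typedef struct { double *x; int n; } EXAMPLE_VEC;

EXAMPLE_VEC *
example_vec_Create(int n)
{
  EXAMPLE_VEC *v = malloc(sizeof(EXAMPLE_VEC));
  if (v == NULL) return NULL;
  v->n = n;
  if ((v->x = calloc((size_t) n, sizeof(double))) == NULL) { free(v); return NULL; }
  return v;
}

void
example_vec_Destroy(EXAMPLE_VEC *v)
{
  if (v) { free(v->x); free(v); }
}

/* _Copy(): argument order is src -> dest, the reverse of strcpy().
 * dest must already be allocated, and compatible with src. */
int
example_vec_Copy(const EXAMPLE_VEC *src, EXAMPLE_VEC *dest)
{
  assert(src != NULL && dest != NULL);
  if (src->n != dest->n) return exEINCOMPAT;
  memcpy(dest->x, src->x, sizeof(double) * (size_t) src->n);
  return exOK;
}

/* _Clone(): a _Create() followed by a _Copy(); NULL on failure. */
EXAMPLE_VEC *
example_vec_Clone(const EXAMPLE_VEC *src)
{
  EXAMPLE_VEC *dest = example_vec_Create(src->n);
  if (dest == NULL) return NULL;
  if (example_vec_Copy(src, dest) != exOK) { example_vec_Destroy(dest); return NULL; }
  return dest;     /* caller frees with example_vec_Destroy() */
}
```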

%%%%%%%%%%%%%%%%
  \subsection{Managing memory allocation}
%%%%%%%%%%%%%%%%

%%%%
  \subsubsection{resizable objects}
%%%%

Some objects need to be reallocated and expanded during their use.
These objects are called \esldef{resizable}.

In some cases, the whole purpose of the object is to have elements
added to it, such as \ccode{ESL\_STACK} (pushdown stacks) and
\ccode{ESL\_HISTOGRAM} (histograms). In these cases, the normal
\ccode{\_Create()} interface performs an initial allocation, and the
object keeps track of both its current contents size (often
\ccode{obj->N}) and the current allocation size (often
\ccode{obj->nalloc}). 

In at least one case, an object might be either growable or not,
depending on how it's being used. This happens, for instance, when we
have routines for parsing input data to create a new object, and we
need to dynamically reallocate as we go because the input doesn't tell
us the total size when we start. For instance, with \ccode{ESL\_TREE}
(phylogenetic trees), sometimes we know exactly the size of the tree
we need to create (because we're making a tree ourselves), and
sometimes we need to create a resizable object (because we're reading a
tree from a file). In these cases, the normal \ccode{\_Create()}
interface creates a static, nongrowable object of known size, and a
\ccode{\_CreateGrowable()} interface specifies an initial allocation
for a resizable object.

Easel usually handles its own reallocation of resizable objects. For
instance, many resizable objects have an interface called something
like \ccode{\_Add()} or \ccode{\_Push()} for storing the next element
in the object, and this interface will deal with increasing allocation
size as needed.  In a few cases, a public \ccode{\_Grow()} interface
is provided for reallocating an object to a larger size, in cases
where a caller might need to grow the object itself. \ccode{\_Grow()}
only increases an allocation when it is necessary, and it makes that
check immediately and efficiently, so that a caller can call
\ccode{\_Grow()} before every attempt to add a new element without
worrying about efficiency. An example of where a public
\ccode{\_Grow()} interface is generally provided is when an object
might be input from different file formats, and an application may
need to create its own parser. Although creating an input parser
requires familiarity with the Easel object's internal data structures,
at least the \ccode{\_Grow()} interface frees the caller from having
to understand its memory management.

Resizable objects necessarily waste some memory, because they are
overallocated in order to reduce the number of calls to
\ccode{malloc()}.  The wastage is bounded (to a maximum of two-fold,
for the default doubling strategies, once an object has exceeded its
initial allocation size) but nonetheless may not always be tolerable.

In summary: 

\begin{sreapi}
\hypertarget{ifc:Grow}
{\item[\_Grow(obj)]}

A \ccode{\_Grow()} function checks to see if \ccode{obj} can hold
another element. If not, it increases the allocation, according to
internally stored rules on reallocation strategy (usually, by
doubling). 
\end{sreapi}

\begin{sreapi}
\hypertarget{ifc:GrowTo}
{\item[\_GrowTo(obj, n)]}

A \ccode{\_GrowTo()} function checks to see if \ccode{obj} is large
enough to hold \ccode{n} elements. If not, it reallocates to at least
that size.
\end{sreapi}
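
A minimal sketch of the two interfaces, and of a \ccode{\_Push()} that
relies on them, might look like the following. The
\ccode{EXAMPLE\_STACK} type, the initial allocation of 4, and the
\ccode{ex*} status codes are assumptions made for the example; this is
not the \ccode{ESL\_STACK} implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical status codes, standing in for Easel's return codes. */
enum { exOK = 0, exEMEM = 1 };

typedef struct {
  int *data;
  int  N;         /* number of elements currently stored */
  int  nalloc;    /* number of elements allocated        */
} EXAMPLE_STACK;

EXAMPLE_STACK *
example_stack_Create(void)
{
  EXAMPLE_STACK *s = malloc(sizeof(EXAMPLE_STACK));
  if (s == NULL) return NULL;
  s->N      = 0;
  s->nalloc = 4;
  if ((s->data = malloc(sizeof(int) * (size_t) s->nalloc)) == NULL) { free(s); return NULL; }
  return s;
}

void
example_stack_Destroy(EXAMPLE_STACK *s)
{
  if (s) { free(s->data); free(s); }
}

/* _GrowTo(): make sure there is room for at least n elements. */
int
example_stack_GrowTo(EXAMPLE_STACK *s, int n)
{
  int *tmp;

  assert(s != NULL);
  if (n <= s->nalloc) return exOK;               /* already big enough */
  if ((tmp = realloc(s->data, sizeof(int) * (size_t) n)) == NULL) return exEMEM;
  s->data   = tmp;
  s->nalloc = n;
  return exOK;
}

/* _Grow(): cheap check first; reallocate (by doubling) only when full. */
int
example_stack_Grow(EXAMPLE_STACK *s)
{
  if (s->N < s->nalloc) return exOK;
  return example_stack_GrowTo(s, s->nalloc * 2);
}

/* _Push(): the caller never manages the allocation itself. */
int
example_stack_Push(EXAMPLE_STACK *s, int x)
{
  int status;
  if ((status = example_stack_Grow(s)) != exOK) return status;
  s->data[s->N++] = x;
  return exOK;
}
```

Because \ccode{example\_stack\_Grow()} returns immediately when there
is room, calling it before every insertion costs one comparison in the
common case.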

%%%%
  \subsubsection{reusable objects}
%%%%

Memory allocation is computationally expensive. An application needs
to minimize \ccode{malloc()/free()} calls in performance-critical
regions. In loops where one \ccode{\_Destroy()}'s an old object only
to \ccode{\_Create()} the next one, such as a sequential input loop
that processes objects from a file one at a time, one generally wants
to \ccode{\_Reuse()} the same object instead:

\begin{sreapi}
\hypertarget{ifc:Reuse}
{\item[\_Reuse(obj)]}

A \ccode{\_Reuse()} interface takes an existing object and
reinitializes it as a new object, while reusing as much memory as
possible. Any state information that was specific to the problem the
object was just used for is reinitialized. Any allocations and state
information specific to those allocations are preserved (to the extent
possible).  A \ccode{\_Reuse()} call should exactly replace (and be
equivalent to) a \ccode{\_Destroy()/\_Create()} pair. If the object is
growable, it typically would keep the last allocation size, and it
must keep at least the same allocation size that a default
\ccode{\_Create()} call would give.

If the object is arbitrarily resizeable and it has a \ccode{nredline}
control on its memory, the allocation is shrunk back to
\ccode{nredline} (which must be at least the default initial
allocation).

\end{sreapi}

For example:

\begin{cchunk}
   ESL_SQFILE *sqfp;
   ESL_SQ     *sq;

   esl_sqfile_Open("foo.fa", eslSQFILE_FASTA, NULL, &sqfp);
   sq = esl_sq_Create();
   while (esl_sqio_Read(sqfp, sq) == eslOK)
    {
       /* do stuff with this sq */
       esl_sq_Reuse(sq);
    }
   esl_sq_Destroy(sq);
\end{cchunk}
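
On the implementation side, a \ccode{\_Reuse()} with a redline might
be sketched as below. \ccode{EXAMPLE\_BUF}, its field names other than
\ccode{nredline}, and the specific sizes (16 and 64) are assumptions
chosen for the example, not taken from any Easel object.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical status codes, standing in for Easel's return codes. */
enum { exOK = 0, exEMEM = 1 };

typedef struct {
  char *buf;
  int   n;          /* bytes in use                               */
  int   nalloc;     /* bytes allocated                            */
  int   nredline;   /* largest allocation that _Reuse() will keep */
} EXAMPLE_BUF;

EXAMPLE_BUF *
example_buf_Create(void)
{
  EXAMPLE_BUF *b = malloc(sizeof(EXAMPLE_BUF));
  if (b == NULL) return NULL;
  b->n        = 0;
  b->nalloc   = 16;   /* default initial allocation              */
  b->nredline = 64;   /* must be at least the initial allocation */
  if ((b->buf = malloc((size_t) b->nalloc)) == NULL) { free(b); return NULL; }
  return b;
}

void
example_buf_Destroy(EXAMPLE_BUF *b)
{
  if (b) { free(b->buf); free(b); }
}

int
example_buf_GrowTo(EXAMPLE_BUF *b, int n)
{
  char *tmp;
  if (n <= b->nalloc) return exOK;
  if ((tmp = realloc(b->buf, (size_t) n)) == NULL) return exEMEM;
  b->buf    = tmp;
  b->nalloc = n;
  return exOK;
}

/* _Reuse(): reset problem-specific state, keep the allocation, but
 * pull an oversized allocation back down to the redline. */
int
example_buf_Reuse(EXAMPLE_BUF *b)
{
  char *tmp;

  assert(b != NULL);
  b->n = 0;
  if (b->nalloc > b->nredline)
    {
      if ((tmp = realloc(b->buf, (size_t) b->nredline)) == NULL) return exEMEM;
      b->buf    = tmp;
      b->nalloc = b->nredline;
    }
  return exOK;
}
```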

%%%%
  \subsubsection{other}
%%%%
\begin{sreapi}
\hypertarget{ifc:Sizeof}
{\item[size\_t \_Sizeof(obj)]}

Returns the total size of an object and its allocations, in bytes.
\end{sreapi}


%%%%%%%%%%%%%%%%
 \subsection{Accessing information in objects}
%%%%%%%%%%%%%%%%

\begin{sreapi}

\hypertarget{ifc:Is}
{\item[\_Is*(obj)]}

Performs some specific test of the internal state of an
object, and returns \ccode{TRUE} or \ccode{FALSE}.

\hypertarget{ifc:Get}
{\item[value = \_Get*(obj, ...)]}

Retrieves some specified data from \ccode{obj} and returns it
directly. Because no error code can be returned, a \ccode{\_Get}
call must be a simple access call within the object, guaranteed to
succeed. \ccode{\_Get()} methods may often be implemented as macros.
(\ccode{\_Read} or \ccode{\_Fetch} interfaces are for more complex
access methods that might fail, and require an error code return.)

\hypertarget{ifc:Read}
{\item[\_Read*(obj, ..., \&ret\_value)]}

Retrieves some specified data from \ccode{obj} and puts it in
\ccode{ret\_value}, where caller has provided (and already allocated,
if needed) the space for \ccode{ret\_value}.

\hypertarget{ifc:Fetch}
{\item[\_Fetch*(obj, ..., \&ret\_value)]}

Retrieves some specified data from \ccode{obj} and puts it in
\ccode{ret\_value}, where space for the returned value is allocated by
the function. Caller becomes responsible for free'ing that space.

\hypertarget{ifc:Set}
{\item[\_Set*(obj, value)]}

Sets some value(s) in \ccode{obj} to \ccode{value}. If a value was
already set, it is replaced with the new one. If any memory needs to
be reallocated or free'd, this is done. \ccode{\_Set} functions have
some appropriate longer name, like \ccode{\_SetZero()} (set something
in an object to zero(s)), or \ccode{esl\_dmatrix\_SetIdentity()} (set
a dmatrix to an identity matrix).

\hypertarget{ifc:Format}
{\item[\_Format*(obj, fmtstring, ...)]}

Like \ccode{\_Set}, but with \ccode{sprintf()}-style semantics.  Sets
some string value in \ccode{obj} according to the
\ccode{sprintf()}-style \ccode{fmtstring} and any subsequent
\ccode{sprintf()}-style arguments. If a value was already set, it is
replaced with the new one. If any memory needs to be reallocated or
free'd, this is done.  \ccode{\_Format} functions have some
appropriate longer name, like
\ccode{esl\_msa\_FormatSeqDescription()}.

Because \ccode{fmtstring} is a \ccode{printf()}-style format string,
it must not contain '\%' characters other than the conversion
specifications the caller intends. \ccode{\_Format*} functions
should only be used with format strings set by a program; they should
not be used to copy user input that might contain '\%' characters.
\end{sreapi}
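
The differences among \ccode{\_Get}, \ccode{\_Fetch}, and
\ccode{\_Format} can be sketched on a hypothetical
\ccode{EXAMPLE\_REC} object holding one string. The names and status
codes are assumptions for illustration. The \ccode{\_Format} sketch
uses the standard two-pass \ccode{vsnprintf()} technique (measure,
then write), which is one possible implementation and not necessarily
what Easel does.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes, standing in for Easel's return codes. */
enum { exOK = 0, exEMEM = 1, exEINVAL = 2 };

typedef struct { char *name; } EXAMPLE_REC;

/* _Get*(): simple access that cannot fail; the object keeps ownership. */
const char *
example_rec_GetName(const EXAMPLE_REC *rec)
{
  return rec->name;
}

/* _Fetch*(): returns a newly allocated copy; the caller must free() it. */
int
example_rec_FetchName(const EXAMPLE_REC *rec, char **ret_name)
{
  size_t n;

  *ret_name = NULL;
  if (rec->name == NULL) return exEINVAL;
  n = strlen(rec->name) + 1;
  if ((*ret_name = malloc(n)) == NULL) return exEMEM;
  memcpy(*ret_name, rec->name, n);
  return exOK;
}

/* _Format*(): replace rec->name using printf()-style arguments. */
int
example_rec_FormatName(EXAMPLE_REC *rec, const char *fmt, ...)
{
  va_list ap;
  char   *s;
  int     n;

  assert(rec != NULL && fmt != NULL);

  va_start(ap, fmt);
  n = vsnprintf(NULL, 0, fmt, ap);           /* first pass: measure only */
  va_end(ap);
  if (n < 0) return exEINVAL;

  if ((s = malloc((size_t) n + 1)) == NULL) return exEMEM;

  va_start(ap, fmt);
  vsnprintf(s, (size_t) n + 1, fmt, ap);     /* second pass: write */
  va_end(ap);

  free(rec->name);                           /* any previous value is replaced */
  rec->name = s;
  return exOK;
}
```

Text that may contain '\%' characters, such as user input, can be
stored safely by passing it as an argument to a fixed \ccode{"\%s"}
format, as in \ccode{example\_rec\_FormatName(rec, "\%s", userstring)},
rather than as the format string itself.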


%%%%%%%%%%%%%%%%
\subsection{Debugging, testing, development}
%%%%%%%%%%%%%%%%

\begin{sreapi}
\hypertarget{ifc:Validate}
{\item[\_Validate*(obj, errbuf...)]}

Checks that the internals of \ccode{obj} are all right. Returns
\ccode{eslOK} if they are, and returns \ccode{eslFAIL} if they
aren't. Additionally, if the caller provides a non-\ccode{NULL}
message buffer \ccode{errbuf}, on failure, an informative message
describing the reason for the failure is formatted and left in
\ccode{errbuf}. If the caller provides this message buffer, it must
allocate it for at least \ccode{eslERRBUFSIZE} characters.

Failures in \ccode{\_Validate()} routines are handled by
\ccode{ESL\_FAIL()} (or \ccode{ESL\_XFAIL()}, if the validation
routine needs to do any memory cleanup).  Validation failures are
classified as normal (returned) errors so that \ccode{\_Validate()}
routines can be used in production code -- for example, to validate
user input.

At the same time, because the \ccode{ESL\_FAIL()} and
\ccode{ESL\_XFAIL()} macros call the stub \ccode{esl\_fail()}, you can
set a debugging breakpoint on \ccode{esl\_fail} to make a
\ccode{\_Validate()} routine stop immediately at whichever test
failed. 

The \ccode{errbuf} message therefore can be coarse-grained
(``validation of object X failed'') or fine-grained (``in object X,
data element Y fails test Z''). A validation of user input (which we
expect to fail often) should be fine-grained, to return maximally
useful information about what the user did wrong. A validation of
internal data can be very coarse-grained, knowing that a developer can
simply set a breakpoint in \ccode{esl\_fail()} to get at exactly where
a validation failed.

A \ccode{\_Validate()} function is not intended to test all possible
invalid states of an object, even if that were feasible. Rather, the
goal is to automatically catch future problems we've already seen in
past debugging and testing. So a \ccode{\_Validate()} function is a
place to systematically organize a set of checks that essentially
amount to regression tests against past debugging/testing efforts.

\hypertarget{ifc:Compare}
{\item[\_Compare*(obj1, obj2...)]}

Compares \ccode{obj1} to \ccode{obj2}. Returns \ccode{eslOK} if the
contents are judged to be identical, and \ccode{eslFAIL} if they
differ. When the comparison involves floating point scalar
comparisons, a fractional tolerance argument \ccode{tol} is also
passed. 

Failures in \ccode{\_Compare()} functions are handled by
\ccode{ESL\_FAIL()} (or \ccode{ESL\_XFAIL()}, if the comparison
routine needs to do any memory cleanup), because they may be used in a
context where a ``failure'' is expected; for example, when using
\ccode{esl\_dmatrix\_Compare()} as a test for successful convergence
of a matrix algebra routine. 

However, the main use of \ccode{\_Compare()} functions is in unit
tests. During debugging and development, we want to see exactly where
a comparison failed, and we don't want to have to write a bunch of
laboriously informative error messages to get that information.
Instead we can exploit the fact that the \ccode{ESL\_FAIL()} and
\ccode{ESL\_XFAIL()} macros call the stub \ccode{esl\_fail()}; you can
set a debugging breakpoint in \ccode{esl\_fail()} to stop execution in
the failure macros.

\hypertarget{ifc:Dump}
{\item[\_Dump*(FILE *fp, obj...)]}

Prints the internals of an object in human-readable, easily parsable
tabular ASCII form. Useful during debugging and development to view
the entire object at a glance. Returns \ccode{eslOK} on success.
Unlike a more robust \ccode{\_Write()} call, a \ccode{\_Dump()} call may
assume that all its writes will succeed, and does not need to check
return status of \ccode{fprintf()} or other system calls, because it
is not intended for production use.


\hypertarget{ifc:TestSample}
{\item[\_TestSample(ESL\_RANDOMNESS *rng, ..., OBJTYPE **ret\_obj)]}

Create an object filled with randomly sampled values for all data
elements. The aim is to exercise valid values and ranges, and
presence/absence of optional information and allocations, but not to
obsess about internal semantic consistency. For example, we use
\ccode{\_TestSample()} calls in testing MPI send/receive
communications routines, where we don't care so much about the meaning
of the object's contents, as we do about faithful transmission of any
object with valid contents. 

A \ccode{\_TestSample()} call produces an object that is sufficiently
valid for other debugging tools, including \ccode{\_Dump()},
\ccode{\_Compare()}, and \ccode{\_Validate()}. However, because
elements may be randomly sampled independently, in ways that don't
respect interdependencies, the object may contain data inconsistencies
that make the object invalid for other purposes.  Contrast
\ccode{\_Sample()} routines, which generate fully valid objects for
all purposes, but which may not exercise the object's fields as
thoroughly.

\end{sreapi}
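
As a stand-alone sketch of the \ccode{\_Validate()} and
\ccode{\_Compare()} conventions, consider a probability vector. The
function names, \ccode{exERRBUFSIZE}, the \ccode{ex*} status codes,
and the stub \ccode{example\_fail()} (playing the role that
\ccode{esl\_fail()} plays for the Easel macros) are all hypothetical.
The particular definition of fractional tolerance used in the
comparison is also an assumption of this sketch.

```c
#include <assert.h>
#include <stdio.h>

#define exERRBUFSIZE 128               /* hypothetical, cf. eslERRBUFSIZE */
enum { exOK = 0, exFAIL = 1 };         /* hypothetical status codes       */

/* Every normal failure passes through here, so one debugger
 * breakpoint stops at whichever check failed. */
static int example_fail(void) { return exFAIL; }

static double absd(double x)           { return x < 0.0 ? -x : x; }
static double maxd(double a, double b) { return a > b ? a : b; }

/* _Validate(): returns exOK or exFAIL; errbuf is optional (may be NULL). */
int
example_prob_Validate(const double *p, int n, double tol, char *errbuf)
{
  double sum = 0.0;
  int    i;

  assert(p != NULL);
  for (i = 0; i < n; i++)
    {
      if (!(p[i] >= 0.0 && p[i] <= 1.0))      /* written so NaN also fails */
        {
          if (errbuf) snprintf(errbuf, exERRBUFSIZE, "p[%d] = %g is not a probability", i, p[i]);
          return example_fail();
        }
      sum += p[i];
    }
  if (absd(sum - 1.0) > tol)
    {
      if (errbuf) snprintf(errbuf, exERRBUFSIZE, "probabilities sum to %g, not 1", sum);
      return example_fail();
    }
  if (errbuf) errbuf[0] = '\0';
  return exOK;
}

/* _Compare(): elementwise comparison with a fractional tolerance. */
int
example_prob_Compare(const double *p, const double *q, int n, double tol)
{
  int i;

  for (i = 0; i < n; i++)
    if (absd(p[i] - q[i]) > tol * maxd(absd(p[i]), absd(q[i])))
      return example_fail();
  return exOK;
}
```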

%%%%%%%%%%%%%%%%
\subsection{Miscellaneous other interfaces}
%%%%%%%%%%%%%%%%

\begin{sreapi}
\hypertarget{ifc:Write}
{\item[\_Write(fp, obj)]}
Writes something from an object to an output stream \ccode{fp}. Used
for exporting and saving files in official data exchange formats.
\ccode{\_Write()} functions must be robust to system write errors,
such as filling or unexpectedly disconnecting a disk. They must check
return status of all system calls, and throw an \ccode{eslEWRITE}
error on any failures.




\hypertarget{ifc:Encode}
{\item[code = \_Encode*(char *s)]}

Given a string \ccode{s}, match it case-insensitively against a list
of possible string values and convert this visible representation to
its internal \ccode{\#define} or \ccode{enum} code. For example,
\ccode{esl\_sqio\_EncodeFormat("fasta")} returns
\ccode{eslSQFILE\_FASTA}. If the string is not recognized, returns a
code signifying ``unknown''. This needs to be a normal return (not a
thrown error) because the string might come from user input, and might
be invalid.


\hypertarget{ifc:Decode}
{\item[char *s = \_Decode*(int code)]}

Given an internal code (an \ccode{enum} or \ccode{\#define} constant),
return a pointer to an informative string value, for diagnostics and
other output. The string is static. If the code is not recognized,
throws an \ccode{eslEINVAL} exception and returns \ccode{NULL}.

\end{sreapi}
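
An \ccode{\_Encode*()}/\ccode{\_Decode*()} pair is often just two
lookups in one shared table. The sketch below uses hypothetical codes
(\ccode{exFMT\_*}) and function names. Unlike the Easel convention, its
decode function only returns \ccode{NULL} for an unrecognized code and
does not throw an exception, to keep the example self-contained.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical internal codes; 0 is reserved for "unknown". */
enum { exFMT_UNKNOWN = 0, exFMT_FASTA = 1, exFMT_EMBL = 2 };

/* Read-only table shared by both directions of the conversion. */
static const struct { const char *name; int code; } example_formats[] = {
  { "fasta", exFMT_FASTA },
  { "embl",  exFMT_EMBL  },
};
#define EXAMPLE_NFORMATS (sizeof(example_formats) / sizeof(example_formats[0]))

/* Case-insensitive equality of two whole strings. */
static int
same_nocase(const char *a, const char *b)
{
  for (; *a != '\0' && *b != '\0'; a++, b++)
    if (tolower((unsigned char) *a) != tolower((unsigned char) *b)) return 0;
  return (*a == '\0' && *b == '\0');
}

/* _Encode*(): string -> code. An unrecognized string is a normal
 * return, because s may be unchecked user input. */
int
example_EncodeFormat(const char *s)
{
  size_t i;

  assert(s != NULL);
  for (i = 0; i < EXAMPLE_NFORMATS; i++)
    if (same_nocase(s, example_formats[i].name)) return example_formats[i].code;
  return exFMT_UNKNOWN;
}

/* _Decode*(): code -> pointer to a static string; do not free() it. */
const char *
example_DecodeFormat(int code)
{
  size_t i;

  for (i = 0; i < EXAMPLE_NFORMATS; i++)
    if (example_formats[i].code == code) return example_formats[i].name;
  return NULL;
}
```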






%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Writing unit tests}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

An Easel test driver runs a set of individual unit tests one after
another.  Sometimes there is one unit test assigned to each exposed
function in the API. Sometimes, it makes sense to test several exposed
functions in a single unit test function.

A unit test for \ccode{esl\_foo\_Baz()} is named \ccode{static void
utest\_Baz()}. 

Upon success, unit tests return void.

Upon any failure, a unit test calls \ccode{esl\_fatal()} with an error
message, and terminates. It should not use any other error-catching
mechanism. It aids debugging if the test program terminates
immediately, using a single function that we can easily breakpoint at
(\ccode{break esl\_fatal} in GDB). It must not use \ccode{abort()},
for example, because this will screw up the output of scripts running
automated tests in \ccode{make check} and \ccode{make dcheck}, such as
\emcode{sqc}. \emcode{sqc} traps \ccode{stderr} from
\ccode{esl\_fatal()} correctly. A unit test must not use
\ccode{exit(1)} either, because that leaves no error message, so
someone running a test program on the command line can't easily tell
that it failed.

Unit tests should attempt to deliberately generate exceptions and
failures, and test that the appropriate error code is returned.  Unit
tests must temporarily register a nonfatal error handler when testing
exceptions. 

Every function, procedure, and macro in the exposed API shall be
tested by one or more unit tests. The unit tests aim for complete code
coverage. This is measured by code coverage tests using \ccode{gcov}.



%%%%%%%%%%%%%%%%
\subsection{Dealing with expected stochastic failures in unit tests}
%%%%%%%%%%%%%%%%

Many unit tests are based on statistical samples and/or random number
generation.  For example, we test a maximum likelihood parameter
fitting routine by fitting to samples generated with known parameters,
and testing that the estimated parameters are close enough to the true
parameters.  The trouble is defining ``close enough''. There may be a
small but finite probability that such a test will fail. I call these
``stochastic failures''.  We don't want tests to fail due to expected
statistical deviations, but neither do we want to set p-values so
loose that a flaw escapes notice.

Current Easel strategy is to have such unit tests reinitialize the RNG
to a predetermined fixed seed known to work. Optionally, the test can
be made to use the RNG without reinitialization (therefore allowing
stochastic failures to occur), with a \ccode{-x} option to the test
driver. 
% example: esl_mixdchlet

In the test driver, these unit tests need to be run last; unit tests
that don't have a stochastic failure mode are run first. This is so
the \ccode{-s <seed>} option for setting the RNG seed takes effect
properly. (Otherwise, having a unit test reset the RNG seed would
override the \ccode{-s <seed>} setting.)

The default for \ccode{<seed>} should be 0 (which selects an arbitrary
seed), so that all other tests are randomized from run to run.
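
A minimal sketch of such a unit test follows. It is an illustration
under stated assumptions, not Easel code: the tiny generator
(\ccode{rng\_seed()}, \ccode{rng\_uniform()}), the test
\ccode{utest\_sample\_mean()}, and its \ccode{allow\_badluck} flag
(playing the role of \ccode{-x}) are hypothetical names, the seed 42
is an arbitrary choice, and the tolerance is deliberately loose so the
sketch itself is not at risk of failing.

\begin{cchunk}
#include <stdint.h>

/* Minimal 64-bit linear congruential generator (Knuth's MMIX
 * constants); a stand-in for a real RNG. */
static uint64_t rng_state = 0;

static void
rng_seed(uint64_t seed)
{
  rng_state = seed;
}

static double
rng_uniform(void)
{
  rng_state = rng_state * 6364136223846793005ULL + 1442695040888963407ULL;
  return (double) (rng_state >> 11) / 9007199254740992.0;  /* top 53 bits -> [0,1) */
}

/* Test that the mean of <n> uniform deviates is "close enough" to 0.5.
 * Unless <allow_badluck> is set, reseed to a fixed value first, so the
 * outcome is reproducible from run to run.
 * Returns 0 on success, 1 on failure. */
int
utest_sample_mean(int allow_badluck)
{
  int    n   = 100000;
  double sum = 0.;
  double mean, diff;
  int    i;

  if (! allow_badluck) rng_seed(42);   /* fixed seed: overrides any earlier seeding */

  for (i = 0; i < n; i++) sum += rng_uniform();
  mean = sum / n;
  diff = (mean > 0.5) ? mean - 0.5 : 0.5 - mean;
  return (diff < 0.01) ? 0 : 1;
}
\end{cchunk}

Because the test reseeds the generator, a driver built this way would
call it after all tests that rely on the user's seed.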

In some older Easel code, fixed RNG seeds are used for tests that can
stochastically fail. The newer approach is preferable because it gives
more fine-grained control: only some unit tests need to deal with
stochastic failure, not all of them.

%%%%%%%%%%%%%%%%
\subsection{Using temporary files in unit tests}
%%%%%%%%%%%%%%%%

If a unit test or testdriver needs to create a named temporary file
(to test i/o), the tmpfile is created with
\ccode{esl\_tmpfile\_named()}:

\begin{cchunk}
   char  tmpfile[16] = "esltmpXXXXXX";
   FILE *fp;

   if (esl_tmpfile_named(tmpfile, &fp) != eslOK) esl_fatal("failed to create tmpfile");
   write_stuff_to(fp);
   fclose(fp);

   if ((fp = fopen(tmpfile, "r")) == NULL) esl_fatal("failed to open tmpfile");
   read_stuff_from(fp);
   fclose(fp);

   remove(tmpfile);
\end{cchunk}

Thus tmp files created by Easel's test suite have a common naming
convention, and are put in the current working directory. On a test
failure, the tmp file remains, to assist debugging; on a test success,
the tmp file is removed. The \ccode{make clean} targets in the
Makefiles remove files matching the pattern \ccode{esltmp??????}.

It is important to declare the file name as \ccode{char tmpfile[16]}
rather than \ccode{char *tmpfile}, because the \ccode{XXXXXX} template
is overwritten in place. Compilers are allowed to treat the string in a
\ccode{char *foo = "bar"} initialization as a read-only constant.
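
To see why the array form matters, here is a small sketch that
presumes the template is filled in the way POSIX \ccode{mkstemp()}
does it, by overwriting the trailing \ccode{XXXXXX} in the caller's
buffer. The helper \ccode{fill\_template()} is hypothetical and only
imitates that behavior; it does not create a file.

\begin{cchunk}
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a mkstemp()-style routine: overwrites the
 * trailing "XXXXXX" of <tmpl> in place with a six-digit number.
 * Returns 0 on success, -1 if <tmpl> does not end in "XXXXXX". */
int
fill_template(char *tmpl, unsigned int id)
{
  size_t n;
  char   digits[7];

  assert(tmpl != NULL);
  n = strlen(tmpl);
  if (n < 6 || strcmp(tmpl + n - 6, "XXXXXX") != 0) return -1;

  snprintf(digits, sizeof(digits), "%06u", id % 1000000u);
  memcpy(tmpl + n - 6, digits, 6);   /* writes into the caller's storage */
  return 0;
}
\end{cchunk}

Called on \ccode{char tmpfile[16] = "esltmpXXXXXX"}, the write lands
in the caller's own array. Called on a \ccode{char *} that points at a
string literal, the same write is undefined behavior.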





%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Easel development environment; using development tools}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Easel is developed primarily on GNU/Linux and Mac OS/X systems with
the following tools installed:

\begin{tabular}{ll}
{\bfseries Tool}  & {\bfseries Use} \\
\emcode{emacs}    &  editor   \\
\emcode{gcc}      &  GNU compiler \\
\emcode{icc}      &  Intel compiler \\
\emcode{gdb}      &  debugger\\
\emcode{autoconf} &  platform-independent configuration manager, Makefile generator\\
\emcode{make}     &  build/compilation management\\
\emcode{valgrind} &  memory bounds and leak checking\\
\emcode{gcov}     &  code coverage analysis\\
\emcode{gprof}    &  profiling and optimization (GNU)\\
\emcode{shark}    &  profiling and optimization (Mac OS/X)\\
\LaTeX            &  documentation typesetting\\
Subversion        &  revision control\\
Bourne shell (\ccode{/bin/sh}) & scripting\\
Perl              &  scripting\\
\end{tabular}

Most of these are standard and well-known. The following sections
describe some Easel work patterns with some of the less commonly used
tools.

%%%%%%%%%%%%%%%%
\subsection{Using valgrind to find memory leaks and more}
%%%%%%%%%%%%%%%%

We use \emcode{valgrind} to check for memory leaks and other problems,
especially on the unit tests:

\begin{cchunk}
  % valgrind ./esl_buffer_utest
\end{cchunk}

The \ccode{valgrind\_report.pl} script in \ccode{testsuite} automates
valgrind testing for all Easel modules. To run it:

\begin{cchunk} 
   % cd testsuite
   % ./valgrind_report.pl > valgrind.report
\end{cchunk}




%%%%%%%%%%%%%%%%
\subsection{Using gcov to measure unit test code coverage}
%%%%%%%%%%%%%%%%

We use \emcode{gcov} to measure code coverage of our unit
testing. \emcode{gcov} works best with unoptimized code.  The code
must be compiled with \emcode{gcc} and it needs to be compiled with
\ccode{-fprofile-arcs -ftest-coverage}. The configure script knows
about this: give it the \ccode{--enable-gcov} option. An example:

\begin{cchunk}
  % make distclean
  % ./configure --enable-gcov
  % make esl_buffer_utest
  % ./esl_buffer_utest
  % gcov esl_buffer.c
  File 'esl_buffer.c'
  Lines executed:73.85% of 589
  esl_buffer.c:creating 'esl_buffer.c.gcov'
  % emacs esl_buffer.c.gcov
\end{cchunk}

The file \ccode{esl\_buffer.c.gcov} contains an annotated source listing
of the \ccode{.c} file, showing which lines were and weren't covered
by the test suite.
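
As a small illustration of what the listing reveals (a hypothetical
function and test, not taken from Easel), consider a function with
three return paths. If the unit test skipped the negative case,
\emcode{gcov} would flag the \ccode{return -1} line as never executed,
marking it with \verb+#####+ in the \ccode{.gcov} file.

\begin{cchunk}
/* Classify <x> as negative (-1), zero (0), or positive (+1).
 * Each return is on its own line so line coverage can tell them apart. */
int
demo_sign(int x)
{
  if (x < 0)
    return -1;
  if (x > 0)
    return 1;
  return 0;
}

/* Unit test that exercises all three return paths.
 * Returns 0 on success, 1 on failure. */
int
utest_sign(void)
{
  if (demo_sign(-5) != -1) return 1;
  if (demo_sign( 0) !=  0) return 1;
  if (demo_sign( 7) !=  1) return 1;
  return 0;
}
\end{cchunk}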

The \ccode{coverage\_report.pl} script in \ccode{testsuite} automates coverage
testing for all Easel modules. To run it:

\begin{cchunk} 
   % cd testsuite
   % ./coverage_report.pl > coverage.report
\end{cchunk}


%%%%%%%%%%%%%%%%
\subsection{Using gprof for performance profiling}
%%%%%%%%%%%%%%%%

On a Linux machine (gprof does not work on Mac OS/X, apparently):

\begin{cchunk}
   % make distclean
   % ./configure --enable-gprof
   % make
\end{cchunk}

Run any program you want to profile, then:

\begin{cchunk}
   % gprof -l <progname>
\end{cchunk}

%%%%%%%%%%%%%%%%
\subsection{Using the clang static analyzer, checker}
%%%%%%%%%%%%%%%%

The clang static analyzer for Mac OS/X is at
\url{http://clang-analyzer.llvm.org/}. I install it by moving its
entire distro directory (checker-276, for example) to
\ccode{/usr/local}, and symlinking to \ccode{checker}.
My \ccode{.bashrc} has:

\begin{cchunk}
test -d /usr/local/checker         && PATH=${PATH}:/usr/local/checker
\end{cchunk}

and that puts \prog{scan-build} in my \ccode{PATH}.

To use it:

\begin{cchunk}
   % scan-build ./configure --enable-debugging
   % scan-build make
\end{cchunk}

It'll give you a scan-view command line, including the name of its
output html file, so you can then visualize and interact with the
results:

\begin{cchunk}
   % scan-view /var/folders/blah/baz/foo
\end{cchunk}






%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Documentation}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%
\subsection{Structured function headers read by autodoc}
%%%%%%%%%%%%%%%%
The documentation for Easel's functions is embedded in the source code
itself, rather than being in separate files. A homegrown documentation
extraction tool (\prog{autodoc}) is used to process the source files
and extract and format the documentation.

An important part of the documentation is the documentation for
individual functions.  Each Easel function is preceded by
documentation in the form of a structured comment header that is
parsed by \prog{autodoc}. For example:

\input{cexcerpts/function_comment_example}

\prog{autodoc} can do one of three things with the text that follows
these tags: it can ignore it, use it verbatim, or process
it. \esldef{Ignored} text is documentation that resides only in the
source code, like the incept date and the notebook
crossreferences.\footnote{Eventually, we will probably process the
\ccode{Args:} part of the header, but for now it is ignored.}
\esldef{Verbatim} text is picked up by \prog{autodoc} and formatted as
\verb+\ccode{}+ in the \LaTeX\ documentation. \esldef{Processed} text
is interpreted as \LaTeX\ code, with a special addition that angle
brackets are used to enclose C code words, such as the argument names.
\prog{autodoc} recognizes the angle brackets and formats the enclosed
text as \verb+\ccode{}+.  Unprotected underscore characters are
allowed inside these angle brackets; \prog{autodoc} protects them
appropriately when it generates the \LaTeX. Citations, such as
\verb+\citep{MolerVanLoan03}+, are formatted for the \LaTeX\
\verb+natbib+ package.

The various fields are:

\begin{sreitems}{\textbf{Function:}}
\item[\textbf{Function:}] 
  The name of the function.  \prog{autodoc} uses this line to
  determine that it's supposed to generate a documentation entry here.
  \prog{autodoc} checks that it matches the name of the immediately
  following C function. One line; verbatim; required.

\item[\textbf{Synopsis:}] 
  A short one-line summary of the function. \ccode{autodoc -t} uses this
  line to generate the API summary tables that appear in this guide.
  One line; processed; not required for \prog{autodoc} itself, but
  required by \ccode{autodoc -t}. 

\item[\textbf{Incept:}] Records the author/date of first
  draft. \prog{autodoc} doesn't use this line.  Used to help track
  development history. The definition of ``incept'' is often fuzzy,
  because Easel is a palimpsest of rewritten code. This line often
  also includes a location, such as \ccode{[AA 673 over Greenland]},
  for no reason other than to remember how many weird places I've
  managed to get work done in.

\item[\textbf{Purpose:}] The main body. \prog{autodoc} processes this
  to produce the \TeX\ documentation. It explains the purpose of the
  function, then precisely defines what the caller must provide in
  each input argument, and what the caller will get back in each
  output argument. It should be written and referenced as if it will
  appear in the user guide (because it will). Multiline; processed by
  \prog{autodoc}; required.

\item[\textbf{Args:}] A tabular-ish summary of each argument. Not
  picked up by \prog{autodoc}, at least not at present. The
  \ccode{Purpose:} section instead documents each argument in free text.
  Multiline and tabular-ish; ignored by \prog{autodoc}; optional.

\item[\textbf{Returns:}] The possible return values from the function,
  starting with what happens on successful completion (usually, return
  of an \ccode{eslOK} code). Also indicates codes for unsuccessful
  calls that are normal (returned) errors. If there are output
  argument pointers, documents what they will contain upon successful
  and unsuccessful return, and whether any of the output involved
  allocating memory that the caller must free.

\item[\textbf{Throws:}] The possible exceptions thrown by the
  function, listing what a program that's handling its own exceptions
  will have to deal with. (Programs should never assume that this list
  is complete.) Programs that are letting Easel handle exceptions do
  not have to worry about any of the thrown codes.  The state of
  output argument pointers is documented -- generally, all output is
  set to \ccode{NULL} or \ccode{0} values when exceptions happen.
  After a thrown exception, there is never any memory allocation in
  output pointers that the caller must free.

\item[\textbf{Xref:}] Crossreferences to notebooks (paper or
  electronic) and to literature, to help track the history of the
  function's development and rationale.\footnote{A typical reference
  to one of SRE's notebooks is \ccode{STL10/143}, indicating St. Louis
  notebook 10, page 143.} Personal developer notebooks are of course
  not immediately available to all developers (especially bound paper
  ones) but still, these crossreferences can be traced if necessary.
\end{sreitems}
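
Putting the fields together, a header might look like the following
sketch. The function \ccode{demo\_vec\_Sum()} and the
\ccode{demoOK}/\ccode{demoEINVAL} codes are hypothetical, the
\ccode{Incept:} and \ccode{Xref:} lines are placeholders, and for
brevity the ``thrown'' error is simply returned rather than passed
through an exception handler.

\begin{cchunk}
#include <stddef.h>

#define demoOK      0   /* hypothetical status codes for this sketch */
#define demoEINVAL  1

/* Function:  demo_vec_Sum()
 * Synopsis:  Sum the elements of a vector.
 * Incept:    (author initials, date) [location]
 *
 * Purpose:   Sum the <n> elements of vector <v>, and return the
 *            result in <*ret_sum>.
 *
 * Args:      v       - vector to sum, indexed 0..n-1
 *            n       - number of elements in v
 *            ret_sum - RETURN: sum of the elements
 *
 * Returns:   <demoOK> on success.
 *
 * Throws:    <demoEINVAL> if <v> is <NULL> or <n> is negative.
 *            <*ret_sum> is then set to 0.
 *
 * Xref:      (notebook or literature reference)
 */
int
demo_vec_Sum(const double *v, int n, double *ret_sum)
{
  double sum = 0.;
  int    i;

  if (v == NULL || n < 0) { *ret_sum = 0.; return demoEINVAL; }

  for (i = 0; i < n; i++) sum += v[i];
  *ret_sum = sum;
  return demoOK;
}
\end{cchunk}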

\subsection{cexcerpt - extracting C source snippets}

The \prog{cexcerpt} program extracts snippets of C code verbatim from
Easel's C source files.

The \ccode{documentation/Makefile} runs \prog{cexcerpt} on every
module .c and .h file. The extracted cexcerpts are placed in .tex
files in the temporary \ccode{cexcerpts/} subdirectory.

Usage: \ccode{cexcerpt <file.c> <dir>}. Processes C source file
\ccode{file.c}; extracts all tagged excerpts, and puts them in a file
in directory \ccode{<dir>}.

An excerpt is marked with special comments in the C file:
\begin{cchunk}
/*::cexcerpt::my_example::begin::*/
   while (esl_sq_Read(sqfp, sq) == eslOK)
     { n++; }
/*::cexcerpt::my_example::end::*/
\end{cchunk}

The cexcerpt marker's format is \ccode{::cexcerpt::<tag>::begin::} (or
\ccode{::cexcerpt::<tag>::end::}). A comment containing a cexcerpt
marker must be the first text on
the source line. A cexcerpt comment may be followed on the line by
whitespace or a second comment.

The \ccode{<tag>} is used to construct the file name, as
\ccode{<tag>.tex}.  In the example, the tag \ccode{my\_example} creates
a file \ccode{my\_example.tex} in \ccode{<dir>}.

All the text between the cexcerpt markers is put in the file.  In
addition, this text is wrapped in a \ccode{cchunk} environment.  This
file can then be included in a \LaTeX\ file.

For best results, the C source should be free of TAB characters. In
\emcode{emacs}, use \ccode{M-x untabify} on the region to clean them
out.

Cexcerpts can't overlap or nest in any way in the C file. Only one tag
can be active at a time.
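
The marker rules above can be sketched as a small line classifier.
This is a hypothetical re-implementation for illustration, not the
code in \prog{cexcerpt} itself; \ccode{parse\_marker()} and the
\ccode{MARK\_*} constants are invented names, and treating leading
whitespace as allowed before the comment is an assumption.

\begin{cchunk}
#include <string.h>

enum { MARK_NONE = 0, MARK_BEGIN = 1, MARK_END = 2 };

/* Decide whether <line> starts with a cexcerpt marker comment.
 * On MARK_BEGIN or MARK_END, copy the tag into <tag> (a buffer of
 * <tagsize> bytes). Returns MARK_NONE for any other line, including
 * one where the marker is not the first text on the line. */
int
parse_marker(const char *line, char *tag, size_t tagsize)
{
  static const char prefix[] = "/*::cexcerpt::";
  const char *s, *e;
  size_t      n;
  int         kind;

  while (*line == ' ' || *line == '\t') line++;     /* skip leading whitespace */
  if (strncmp(line, prefix, sizeof(prefix) - 1) != 0) return MARK_NONE;

  s = line + sizeof(prefix) - 1;                    /* start of the tag */
  e = strstr(s, "::");                              /* first "::" after the tag */
  if (e == NULL || e == s) return MARK_NONE;

  if      (strncmp(e, "::begin::*/", 11) == 0) kind = MARK_BEGIN;
  else if (strncmp(e, "::end::*/",    9) == 0) kind = MARK_END;
  else    return MARK_NONE;

  n = (size_t) (e - s);
  if (n >= tagsize) return MARK_NONE;               /* tag too long for the buffer */
  memcpy(tag, s, n);
  tag[n] = '\0';
  return kind;
}
\end{cchunk}

A caller enforcing the no-overlap rule would keep one ``active tag''
and reject a \ccode{MARK\_BEGIN} seen while a tag is already active.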

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{The .tex file}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%




%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Portability notes}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Easel is intended to be widely portable. We adhere to the ANSI C99
standard. Any dependency on higher-level functionality (including
POSIX, X/Open, or system-specific stuff) is optional, and Easel is
capable of working around its absence at compile-time.

Although we do not currently include Windows machines in our
development environment, we are planning for the day when we do. Easel
should not include any required UNIX-specific code that wouldn't port
to Windows.\footnote{Though it probably does, which we'll discover
  when we first try to compile for Windows.}


% xref J7/83.
\paragraph{Why not define \ccode{\_POSIX\_C\_SOURCE}?} You might think
it would be a good idea to define \ccode{\_POSIX\_C\_SOURCE} to
\ccode{200112L} or some such, to try to enforce the portability of our
POSIX-dependent code. This doesn't work; don't do it.  According to
the standards, if you define \ccode{\_POSIX\_C\_SOURCE}, the host must
\emph{disable} anything that's \emph{not} in the POSIX
standard. However, Easel \emph{is} allowed to optionally use
system-dependent non-POSIX code. A good example is
\ccode{esl\_threads.c::esl\_threads\_CPUCount()}. There is no
POSIX-compliant way to check for the number of available processors on
a system.\footnote{Apparently the POSIX threads standards committee
  intends it that way; see
  \url{http://ansi.c.sources.free.fr/threads/butenhof.txt}.} 
Easel's implementation tries to find one of several system-specific
alternatives, including the non-POSIX function \ccode{sysctl()}.
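
The general shape of such an optional dependency can be sketched as
follows. This is not Easel's implementation: \ccode{demo\_cpu\_count()}
is a hypothetical name, and the sketch probes only one system-specific
mechanism, \ccode{sysconf(\_SC\_NPROCESSORS\_ONLN)}, a widely available
extension. Where the mechanism is absent at compile time, the code
falls back to a safe answer instead of failing to build.

\begin{cchunk}
#if defined(__unix__) || defined(__APPLE__)
#include <unistd.h>
#endif

/* Best-effort count of available processors.
 * Uses sysconf() where the system offers _SC_NPROCESSORS_ONLN;
 * otherwise (or if the call fails) reports 1. */
int
demo_cpu_count(void)
{
#if defined(_SC_NPROCESSORS_ONLN)
  long n = sysconf(_SC_NPROCESSORS_ONLN);
  if (n > 0) return (int) n;
#endif
  return 1;   /* portable fallback: assume a single processor */
}
\end{cchunk}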