.\" this document requires the tmac.wrprc macros
.\"
.\" $(TROFF) $(MSMACROS) tmac.wrprc thisfile
.\"
.\" revision date - change whenever this file is edited
.ds RD 9 March 1997
.\"
.EH 'tc2html Notes'- % -''
.OH ''- % -'tc2html Notes'
.OF 'Revision date:\0\0\*(RD''Printed:\0\0\n(dy \*(MO 19\n(yr'
.EF 'Revision date:\0\0\*(RD''Printed:\0\0\n(dy \*(MO 19\n(yr'
.\"
.de St	\" troffcvt special text
\\&\\$3\fB@\\$1\fR\\$2
..
.de Cl	\" troffcvt or RTF control
\\&\\$3\fB\e\\$1\fR\\$2
..
.de Rq	\" troff request
\\&\\$3\fB\.\\$1\fR\\$2
..
.de Es	\" troff escape
\\&\\$3\fB\e\\$1\fR\\$2
..
.TL
.ps +2
tc2html Notes
.ps
.AU
Paul DuBois
.H*ahref mailto:dubois@primate.wisc.edu
dubois@primate.wisc.edu
.H*aend
.AI
.H*ahref http://www.primate.wisc.edu/
Wisconsin Regional Primate Research Center
.H*aend
Revision date:\0\0\*(RD
.\"
.H*toc*title "Table of Contents"
.\"
.Ah Introduction
.\"
.LP
.I tc2html
is a postprocessor for converting
.I troffcvt
output to HTML.
It's used by the
.I troff2html
front end.
This document describes how
.I tc2html
works and some of the design issues involved in writing it.
.LP
In general, the goal of
.I tc2html
is that you should get reasonable HTML output
with no need for special treatment of the
.I troff
input file.
The most important thing is that you use a standard macro package.
However, there are some additional principles you can follow that will
improve the quality of the HTML that
.I tc2html
generates.
For example, it's possible to embed hypertext links in your
.I troff
source with a little prior planning.
Techniques for such things are discussed in the section
.H*ahref #better-html
``Generating Better HTML.''
.H*aend
If you're not interested in implementation details, you can skip directly
to that section.
.\"
.Ah "Output Format"
.\"
.LP
.I tc2html
reads output from
.I troffcvt
and produces an HTML document that has the following general form:
.Ps
<HTML>
<HEAD>
<TITLE>\fItitle text\fP</TITLE>
</HEAD>
<BODY>
<H1>\fItitle text\fP</H1>
\fIbody text\fP
</BODY>
</HTML>
.Pe
The document HEAD part may be missing if
.I tc2html
detects no title in the input.
In this case the initial heading at the beginning of the document BODY
part also will be missing.
The entire document BODY may be missing or empty
if the input document is empty.
.\"
.Ah "Determining Input Document Structure"
.\"
.LP
HTML documents typically are highly structured, being written in terms of
elements such as headers, paragraphs, lists, and displays (preformatted
text).
But
.I troffcvt
output normally contains very little structural
information beyond markers like those for
inter-paragraph spacing and line breaks (in the form of
.Cl space
and
.Cl break
control lines).
The result when
.I tc2html
reads such
.I troffcvt
output is that it produces HTML that is relatively unstructured
\*- just a lot of text broken by occasional <P> or <BR> markers.
.LP
However, if your document is marked up using macros from
a macro package such as
.B \-ms
or
.B \-man ,
it's possible to get output from
.I troffcvt
that's much more suitable for
.I tc2html .
The trick is to map
.I troff
requests to HTML
structure markers, rather than trying to guess the structure from the low-level
.I troffcvt
output that normally results from those requests.
This is accomplished using the following strategy:
.Ls B
.Li
Extend the
.I troffcvt
output language by defining an
.Cl html
control that provides information to
.I tc2html
about structural elements within the
.I troffcvt
output.
For example,
.Cl html
.B para
indicates the beginning of a paragraph.
.Li
Provide (in a
.I troffcvt
action file) a set of
HTML-specific macros that generate the appropriate
.Cl html
controls for the various structural elements.
For example,
.Rq H*para
generates
.Cl html
.B para .
.Li
For the important structure-related macros in your
macro package, redefine them (in a
.I troffcvt
action file)
so they're expressed in terms of the HTML-specific macros.
(It's possible, of course, to redefine the macros from the macro package
so they generate the
.Cl html
controls themselves.
But having the
.Cl html
controls available through a set of macros allows the macros
to be
invoked directly in your document.
This is important for some HTML constructs that have no
.I troff
analog, such as hyperlinks.)
.Le
Note that ``extending'' the
.I troffcvt
output language to include the
.Cl html
control is done using request definitions in an action file.
Source-level changes to
.I troffcvt
itself are not needed.
.LP
The effect of the strategy outlined above
is to remap the macros in your macro package
from their usual actions onto actions that produce document structure
information that
.I tc2html
can recognize.
For this to work well, all the important structure-related
macros in a macro package must be redefined, so
the redefinition files used for
.I tc2html
tend to be more extensive than those used for other postprocessors.
This is really the source of most of the work involved in getting
.I tc2html
to function well.
Once a set of redefinitions is written for a given macro package,
translation from
.I troff
to HTML is a straightforward process that usually generates fairly reasonable
HTML.
.LP
Here's an example of how the strategy described above works in practice.
The
.Rq LP
macro in the
.B \-ms
macro package means ``begin paragraph.''
But
.Rq LP
typically is implemented by executing several other requests
(restore font, margins, adjustment, spacing, point size, etc.), and the
.I troffcvt
output you'd get by processing those requests
really contains nothing that specifically indicates a paragraph.
To work around this, we use the fact that
.I tc2html
interprets
.Cl html
.B para
as indicating a paragraph beginning, and define
a macro to generate that control:
.Ps
req H*para eol output-control "html para"
.Pe
Then we can redefine the
.Rq LP
macro in terms of the
.Rq H*para
macro:
.Ps
req LP eol \e
		break center 0 fill adjust b font R \e
		push-string ".H*para\en"
.Pe
The
.B break ,
.B center ,
.B fill ,
.B adjust ,
and
.B font
actions cause
.I troffcvt
to adjust its internal state to match the effect that the
.Rq LP
macro normally has.
The call to
.Rq H*para
results in
.Cl html
.B para
in the output, so that
.I tc2html
can recognize the paragraph beginning.
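.LP
The same pattern carries over to other paragraph macros.
As a sketch only (it's an assumption that
.Rq PP
is redefined exactly this way in the distributed
.I tc.ms-html
file), the
.B \-ms
.Rq PP
macro could be mapped onto the same control, since paragraphs
with an indented first line are treated as standard paragraphs:
.Ps
req PP eol \e
		break center 0 fill adjust b font R \e
		push-string ".H*para\en"
.Pe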
.LP
The
.Cl html
markers that
.I tc2html
recognizes are shown below:
.Ps
.ta 2.75i
\ehtml title	\fRBegin document title\fP
\ehtml header \f(CIN\fP	\fRBegin level \f(CIN\fP header\f(CW
\ehtml header-end	\fREnd header (any level)\fP
\ehtml para	\fRBegin paragraph\fP
\ehtml blockquote	\fRBegin block quote\fP
\ehtml blockquote-end	\fREnd block quote\fP
\ehtml list	\fRBegin list\fP
\ehtml list-end	\fREnd list\fP
\ehtml list-item	\fRBegin list item\fP
\ehtml display	\fRBegin display (preformatted text)\fP
\ehtml display-end	\fREnd display\fP
\ehtml display-indent \f(CIN\fP	\fRSet display indent to \f(CIN\fP spaces\f(CW
\ehtml definition-term	\fRBegin definition list term\fP
\ehtml definition-desc	\fRBegin definition list description\fP
\ehtml shift-right	\fRShift left margin right\fP
\ehtml shift-left	\fRShift left margin left\fP
\ehtml anchor-href \f(CIURL\fP	\fRBegin HREF anchor for link to \f(CIURL\f(CW
\ehtml anchor-name \f(CILABEL\fP	\fRBegin NAME anchor with label \f(CILABEL\f(CW
\ehtml anchor-toc \f(CIN\fP	\fRBegin NAME anchor for level \f(CIN\fP TOC entry\f(CW
\ehtml anchor-end	\fREnd anchor (any kind)\fP
.Pe
The
.I troff -level
macros used to generate the
.Cl html
controls are shown below.
These macros are defined in the action file
.I actions-html :
.Ps
.ta 2.75i
\&.H*title	\fRBegin document title\fP
\&.H*header \f(CIN\fP	\fRBegin level \f(CIN\fP header\f(CW
\&.H*header-end	\fREnd header (any level)\fP
\&.H*para	\fRBegin paragraph\fP
\&.H*bq	\fRBegin block quote\fP
\&.H*bq-end	\fREnd block quote\fP
\&.H*list	\fRBegin list\fP
\&.H*list-end	\fREnd list\fP
\&.H*list-item	\fRBegin list item\fP
\&.H*disp	\fRBegin display (preformatted text)\fP
\&.H*disp-end	\fREnd display\fP
\&.H*disp-indent \f(CIN\fP	\fRSet display indent to \f(CIN\fP spaces\f(CW
\&.H*dterm	\fRBegin definition list term\fP
\&.H*ddesc	\fRBegin definition list description\fP
\&.H*shift-right	\fRShift left margin right\fP
\&.H*shift-left	\fRShift left margin left\fP
\&.H*ahref \f(CIURL\fP	\fRBegin HREF anchor for link to \f(CIURL\f(CW
\&.H*aname \f(CILABEL\fP	\fRBegin NAME anchor with label \f(CILABEL\f(CW
\&.H*atoc \f(CIN\fP	\fRBegin NAME anchor for level \f(CIN\fP TOC entry\f(CW
\&.H*aend	\fREnd anchor (any kind)\fP
.Pe
Note that since these names are longer than two characters, they cannot
be used in compatibility mode.
.\"
.Ah "Invoking tc2html"
.\"
.LP
The
.Cl html
controls are defined in a file
.I actions-html
that you can access on the
.I troffcvt
command line using
.B \-a
.B actions-html .
If you use a macro package
.B \-m \f[BI]xx\fP,
you specify it on the command line, along with the general
and HTML-specific
.I troffcvt
redefinitions for that macro package; these are in the action files
.I tc.mxx
and
.I tc.mxx-html .
Thus, to translate a file that you'd normally process using
.B \-ms ,
the command would look like this:
.Ps
% \f(CBtroffcvt -a actions-html -ms -a tc.ms -a tc.ms-html\fP \f(CImyfile.ms\fP \f(CB\e
		| tc2html >\fP \f(CImyfile.html\fP
.Pe
That's pretty ugly, of course; it's better to use a wrapper script like
.I troff2html
that supplies the necessary options for you:
.Ps
% \f(CBtroff2html -ms\fP \f(CImyfile.ms\fP \f(CB>\fP \f(CImyfile.html\fP
.Pe
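For a document written with
.B \-man ,
the explicit pipeline would follow the same pattern.
This is a sketch that assumes the action files are named
according to the
.I tc.mxx
convention described above:
.Ps
% \f(CBtroffcvt -a actions-html -man -a tc.man -a tc.man-html\fP \f(CImyfile.man\fP \f(CB\e
		| tc2html >\fP \f(CImyfile.html\fP
.Pe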
.\"
.Ah "Implementation of Various HTML Constructs"
.\"
.LP
This section provides some specifics on how several
.I troff
concepts are turned into HTML elements.
It should be considered illustrative rather than exhaustive.
.\"
.H*aname title-collection
.H*aend
.Bh "Document Titles"
.\"
.LP
Title macros are implemented in terms of
.Rq H*title ,
which generates an
.Cl html
.B title
control.
When
.I tc2html
sees this control, it goes into document HEAD collection mode.
If the document contains a title, the
.Cl html
.B title
line must be the first
.Cl html
control that
.I tc2html
sees.
Should any other
.Cl html
control or document text occur first,
.I tc2html
assumes no title is present.
Any leading document whitespace
.Cl space "" (
or
.Cl break
lines) occurring prior to the title is skipped.
.LP
The title is terminated by the next
.Cl html
line with a structural marker, such as
.Cl html
.B para .
The title text is used to produce the TITLE in the document HEAD part
and the initial header in the document BODY part.
.Cl space
and
.Cl break
lines within the title do
.I not
terminate title text collection; instead, they are turned into spaces in
the title and into <P> and <BR> in the initial header.
Consider the following
.I troff
input (using
.B \-ms
macros):
.Ps
\&.TL
My
\&.sp
Title
\&.LP
This is a line.
.Pe
This is converted by
.I troffcvt
into the following:
.Ps
\ehtml title
My
\espace
Title
\ebreak
\ehtml para
This is a line.
.Pe
The output from
.I troffcvt
is converted in turn by
.I tc2html
into this HTML:
.Ps
<HEAD>
<TITLE>
My Title
</TITLE>
</HEAD>
<BODY>
<H2>
My
<P>
Title
</H2>
<P>
This is a line.
.Pe
.B \-T
.I title
may be specified on the
.I tc2html
or
.I troff2html
command line to specify a title explicitly.
It overrides the title in the document if there is one.
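For example, the following invocation (a sketch; the title text and
file names are placeholders) supplies the title on the command line:
.Ps
% \f(CBtroff2html -ms -T\fP \f(CI"My Title"\fP \f(CImyfile.ms\fP \f(CB>\fP \f(CImyfile.html\fP
.Pe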
.\"
.Bh "Standard Paragraphs"
.\"
.LP
The ``standard'' paragraph is a paragraph with the first line flush left.
There is no mechanism for writing paragraphs with an indented first line;
they're treated simply as standard paragraphs.
.LP
The standard paragraph is implemented in terms of
.Rq H*para ,
which generates an
.Cl html
.B para
control.
This is turned by
.I tc2html
into <P>.
.LP
In the document BODY part,
.Cl space
is also interpreted as a paragraph marker, but
during document title collection,
.Cl space
is treated as described above under ``\c
.H*ahref #title-collection
Document Titles
.H*aend
\&.''
.\"
.Bh "Indented Paragraphs"
.\"
.LP
Indented paragraphs (with or without a hanging tag)
are implemented using definition lists (<DL>...</DL>).
The tag is written as a definition term (<DT>...</DT>)
and the paragraph body is written as a definition description (<DD>...</DD>).
If there is no tag, the term part is empty.
.LP
Indented paragraph macros are implemented in terms of
.Rq H*dterm
and
.Rq H*ddesc ,
which generate
.Cl html
.B definition-term
and
.Cl html
.B definition-desc
controls.
.LP
One problem with mapping indented paragraphs onto definition lists
is that it's not always clear from the
.I troff
input where the list ends.
In HTML, the definition list is a container for which you must write
both a beginning and ending tag, but in
.I troff
only the beginnings of paragraphs are specified.
This problem is handled (perhaps poorly) by closing the list when other HTML
structural elements like a standard paragraph or a header are seen.
Suppose you write something like this:
.Ps
\&.IP (i)
Para 1
\&.IP (ii)
Para 2
\&.LP
Para 3
.Pe
This is converted by
.I troffcvt
into the following:
.Ps
\ehtml definition-term
(i)
\ehtml definition-desc
 Para 1
\ebreak
\ehtml definition-term
(ii)
\ehtml definition-desc
 Para 2
\ebreak
\ehtml para
Para 3
\ebreak
.Pe
When
.I tc2html
sees the first
.Cl definition-term ,
it begins a definition list.
The second
.Cl definition-term
continues the same list.
The
.Cl html
.B para
(resulting from the
.Rq LP )
is part of a different structural element, so
.I tc2html
closes the list and begins a standard paragraph.
The resulting HTML looks like this:
.Ps
<DL>
<DT>
(i)
</DT>
<DD>
Para 1<BR>
</DD>
<DT>
(ii)
</DT>
<DD>
Para 2<BR>
</DD>
</DL>
<P>
Para 3<BR>
.Pe
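An indented paragraph without a tag follows the same path, except
that the term part is empty.
As a sketch (the exact line layout of the actual output may differ),
this input:
.Ps
\&.IP
Para 1
\&.LP
Para 2
.Pe
would presumably produce HTML along these lines:
.Ps
<DL>
<DT>
</DT>
<DD>
Para 1<BR>
</DD>
</DL>
<P>
Para 2<BR>
.Pe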
.\"
.Bh "Right and Left Shifts"
.\"
.LP
In
.I troff ,
the left margin can be shifted right and left, e.g., as is done with the
.B \-ms
and
.B \-man
packages using
.Rq RS
and
.Rq RE .
HTML has no good way of shifting the margin, so shifts are performed
using <UL> and </UL>.
This is admittedly a hack, but it works reasonably well.
Shift macros are redefined to be implemented in terms of
.Rq H*shift*right
and
.Rq H*shift*left ,
which generate
.Cl html
.B shift-right
and
.Cl html
.B shift-left
controls.
These in turn are converted by
.I tc2html
to <UL> and </UL>.
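.LP
As an illustration (a sketch that omits the
.Cl break
lines and assumes
.Rq RS
and
.Rq RE
are redefined as just described), this
.B \-ms
input:
.Ps
\&.LP
Outer text
\&.RS
\&.LP
Shifted text
\&.RE
.Pe
would yield HTML with roughly this structure:
.Ps
<P>
Outer text
<UL>
<P>
Shifted text
</UL>
.Pe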
.\"
.Bh "Displays"
.\"
.LP
Displays are implemented as preformatted text (<PRE>...</PRE>).
Tabstops are respected within displays, although they must be approximated
since character widths are unknown.
.I tc2html
assumes 10 characters/inch for determining the width of tabstops.
.LP
Display macros are implemented in terms of
.Rq H*disp
and
.Rq H*disp*end .
Preformatted text in HTML has no additional indent relative to the left
margin, but
.I troff
displays often are indented a bit.
To handle this,
.Rq H*disp*indent
.I N
can be used to set the display indent to
.I N
spaces.
.LP
.Rq H*disp ,
.Rq H*disp*end ,
and
.Rq H*disp*indent
generate
.Cl html
.B display ,
.Cl html
.B display-end ,
and
.Cl html
.B display-indent
controls.
The first two of these are converted by
.I tc2html
into <PRE> and </PRE>.
.Cl html
.B display-indent
generates no output itself, but causes
.I tc2html
to add spaces to the beginning of each line of a display.
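.LP
For instance, assuming an indent of 4 (a value chosen only for
illustration), this
.I troffcvt
output:
.Ps
\ehtml display-indent 4
\ehtml display
first line
\ebreak
second line
\ebreak
\ehtml display-end
.Pe
would be expected to produce something like the following:
.Ps
<PRE>
    first line
    second line
</PRE>
.Pe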
.LP
Centered and right-justified displays are not implemented.
They're treated as regular displays.
.\"
.Bh "Tables"
.\"
.LP
If your input document has tables written in the
.I tbl
language, preprocess the document with
.I tblcvt
rather than with
.I tbl .
Your output will look better that way.
.LP
Table cell borders are hard to do well.
In
.I tbl
you can put a border on any cell boundary, but in HTML a table has either
no borders or borders around every cell.
Currently,
.I tc2html
puts borders around every cell.
.\"
.Bh "Font Handling"
.\"
.LP
Fonts are handled in
.I tc2html
by means of a table that associates four tags with each font name.
The first two tags are used to turn the font on and off in normal text.
The second two tags are used to turn the font on and off in displays.
This table is read at runtime from the
.I html-fonts
file.
Here's an example of what the file might look like:
.Ps
.ta .5i +1.25i +1.5i +1.25i
R	""	""	""	""
I	<I>	</I>	<I>	</I>
B	<B>	</B>	<B>	</B>
BI	<B><I>	</I></B>	<B><I>	</I></B>

C	<TT>	</TT>	""	""
CW	<TT>	</TT>	""	""
CI	<TT><I>	</I></TT>	<I>	</I>
CB	<TT><B>	</B></TT>	<B>	</B>
CBI	<TT><B><I>	</I></B></TT>	<B><I>	</I></B>
.Pe
The difference between the tags for regular text and display text
is that, since browsers implicitly switch the font to monospaced
font in displays, the only thing that can be done for font changes there
is to change the style attributes.
.LP
The initial font when
.I tc2html
begins is
.Cw R
(roman).
When a font change occurs, the new font's begin tag is written out
after terminating the previous font by writing its end tag.
Using the font table just shown, this input:
.Ps
\efont R
abc
\efont I
def
\efont CW
ghi
\efont R
jkl
.Pe
becomes this output:
.Ps
abc<I>def</I><TT>ghi</TT>jkl
.Pe
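Within a display, the second pair of tags from the table applies instead.
As a sketch, if the same input occurred inside a display, the
.Cw CW
font would contribute no tags and the output would be expected
to look like this:
.Ps
abc<I>def</I>ghijkl
.Pe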
.\"
.Bh "Tabs"
.\"
.LP
Tabs are ignored except in displays.
Adding extra space to tab over
has no effect in regular paragraphs anyway, because browsers typically
collapse runs of spaces.
.LP
Right-justified and centered tabs are
treated as left-justified tabs.
That is, they're completely botched.
.\"
.H*aname better-html
.H*aend
.Ah "Generating Better HTML"
.\"
.LP
This section describes how you can embed hypertext links in your
.I troff
source and how to produce a table of contents containing clickable
links to the main sections of your document.
.\"
.Bh "Generating Hypertext Links"
.\"
.LP
The
.Cl html
controls used to generate hypertext links are:
.Ps
\ehtml anchor-href \f(CIURL\fP
\ehtml anchor-name \f(CILABEL\fP
\ehtml anchor-end
.Pe
The first two controls generate opening
\f(CW<A HREF=\f(CIURL\f(CW>\fR
and
\f(CW<A NAME=\f(CILABEL\f(CW>\fR
tags; the third generates a closing
.Cw "</A>"
tag.
.LP
To embed hypertext links in your
.I troff
source, you can use the macros
.Rq H*ahref
and
.Rq H*aend ,
or
.Rq H*aname
and
.Rq H*aend .
To write an HREF link, the
.I troff
source looks like this:
.Ps
\&.H*ahref http://www.some.host/some/path
hypertext link
\&.H*aend
.Pe
The resulting HTML looks like this:
.Ps
<A HREF="http://www.some.host/some/path">
hypertext link</A>
.Pe
To write a NAME link, the
.I troff
source looks like this:
.Ps
\&.H*aname my-name
name link
\&.H*aend
.Pe
The resulting HTML looks like this:
.Ps
<A NAME="my-name">
name link</A>
.Pe
Section-header macros are usually redefined to generate a NAME
anchor for the table of contents, so don't surround a section header
with anchor-generating macros.
You'll end up with nested anchors, which
.I tc2html
disallows.
You can generate a NAME link for a section (e.g., so that you can refer
to it using a specific name) as long as you don't write the link like this:
.Ps
\&.H*aname better-html
\&.SH "Generating Better HTML"
\&.H*aend
.Pe
Instead, write it like this:
.Ps
\&.H*aname better-html
\&.H*aend
\&.SH "Generating Better HTML"
.Pe
Unfortunately, some browsers don't seem able to jump to
.Cw NAME
anchors unless there is some text between the
.Cw "<A NAME>"
and
.Cw </A>
tags.
.LP
You can't make a section header a hypertext link.
You'd have to put the header (which generates a NAME link for the TOC)
between the
.Rq H*ahref
and
.Rq H*aend
macros, which would result in nested anchors.
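.LP
Putting these pieces together, a cross-reference to a named section
could be written as follows.
This is a sketch; the label
.Cw install-notes
and the section title are hypothetical:
.Ps
\&.H*aname install-notes
\&.H*aend
\&.SH "Installation Notes"
\&.LP
Installation text goes here.
\&.LP
See the section
\&.H*ahref #install-notes
``Installation Notes''
\&.H*aend
for details.
.Pe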
.\"
.Bh "Generating a Table of Contents"
.\"
.LP
Putting a table of contents (TOC) into an HTML document requires some
postprocessing of the
.I tc2html
output.
The TOC entries can't be written to the beginning of the document
because they're not all known until the input has been read entirely.
The approach adopted with
.I tc2html
is as follows:
.Ls B
.Li
Write a marker to the document indicating the desired TOC position.
You do this using a special macro, described below.
.Li
Collect TOC entries in memory as the input is processed.
.Li
Write the TOC contents as a list near the end of the document.
.Li
Run
.I tc2html-toc ,
a script that examines the HTML document and moves the TOC contents
to the location indicated by the TOC position marker.
.Le
If you run
.I tc2html
directly, you must also run
.I tc2html-toc
directly.
If you use
.I troff2html ,
.I tc2html-toc
is run for you automatically.
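.LP
Run by hand, the pipeline might look like the following.
This is a sketch: it assumes that
.I tc2html-toc
reads the HTML document on its standard input and writes the
result to its standard output, which isn't stated here.
.Ps
% \f(CBtroffcvt -a actions-html -ms -a tc.ms -a tc.ms-html\fP \f(CImyfile.ms\fP \f(CB\e
		| tc2html | tc2html-toc >\fP \f(CImyfile.html\fP
.Pe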
.LP
The
.Cl html
controls used to generate TOC entries are:
.Ps
\ehtml anchor-toc \f(CIN\fP
\ehtml anchor-end
.Pe
Text occurring between
.Cl html
.B anchor-toc
and
.Cl html
.B anchor-end
pairs is written to the output, but it's also
collected and remembered.
When
.I tc2html
encounters end of file on its input,
it writes the TOC entries to the output between two HTML comments:
.Ps
<!-- TOC BEGIN -->
\fITOC entries\fP
<!-- TOC END -->
.Pe
If you want to generate a TOC entry explicitly in your
.I troff
source, use
.Rq H*atoc
and
.Rq H*aend .
For example:
.Ps
\&.H*atoc 1
My TOC Entry
\&.H*aend
.Pe
The argument to
.Rq H*atoc
is the TOC entry level (1, 2, 3, ...).
.LP
It's unnecessary to invoke TOC macros directly if
the section-header macros in your macro package are redefined
to invoke the TOC macros for you.
For example, the
.Rq SH
macro for the
.B \-ms
package is redefined like this in the
.I tc.ms-html
action file:
.Ps
req SH parse-macro-args eol \e
		break fill adjust b \e
		push-string ".H*atoc 1\en" \e
		push-string ".H*header 2\en" \e
		push-string "$1\en" \e
		push-string ".H*header*end\en" \e
		push-string ".H*aend\en"
.Pe
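A lower-level header macro could be handled the same way.
The following is a sketch only: the macro name
.Rq XH
is hypothetical, and the TOC level of 2 and header level of 3 are
assumptions chosen to sit one step below the
.Rq SH
values shown above:
.Ps
req XH parse-macro-args eol \e
		break fill adjust b \e
		push-string ".H*atoc 2\en" \e
		push-string ".H*header 3\en" \e
		push-string "$1\en" \e
		push-string ".H*header*end\en" \e
		push-string ".H*aend\en"
.Pe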
To specify the TOC title and generate the TOC position marker, use the
.Rq H*toc*title
macro.
Invoke it as shown below, passing the title of your TOC as the first argument:
.Ps
\&.H*toc*title "Table of Contents"
.Pe
.Rq H*toc*title
writes the TOC title to the output followed by a special HTML comment:
.Ps
Table of Contents
<!-- INSERT TOC HERE -->
.Pe
The INSERT TOC HERE comment
is used by
.I tc2html-toc ,
along with the TOC BEGIN and TOC END comments,
to find the TOC entries and move them to the desired location.
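.LP
Schematically, the document that
.I tc2html
writes has this shape:
.Ps
Table of Contents
<!-- INSERT TOC HERE -->
\fIbody text\fP
<!-- TOC BEGIN -->
\fITOC entries\fP
<!-- TOC END -->
.Pe
After
.I tc2html-toc
has run, the entries appear at the marker position.
The sketch below drops the three comments; whether the script
actually retains any of them isn't stated here:
.Ps
Table of Contents
\fITOC entries\fP
\fIbody text\fP
.Pe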
.LP
Action files that provide macro package redefinitions for
.I tc2html
can try to place an advisory TOC location marker in the document.
This is used if you don't specify a location marker explicitly with
.Rq H*toc*title :
.Ps
<!-- INSERT TOC HERE, MAYBE -->
.Pe
For instance, the
.B \-man
redefinitions put out this marker when the
.Rq TH
macro has been seen.
The marker causes a TOC to be placed
after the title line and the first man page section, unless one is specified
explicitly.
No TOC title is written with the advisory marker, however, so the TOC
will be ``title-less.''