<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<!--
	Note: the LINK tags are used by advanced browsers.
-->
<head>
<title>
Conversion for text files
</title>



<!-- ......................................................................
     META TAGS (FOR SEARCH ENGINES)
     ......................................................................
-->

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <meta http-equiv="Expires" content="Sat,  1 May 2010 10:09:08 GMT">



  <meta name="Generator"
	content="2010-03-02 12:09 Perl program t2html v2010.0302.0944 http://freecode.com/projects/perl-text2html">



<!-- ......................................................................
     BUTTON DEFINITION START
     ......................................................................
-->

<link rel="stylesheet" type="text/css" href="index.css">
</head>

<body >


<!-- ......................................................................
     TABLE OF CONTENT START
     ......................................................................
-->


<div class="toc">
<a name="toc" id="toc" class="name"></a>
<h1>
    Table of Contents

</h1>

<a href="#document_id" class="toc">
	1.0 Document id
    </a>
    <br>

<ul>
<li>
    <a href="#general" class="toc">
	1.1 General
    </a>
    <br>
<li>
    <a href="#t2html_program_features" class="toc">
	1.2 T2html program features
    </a>
    <br>
<li>
    <a href="#how_to_convert_text_files_into_html" class="toc">
	1.3 How to convert text files into HTML?
    </a>
    <br>
<li>
    <a href="#writing_a_text_document" class="toc">
	1.4 Writing a text document
    </a>
    <br>
<li>
    <a href="#ripping_program_documentation" class="toc">
	1.5 Ripping program documentation
    </a>
    <br>
</ul>
<a href="#other_converters" class="toc">
	2.0 Other converters
    </a>
    <br>

<ul>
<li>
    <a href="#postscript" class="toc">
	2.1 Postscript
    </a>
    <br>
<li>
    <a href="#texinfo" class="toc">
	2.2 Texinfo
    </a>
    <br>
<li>
    <a href="#other_text_to_html_tools" class="toc">
	2.3 Other text to HTML tools
    </a>
    <br>
<li>
    <a href="#other_utilities" class="toc">
	2.4 Other Utilities
    </a>
    <br>
<li>
    <a href="#general_document_maintenance_tools" class="toc">
	2.5 General Document Maintenance tools
    </a>
    <br>
</ul>

</div>


<!-- ......................................................................
     TABLE OF CONTENT END
     ......................................................................
-->

<hr>
	<a name="document_id"  id="document_id"></a>
	<h1>
	1.0 Document id

	</h1>



  <a name="general" id="general"></a>
  <h2>
      1.1 General

  </h2>





<p class="column8">
        Copyright &copy; 1996-2010 Jari Aalto


<p class="column8">
        License: This material may be distributed only subject to
        the terms and conditions set forth in GNU General Public
        License v2 or later; or, at your option, distributed under the
        terms of GNU Free Documentation License version 1.2 or later
        (GNU FDL).



  <a name="t2html_program_features" id="t2html_program_features"></a>
  <h2>
      1.2 T2html program features

  </h2>




<p class="column8">
        Writing text documents is different from writing messages to
        Usenet or to your fellow workers. Several tools already exist
        to convert email messages into HTML, like <em class="word">MHonArc</em>,
        Email hyper archiver, but for regular text documents, such as
        memos, FAQs, help pages and other papers, there wasn't
        any suitable HTML converter a couple of years back. The author
        wanted a simple HTML tool which would read <strong class="word">pure</strong> <strong class="word">plain</strong>
        <strong class="word">text</strong> documents, like guides, tips pages, documentation,
        bookmark pages etc. and convert them into HTML. Here you will
        find the specification of how to format your text documents for
        the <em class="word">t2html.pl</em> Perl script, a text to HTML converter.


<p class="column8">
        A few arguments for why plain text is the best source document format:

<ul>
	<li>It is readable by all, without any extra software
<li>Deliverable by email, as is.
	</li>
<li>Most easily kept in version control
	</li>
<li>Most easily patched ( when someone sends a diff -u ...)
	</li>
<li>Most easily handed to someone else when the author no longer
            maintains it. (If you have specialized tools, people
	            need to learn them in order to maintain your FAQ.)</li>
</ul>




<p class="column7"><em><strong>
       1.2.1 Overview of features:</strong></em>


<ul>
	<li>Requires Perl 5.004 or newer
<li>A 500K text document takes 70 seconds to convert to HTML.
	</li>
<li>TF to Perl POD conversion may be in a future plan.
	</li>
<li>Better linking of multiple files planned
	<li>Configuration file for individual file options planned.</li>
</ul>




<p class="column7"><em><strong>
       1.2.2 HTML conversion</strong></em>


<ul>
	<li>Minimal markup: rendering is based on indentation level.
            The written text document looks like a &quot;Natural Document&quot; and is
            suitable for reading as such.
<li>Text layout with indentation rules is called Technical Format
            (TF), and a document must be formatted according to it before it
            is suitable for HTML generation.
	</li>
<li>Rules are simple: place headings at the left margin and text at column 8.
	</li>
<li>Program generates <em class="word">META</em> tags for search engines.
	</li>
<li>Colored html page: &lt;EM&gt; &lt;STRONG&gt; &lt;PRE&gt; ...
	</li>
<li>Hyperlinks and email addresses are automatically detected.
	            No mark up is needed.</li>
</ul>
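
<p class="column8">
        The indentation rule above can be illustrated with a small
        sketch. It is not taken from t2html.pl; it assumes that a
        heading starts with no indentation, that body text is indented
        by exactly 8 spaces, and that anything else is left
        unclassified. The function name <em class="word">classify_line</em> is hypothetical.

```python
def classify_line(line: str) -> str:
    """Classify one line of a TF-style document by its indentation.

    Assumption (not from t2html.pl): "column 8" means the line is
    indented by exactly 8 spaces, and tabs are not used.
    """
    if not line.strip():
        return "blank"
    indent = len(line) - len(line.lstrip(" "))
    if indent == 0:
        return "heading"      # heading placed at the left margin
    if indent == 8:
        return "text"         # regular body text at column 8
    return "other"            # any other column is not covered by this sketch


sample = "1.0 Document id\n\n        Body text starts at column 8.\n"
kinds = [classify_line(line) for line in sample.splitlines()]
print(kinds)  # ['heading', 'blank', 'text']
```

<p class="column8">
        The full TF specification defines more columns than these two;
        the sketch only shows why no explicit markup is needed to tell
        a heading from a paragraph.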




<p class="column7"><em><strong>
       1.2.3 HTML 4.01</strong></em>


<ul>
	<li>Make a single html (1 file) or <em class="word">Framed</em> version (3 files)
<li>Sample CSS2 (Cascading Style Sheet) included in HTML code for
	            document rendering. Users can import their own CSS2.</li>
</ul>




<p class="column7"><em><strong>
       1.2.4 Link check for the text file</strong></em>


<ul>
	<li>You need the LWP module in order to use this feature. (Comes with
            latest Perl)
<li>The program has switches to run a link check on your text file
            to find any broken or moved links. Currently you
            have to fix the links manually, but an Emacs mode to do this
            automatically is planned. The output from the link check is in standard
	            grep style: <samp class="word">FILE:NBR:Error-Description</samp></li>
</ul>
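
<p class="column8">
        Because the report uses the grep-style
        FILE:NBR:Error-Description layout, other tools can read it
        easily. The following is a minimal sketch of such a reader. The
        names <em class="word">LinkError</em> and <em class="word">parse_report_line</em>, and the sample
        report line, are made up for illustration; the exact wording
        of the error descriptions is an assumption.

```python
from typing import NamedTuple, Optional


class LinkError(NamedTuple):
    file: str
    line: int
    description: str


def parse_report_line(text: str) -> Optional[LinkError]:
    """Split one FILE:NBR:Error-Description line into its three fields.

    Only the first two colons are treated as separators, so a
    description that contains a URL stays in one piece.
    """
    parts = text.rstrip("\n").split(":", 2)
    if len(parts) != 3 or not parts[1].isdigit():
        return None  # not a report line in the expected layout
    return LinkError(parts[0], int(parts[1]), parts[2].strip())


# Hypothetical report line, for illustration only.
entry = parse_report_line("index.txt:42:404 Not Found http://example.com/x")
print(entry.file, entry.line, entry.description)
```

<p class="column8">
        File names that themselves contain a colon would need extra
        handling, which this sketch leaves out.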




<p class="column7"><em><strong>
       1.2.5 Splitting the text file to pieces</strong></em>


<ul>
	<li>You can split a very large document into pieces, e.g. according
            to top level headings, and convert each piece to HTML. This is
            also handy if you're planning to print slides for a class:
            split on headings into individual files, raise the font size
	            and print each file separately.</li>
</ul>
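
<p class="column8">
        The idea of splitting on top level headings can be sketched as
        follows. This is not the program's own implementation. It
        assumes that a top level heading starts at the left margin and
        is numbered N.0, as in this document (&quot;1.0 Document id&quot;,
        &quot;2.0 Other converters&quot;). The name <em class="word">split_on_top_headings</em> is
        hypothetical.

```python
import re
from typing import List

# Assumption: top level headings look like "1.0 Title" at the left margin.
TOP_HEADING = re.compile(r"^\d+\.0\s+\S")


def split_on_top_headings(text: str) -> List[str]:
    """Return the document as a list of pieces, one per top level heading.

    Any text before the first heading becomes a piece of its own.
    """
    pieces: List[List[str]] = []
    for line in text.splitlines(keepends=True):
        if TOP_HEADING.match(line) or not pieces:
            pieces.append([])          # start a new piece
        pieces[-1].append(line)
    return ["".join(piece) for piece in pieces]


document = (
    "1.0 Document id\n\n        First part.\n"
    "2.0 Other converters\n\n        Second part.\n"
)
parts = split_on_top_headings(document)
print(len(parts))  # 2
```

<p class="column8">
        Each piece could then be written to its own file and converted
        separately.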





  <a name="how_to_convert_text_files_into_html" id="how_to_convert_text_files_into_html"></a>
  <h2>
      1.3 How to convert text files into HTML?

  </h2>




<p class="column8">
        The TF specification can be found in the <a href="../manual">Manual</a>.
        The command used to generate this page was:

<p>
<table class="shade-normal">
    <tr>
    <td class="shade-normal-attrib" valign="top">
    <pre>
      t2html.pl                                                     \
      --author           &quot;Jari Aalto&quot;                               \
      --title            &quot;Conversion for text files&quot;                \
      --html-body         LANG=en                                   \
      --Out                                                         \
      index.txt    </pre></td>
    </tr>
</table>



  <a name="writing_a_text_document" id="writing_a_text_document"></a>
  <h2>
      1.4 Writing a text document

  </h2>




<p class="column8">
        All you need is a text editor that displays the current column
        number, or one that can be configured to advance
        TAB by 4 spaces. That's it.
        An Emacs minor mode (see package
        <a href="http://freecode.com/projects/emacs-tiny-tools">tinytf.el</a>) can
        make writing documents easy. The mode helps with formatting
        paragraphs, filling bullets, numbering headings and keeping the TOC
        up to date.



  <a name="ripping_program_documentation" id="ripping_program_documentation"></a>
  <h2>
      1.5 Ripping program documentation

  </h2>




<p class="column7"><em><strong>
       1.5.1 Documentation tools in programming languages</strong></em>



<p class="column8">
        <em class="word">Perl</em> is an exception among programming languages, because it
        includes an internal documentation syntax called <strong class="word">POD</strong> (Plain Old
        Documentation), with which you can embed documentation right into the
        program source. Deriving the documentation from Perl programs
        is a straightforward job. Another well known language
        (invented long after Perl) is Java, which calls the embedded
        documentation <em class="word">javadoc</em>. For all others, there is a need to
        write separate documentation.


<p class="column7"><em><strong>
       1.5.2 Other programming languages</strong></em>



<p class="column8">
        But it is possible to embed documentation inside any
        programming language: directly into the code. A small Perl
        utility can be used to extract the documentation provided it
        was written in TF format. Documentation is put at the
        beginning of the file and updated there. Program <samp class="word">ripdoc.pl</samp>
        extracts the documentation which follows TF guidelines. The
        idea is that you can generate HTML documents from the embedded
        'TF pod'. The conversion goes like this:

<p>
<table class="shade-normal">
    <tr>
    <td class="shade-normal-attrib" valign="top">
    <pre>
      ripdoc.pl code.sh | t2html.pl &gt; code.html
      ripdoc.pl code.el | t2html.pl &gt; code.html
      ripdoc.pl code.cc | t2html.pl &gt; code.html    </pre></td>
    </tr>
</table>


<p class="column8">
        This is suitable for awk, shell, sh, ksh, C++, Java, Lisp, Python,
        Tcl and similar programming languages. The only criterion is that the language
        supports a <em class="word">one-comment-starter</em> and that the documentation has
        been written by using it. Languages that have <em class="word">comment-start</em>
        and <em class="word">comment-end</em>, like C that has /* and */, are not suitable for
        ripdoc.pl.
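
<p class="column8">
        The extraction idea can be sketched in a few lines. This is
        not how ripdoc.pl is implemented; it is an illustration that
        assumes the documentation is one unbroken run of comment lines
        at the top of the file, that a leading shebang line is skipped,
        and that one space after the comment starter is dropped. The
        name <em class="word">rip_leading_comments</em> is hypothetical.

```python
from typing import List


def rip_leading_comments(source: str, starter: str = "#") -> str:
    """Collect the comment block at the top of a file, minus the starter."""
    out: List[str] = []
    for line in source.splitlines():
        if line.startswith("#!") and not out:
            continue                   # skip a shebang line before the docs
        if not line.startswith(starter):
            break                      # first non-comment line ends the block
        body = line[len(starter):]
        # Drop one separating space so TF columns line up again.
        out.append(body[1:] if body.startswith(" ") else body)
    return "\n".join(out)


script = (
    "#!/bin/sh\n"
    "# 1.0 Usage\n"
    "#\n"
    "#         Run the script without arguments.\n"
    "\n"
    "echo hello\n"
)
print(rip_leading_comments(script))
```

<p class="column8">
        With a single comment starter every documentation line can be
        recognized on its own, which is why languages that need a
        separate comment end are harder to handle this way.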

<hr>
	<a name="other_converters"  id="other_converters"></a>
	<h1>
	2.0 Other converters

	</h1>



  <a name="postscript" id="postscript"></a>
  <h2>
      2.1 Postscript

  </h2>



<ul>
	<li><em class="word">html2ps</em> converter by Jan Karrman <em><a href="mailto:jan@tdb.uu.se" >jan@tdb.uu.se</A></em> at
            <a href="http://www.tdb.uu.se/~jan/html2ps.html" >http://www.tdb.uu.se/~jan/html2ps.html</A>
<li>html to ps converter
            <a href="http://www.tdb.uu.se/~jan/html2ps.html" >http://www.tdb.uu.se/~jan/html2ps.html</A>
	</li>
<li>html to ps converter by Charlie's Perl at
	            <a href="http://www.antipope.org/charlie/webbook/essays/toolkit.html" >http://www.antipope.org/charlie/webbook/essays/toolkit.html</A></li>
</ul>





  <a name="texinfo" id="texinfo"></a>
  <h2>
      2.2 Texinfo

  </h2>



<ul>
	<li>See page <a href="http://www.fido.de/kama/texinfo/texinfo-en.html" >http://www.fido.de/kama/texinfo/texinfo-en.html</A>
            where you can find the C program <em class="word">html2texinfo</em>
<li>Perl program <em class="word">html2texi.pl</em>
            <a href="http://www.cs.washington.edu/homes/mernst/software/#html2texi" >http://www.cs.washington.edu/homes/mernst/software/#html2texi</A>
            html2texi converts HTML documentation trees into Texinfo
            format.  Texinfo format can be easily converted to Info format
            (for browsing in Emacs or the stand alone Info browser), to a
            printed manual, or to HTML. Thus, html2texi.pl permits
            conversion of HTML files to Info format, and secondarily
            enables producing printed versions of Web page
            hierarchies. Unlike HTML, Info format is searchable. Since Info
            is integrated into Emacs, one can read documentation without
            starting a separate Web browser. Additionally, Info browsers
            (including Emacs) contain convenient features missing from Web
	            browsers, such as easy index lookup and mouse-free browsing.</li>
</ul>





  <a name="other_text_to_html_tools" id="other_text_to_html_tools"></a>
  <h2>
      2.3 Other text to HTML tools

  </h2>



<ul>
	<li><em class="word">asciidoc</em> Python program to convert text files.
            <a href="http://sourceforge.net/projects/asciidoc" >http://sourceforge.net/projects/asciidoc</A>
<li><em class="word">t2php</em> Implementation in PHP language of the
            technical format. Visit
            <a href="http://rule-project.org/text/en/sw/t2php.txt" >http://rule-project.org/text/en/sw/t2php.txt</A>
	</li>
<li><em class="word">Wiki</em>, a simple text rule mark up.
            <a href="http://c2.com/cgi/wiki?TextFormattingRules" >http://c2.com/cgi/wiki?TextFormattingRules</A>
	</li>
<li><em class="word">Zope</em> Structured text, which seems to rely on indentation
            level as well. The tool has been written in Python language.
            See <a href="http://www.zope.org/Documentation/Articles/STX" >http://www.zope.org/Documentation/Articles/STX</A> and
            <a href="http://www.zope.org/Members/millejoh/structuredText" >http://www.zope.org/Members/millejoh/structuredText</A>
	</li>
<li><em class="word">htmlpp</em> by iMATIX is at <a href="http://www.imatix.com/" >http://www.imatix.com/</A>. This
            is like a C preprocessor, with complex
            and powerful text-markup commands. The base file
	            for html generation is not easily text-readable.
<p>


            See also <a href="http://www.imatix.com/html/gslgen/index.htm" >http://www.imatix.com/html/gslgen/index.htm</A> GSLgen is
            a general-purpose file generator. It generates source code,
            data, or other files from an XML file and a schema file. The
            XML file defines a particular set of data. The schema file
	            tells GSLgen what to do with that data.</li>
</ul>



<ul>
	<li><em class="word">No-TagsMarkup</em> by Scott S. Lawton. Another interesting
            plain-text style, similar to TF, is at
            <a href="http://www.prefab.com/ssl/notagsmarkup.html" >http://www.prefab.com/ssl/notagsmarkup.html</A> . Compared to TF,
            this style needs more markup and lacks some of the advanced
            features like Frame/colour/CSS2 support.
<li><em class="word">setext</em> by Ian Feldman, a simple text markup, is available at
            <em><a href="mailto:setext@tidbits.com" >setext@tidbits.com</A></em>
	</li>
<li><em class="word">text2html.pl</em> by Seth Golub, a Perl script, is at
            <a href="http://www.cs.wustl.edu/~seth/txt2html/" >http://www.cs.wustl.edu/~seth/txt2html/</A>. This is a very good tool
            if you want to convert mail messages into HTML quickly. Use it for
            ad hoc things.
	</li>
<li><em class="word">faq2text</em>, a C-based (Unix) text to HTML converter at
            <a href="http://www.fadden.com/dl-misc/#faq2html" >http://www.fadden.com/dl-misc/#faq2html</A>
	<li><em class="word">faq2html</em> <a href="ftp://ftp.eyrie.org/pub/software/web/faq2html" >ftp://ftp.eyrie.org/pub/software/web/faq2html</A></li>
</ul>





  <a name="other_utilities" id="other_utilities"></a>
  <h2>
      2.4 Other Utilities

  </h2>



<ul>
	<li> <a href="http://www.oreilly.com/catalog/docbook">DocBook - SGML online book</a>
<li> <a href="http://www.mathematik.uni-kl.de/~obachman/Texi2html">Texi2html</a>
             Perl script.
	</li>
<li> <a href="http://www.w3.org/People/Raggett/tidy/">HTML tidy</a>
             remove extra markup.
	</li>
<li> <a href="http://www.physics.purdue.edu/~hinson/ftl">FTL</a>
             Latex like Perl formatting
	</li>
<li> <a href="http://www.cs.ust.hk/~otfried/Hyperlatex/">Hyperlatex</a>
             &quot;Hyperlatex is a package that allows you to prepare documents
             in HTML, and, at the same time, to produce a neatly printed
             document from your input. Unlike some other systems that you
             may have seen, Hyperlatex is not a general LaTeX-to-HTML
             converter. In my eyes, conversion is not a solution to HTML
             authoring. A well written HTML document must differ from a
             printed copy in a number of rather subtle ways. I doubt that
             these differences can be recognized mechanically, and I
             believe that converted LaTeX can never be as readable as a
             document written in HTML.  The basic idea of Hyperlatex is to
             make it possible to write a document that will look like a
             flawless LaTeX document when printed and like a handwritten
	             HTML document when viewed with an HTML browser.&quot;</li>
</ul>



<ul>
	<li> <a href="http://www.cs.washington.edu/homes/mernst/software/#html2texi">html2texi</a>
             &quot;html2texi converts HTML documentation trees into Texinfo format.
             Texinfo format can be easily converted to Info format (for browsing
             in Emacs or the stand alone Info browser), to a printed manual, or
             to HTML. Thus, html2texi.pl permits conversion of HTML files to
             Info format, and secondarily enables producing printed versions of
             Web page hierarchies. Unlike HTML, Info format is searchable. Since
             Info is integrated into Emacs, one can read documentation without
             starting a separate Web browser. Additionally, Info browsers
             (including Emacs) contain convenient features missing from Web
             browsers, such as easy index lookup and mouse-free browsing.&quot;
<li> <a href="http://www.kfa-juelich.de/isr/1/texconv/textopc.html">RTF in PC</a>
	</li>
<li> <a href="http://packages.debian.org/unstable/text/catdoc.html">catdoc</a>
             Viewing MS WORD files.
             Catdoc is simple, one C source file, compiles in any system (DOS;
             Unix). Feed MS word file to it and it gives 7bit text out of it.
	</li>
<li> <a href="ftp://ftp.dante.de:/pub/tex/tools/word2x/">word2x</a>
             Viewing MS WORD files.
	</li>
<li> <a href="http://www.csn.ul.ie/~caolan/docs/MSWordView.html">MSWordView</a>
             &quot;MSWordView is a program that can understand the microsofts word
             8 binary file format (office97), it currently converts word into
             html, which can then be read with a browser.&quot;
	</li>
<li> <a href="http://wwwwbs.cs.tu-berlin.de/~schwartz/pmh/">Laola</a>
             Viewing MS WORD files.
             &quot;Laola(perl) does a respectable job of taking MSWord files to text
             ...LAOLA is giving access to the raw document streams of any program
             using &quot;structured storage&quot; technology to save its documents.
             ELSER is dealing especially with these streams as they are present
	             in Word 6 and Word 7 documents.&quot;</li>
</ul>





  <a name="general_document_maintenance_tools" id="general_document_maintenance_tools"></a>
  <h2>
      2.5 General Document Maintenance tools

  </h2>



<ul>
	<li>The FAQ maintainer toolset is at the following page:
            <a href="http://www.qucis.queensu.ca/FAQs/FAQaid/" >http://www.qucis.queensu.ca/FAQs/FAQaid/</A> It contains all the
            known tools to make your FAQ maintenance/posting/updating easier
	            on any platform.</li>
</ul>





<!-- ......................................................................
     DOCUMENT END BLOCK
     ......................................................................
-->

<!--


-->
<hr>

<em    class="footer">Html date: 2010-03-02 12:09<br>

</em>

</body>
</html>