<HTML>
<HEAD>
<TITLE>html2latex(1)</TITLE>
</HEAD>
<BODY>
<H1>NAME</H1>
<P>html2latex -- convert HTML markup to LaTeX markup </P>
<P>The original author, Nathan Torkington, wrote:<BR>
"The source is available
<A HREF="http://www.vuw.ac.nz/non-local/software/html2latex-0.9c.tar.Z">here</A>",
<BR>
but this is obviously no longer the case. <BR>
Instead, I (W.Hennings) put it <A HREF="bin/html2ltx.zip">here</A>. This
zip-file includes an msdos executable. <BR>
There is another compiled version on CTAN which I also put
<A HREF="bin/html2LaTeX.zip">here</A> (see <A
HREF="html2LaTeX.txt">description</A>), but both result from the same source.
</P>
<P></P>
<H1>SYNOPSIS</H1>
<P><TT>html2latex <I>[opt ...] [file ...]</I></TT> </P>
<H1>DESCRIPTION</H1>
<P>For each file argument, <I>html2latex</I> converts the text from HTML markup
to LaTeX markup. If no files are specified, a usage message is printed. Input
is taken from standard input for files named <EM>-</EM>. Output goes to a
similarly named file with a <B>.tex</B> extension (<I>html2latex</I> recognises
<B>.html</B> extensions). </P>
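<P>The output naming rule can be sketched as follows. This is only an
illustration, not the tool's actual code: the function name
<TT>output_name</TT> is hypothetical, and the sketch assumes that a trailing
<B>.html</B> is replaced by <B>.tex</B>, while any other name simply has
<B>.tex</B> appended (as with <B>html-intro</B> in the examples). </P>
<PRE>
```python
def output_name(input_name):
    """Guess the .tex file name html2latex would write for input_name.

    Assumption (not confirmed by the manual): a trailing ".html" is
    replaced by ".tex"; any other name just gets ".tex" appended.
    """
    suffix = ".html"
    if input_name.endswith(suffix):
        # Drop the recognised extension before adding ".tex".
        return input_name[:-len(suffix)] + ".tex"
    return input_name + ".tex"
```
</PRE>
<P>Under these assumptions, <TT>output_name("file.html")</TT> gives
<B>file.tex</B> and <TT>output_name("html-intro")</TT> gives
<B>html-intro.tex</B>. </P>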
<P>Options modify the action of <I>html2latex</I>. The options are: </P>
<DL>
<DT>-n</DT>
<DD>Number sections. </DD>
<DT>-p</DT>
<DD>Place page breaks after the title page (if present) and the table of
contents (if present). </DD>
<DT>-c</DT>
<DD>Generate a table of contents. </DD>
<DT>-s</DT>
<DD>Create no files -- LaTeX is output to stdout. </DD>
<DT>-t Title</DT>
<DD>Generate a title page, with the title 'Title'. </DD>
<DT>-a Author</DT>
<DD>Generate a title page, with the author 'Author'. </DD>
</DL>
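<P>When <I>html2latex</I> is driven from a script, the options above might be
assembled as in the following sketch. The helper <TT>build_command</TT> and
its parameter names are hypothetical; only the flags themselves come from the
list above, and the order in which they are emitted is an arbitrary choice.
</P>
<PRE>
```python
def build_command(files, number=False, page_breaks=False, contents=False,
                  to_stdout=False, title=None, author=None):
    """Return an argument list for html2latex (hypothetical helper)."""
    argv = ["html2latex"]
    if number:
        argv.append("-n")        # number sections
    if page_breaks:
        argv.append("-p")        # page breaks after title page and contents
    if contents:
        argv.append("-c")        # generate a table of contents
    if to_stdout:
        argv.append("-s")        # create no files, write LaTeX to stdout
    if title is not None:
        argv += ["-t", title]    # title page with this title
    if author is not None:
        argv += ["-a", author]   # title page with this author
    return argv + list(files)
```
</PRE>
<P>The resulting list can be passed to Python's <TT>subprocess.run</TT>,
which avoids shell quoting problems for titles that contain spaces. </P>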
<H1>EXAMPLES</H1>
<P>An example of use is <TT>html2latex -n - &lt; file.html | less</TT>. This converts
<B>file.html</B> to LaTeX and pages through the output. The sections
(corresponding to heading tags in the HTML source) will be numbered. </P>
<P>Another example is <TT>html2latex -t 'Introduction to HTML' -a gnat -p -c
html-intro</TT>. This takes input from the file <B>html-intro</B>, writes to
<B>html-intro.tex</B>, and adds a title page (with title <I>Introduction to
HTML</I> and author <I>gnat</I>) and a table of contents, with page breaks after
both. The sections of the document are not numbered. </P>
<H1>BUGS</H1>
<P>Currently the only HTML tags supported are: <B>TITLE, H1, H2, H3, H4, H5, H6,
UL, OL, DL, DT, DD, LI, B, I, U, EM, STRONG, CODE, SAMP, KBD, VAR, DFN, CITE,
LISTING</B>. The only recognised SGML escapes are <B>&amp;amp;, &amp;lt;,
&amp;gt;</B>. <B>ADDRESS</B> tags are handled badly. </P>
<P>The <B>COMPACT</B> attribute of a <B>DL</B> tag is not recognised.
<B>MENU</B> and <B>DIR</B> styles are not handled well. <B>TITLE</B> text is
ignored. </P>
<P>Currently <B>PRE</B> tags are not handled at all. </P>
<P>The entire file is read into memory. For long HTML documents on machines
with little memory, this may cause problems. </P>
<H1>CREDITS</H1>
<P>Nathan Torkington adapted the HTML parser from NCSA's Xmosaic package
(<B>file://ncsa.uiuc.edu/Web/xmosaic</B>) and wrote the conversion code. The
HTML parser code is subject to the NCSA restrictions. The conversion code is
subject to the VUW restrictions. Enquiries should be sent via e-mail to
<TT>Nathan.Torkington "at" vuw.ac.nz</TT>. </P>
<P></P>
<HR>
<P>This HTML page is part of the texconv pages.<BR>
Copyright © 1998, 1999, 2000, 2001 Wilfried Hennings<BR>
You may copy and redistribute it under the following conditions:</P>
<UL>
<LI>it must remain intact and the contents unchanged; if you'd like to have
something changed, contact me (W.Hennings "at" fz-juelich.de). Reformatting (e.g.
from HTML to some other presentation format) is granted as long as the contents
are unchanged. </LI>
<LI>you may NOT ask money for it except a reasonable cost for media and
distribution</LI>
</UL>
<P>Please also note the <A HREF="index.html#disclaimer">disclaimer</A>.</P>
</BODY>
</HTML>