<!-- manual page source format generated by PolyglotMan v3.2, -->
<!-- available at http://polyglotman.sourceforge.net/ -->
<html>
<head>
<title>PolyglotMan(1) Manual Page</title>
</head>
<body bgcolor='white'>
<a href='#toc'>Table of Contents</a><p>
<h2><a name='sect0' href='#toc0'>Name</a></h2>
PolyglotMan, rman - reverse compile man pages from formatted form to
a number of source formats
<h2><a name='sect1' href='#toc1'>Synopsis</a></h2>
rman [ <i>options </i>] [ <i>file </i>]
<h2><a name='sect2' href='#toc2'>Description</a></h2>
<i>PolyglotMan </i> takes man pages from most of the popular flavors of UNIX
and transforms them into any of a number of text source formats. PolyglotMan
was formerly known as RosettaMan. The binary is still named
<i>rman</i>, for scripts that depend on that name; mnemonically, just think "reverse
man". Previously <i> PolyglotMan </i> required pages to be formatted by nroff(1)
prior to its processing. With version 3.0, it <i>prefers [tn]roff source </i> and
usually produces better results from it. Source processing is also
the only way to translate tables. Source translation is not as mature
as formatted translation, however, so try formatted translation as a backup. <p>
In parsing
[tn]roff source, one could implement an arbitrarily large subset of [tn]roff,
which I did not and will not do, so the results can be off. I did implement
a significant subset of the requests used in man pages, however, including tbl
(but not eqn), if tests, and general macro definitions, so usually the
results look great. If they don’t, format the page with nroff before sending
it to PolyglotMan. If PolyglotMan doesn’t recognize a key macro used by
a large class of pages, e-mail me the source and a uuencoded nroff-formatted
page and I’ll see what I can do. When running PolyglotMan on man page
source that includes or redirects to other [tn]roff source using the .so
(source inclusion) request, you should be in the parent of the directory
that contains the page, since pages are written with this assumption. For example, if
you are translating /usr/man/man1/ls.1, first cd into /usr/man. <p>
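The directory rule above can be sketched in a few lines of Python. This is only an illustration, not part of PolyglotMan: the helper names <i>so_base_directory</i> and <i>resolve_so</i> are hypothetical, and the sketch assumes the conventional <i>BASE/manN/name.N</i> layout.
<pre>
```python
import posixpath


def so_base_directory(page_path):
    """Return the directory to cd into before translating page_path.

    Assumes the conventional layout BASE/manN/name.N, so the base is
    two levels above the page file.
    """
    return posixpath.dirname(posixpath.dirname(page_path))


def resolve_so(base_dir, so_line):
    """Resolve a '.so man1/foo.1' request relative to base_dir."""
    target = so_line.split(None, 1)[1].strip()
    return posixpath.join(base_dir, target)


base = so_base_directory("/usr/man/man1/ls.1")
print(base)                               # /usr/man
print(resolve_so(base, ".so man1/ls.1"))  # /usr/man/man1/ls.1
```
</pre>
Because the target of a .so request is written relative to that base directory, running from anywhere else makes the included file unreachable. <p>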
<i>PolyglotMan
</i> accepts man pages from:
<blockquote>
SunOS, Sun Solaris, Hewlett-Packard HP-UX, AT&amp;T
System V, OSF/1 aka Digital UNIX, DEC Ultrix, SGI IRIX, Linux, FreeBSD,
SCO.
</blockquote>
Source processing works for:
<blockquote>
SunOS, Sun Solaris, Hewlett-Packard HP-UX,
AT&amp;T System V, OSF/1 aka Digital UNIX, DEC Ultrix.
</blockquote>
It can produce
<blockquote>
printable
ASCII-only (control characters stripped), section headers-only, Tk, TkMan,
[tn]roff (traditional man page source), SGML, HTML, MIME, LaTeX, LaTeX2e,
RTF, Perl 5 POD.
</blockquote>
A modular architecture makes it easy to add further
output formats. <p>
The latest version of PolyglotMan is available from <i> <a href='http://polyglotman.sourceforge.net/'>http://polyglotman.sourceforge.net/</a>
</i>.
<h2><a name='sect3' href='#toc3'>Options</a></h2>
The following options should not be combined with any others; they
cause PolyglotMan to exit without processing any input.
<dl>
<dt>-h|--help </dt>
<dd>Show list of command
line options and exit. </dd>
<dt>-v|--version </dt>
<dd>Show version number and exit. </dd>
</dl>
<p>
<i>You should
specify the filter first, as this sets a number of parameters, and then
specify other options.</i>
<dl>
<dt>-f|--filter <i>&lt;ASCII|roff|TkMan|Tk|Sections|HTML|SGML|MIME|LaTeX|LaTeX2e|RTF|POD&gt;
</i></dt>
<dd>Set the output filter. Defaults to ASCII. </dd>
<dt>-S|--source </dt>
<dd>PolyglotMan tries to
automatically determine whether its input is source or formatted; use
this option to declare source input. </dd>
<dt>-F|--format|--formatted </dt>
<dd>PolyglotMan tries
to automatically determine whether its input is source or formatted; use
this option to declare formatted input. </dd>
<dt>-l|--title <i>printf-string </i> </dt>
<dd>In HTML mode
this sets the &lt;TITLE&gt; of the man pages, given the same parameters as <i>-r </i>.
</dd>
<dt>-r|--reference|--manref <i>printf-string </i> </dt>
<dd>In HTML and SGML modes this sets the URL
form by which to retrieve other man pages. The string can use two supplied
parameters: the man page name and its section. (See the Examples section.)
If the string is null (as if set from a shell by "-r ''"), ‘-’ or ‘off’, then
man page references will not be HREFs, just set in italics. If your printf
supports XPG3 positional specifiers, this can be quite flexible. </dd>
<dt>-V|--volumes
<i>&lt;colon-separated list&gt; </i> </dt>
<dd>Set the list of valid volumes to check against when
looking for cross-references to other man pages. Defaults to <i>1:2:3:4:5:6:7:8:9:o:l:n:p
</i>(volume names can be multicharacter). If a non-whitespace string in the
page is immediately followed by a left parenthesis, then one of the valid
volumes, then optional other characters, and finally a right parenthesis,
that string is reported as a reference to another manual page. If the
-V string starts with an equals sign, then no optional characters are allowed
between the match to the list of valid volumes and the right parenthesis. (This
option is needed for SCO UNIX.) </dd>
</dl>
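<p>
The cross-reference rule described under -V can be approximated with a regular expression. The following Python sketch is an assumption-laden illustration, not PolyglotMan's actual matching code: the function names are hypothetical, and it assumes that neither the page name nor the optional trailing characters contain whitespace or parentheses.
<pre>
```python
import re

DEFAULT_VOLUMES = "1:2:3:4:5:6:7:8:9:o:l:n:p"


def build_xref_pattern(volumes=DEFAULT_VOLUMES):
    """Compile a pattern for name(volume...) cross-references."""
    # A leading equals sign forbids extra characters after the volume.
    strict = volumes.startswith("=")
    names = [v for v in volumes.lstrip("=").split(":") if v]
    alternatives = "|".join(re.escape(v) for v in names)
    suffix = "" if strict else r"[^\s()]*"
    return re.compile(r"([^\s()]+)\(((?:%s)%s)\)" % (alternatives, suffix))


def find_xrefs(text, volumes=DEFAULT_VOLUMES):
    """Return (name, section) pairs for every reference found in text."""
    return build_xref_pattern(volumes).findall(text)


print(find_xrefs("See ls(1) and printf(3S) for details."))
print(find_xrefs("See ls(1) and printf(3S) for details.", "=1:2:3"))
```
</pre>
With the default list both <i>ls(1)</i> and <i>printf(3S)</i> are reported; with the strict list <i>=1:2:3</i> only <i>ls(1)</i> matches, because the trailing <i>S</i> is no longer permitted.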
<p>
The following options apply only when
formatted pages are given as input. With source input they either do not
apply or are always handled correctly.
<dl>
<dt>-b|--subsections </dt>
<dd>Try to recognize subsection
titles in addition to section titles. This can cause problems on some UNIX
flavors. </dd>
<dt>-K|--nobreak </dt>
<dd>Indicate manual pages don’t have page breaks, so don’t
look for footers and headers around them. (Older nroff -man macros always
put in page breaks, but lately some vendors have realized that printouts
are made through troff(1), whereas nroff -man is used to format pages for
reading on screen, and so have eliminated page breaks.) <i>PolyglotMan </i> usually
gets this right even without this flag. </dd>
<dt>-k|--keep </dt>
<dd>Keep headers and footers,
as a canonical report at the end of the page.
<!-- changeleft Move changebars,
such as those found in the Tcl/Tk manual pages, to the left. -->
<!-- notaggressive
<i>Disable </i> aggressive man page parsing. Aggressive manual page parsing, which is on by
default, elides headers and footers, identifies sections
and more. --> </dd>
<dt>-n|--name <i>name </i> </dt>
<dd>Set name of man page (used in roff format). If the
filename is given in the form " <i>name </i>. <i>section </i>", the name and section
are automatically determined. If the page is being parsed from [tn]roff
source and it has a .TH line, this information is extracted from that line.
</dd>
<dt>-p|--paragraph </dt>
<dd>Paragraph mode toggle. The filter determines whether lines
should be linebroken as they were by nroff, or whether lines should be
flowed together into paragraphs. Mainly for internal use. </dd>
<dt>-s|--section <i># </i> </dt>
<dd>Set
volume (aka section) number of man page (used in roff format).
<!-- tables
Turn on aggressive table parsing. --> </dd>
<dt>-t|--tabstops <i># </i> </dt>
<dd>For those macro sets that
use tabs in place of spaces where possible in order to reduce the number
of characters used, set tabstops every <i># </i> columns. Defaults to 8. </dd>
</dl>
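<p>
To make the effect of -t concrete, here is a minimal Python sketch of tab expansion at fixed tabstops. It relies on Python's built-in <i>str.expandtabs</i>; the wrapper name <i>expand_tabs</i> is hypothetical and the code only illustrates the idea, not how PolyglotMan implements it.
<pre>
```python
def expand_tabs(line, tabstops=8):
    """Replace each tab with spaces up to the next multiple of tabstops."""
    return line.expandtabs(tabstops)


# The same input occupies different widths under different tabstop settings.
print(repr(expand_tabs("ab\tc")))     # 'ab      c'
print(repr(expand_tabs("ab\tc", 4)))  # 'ab  c'
```
</pre>
If the tabstop setting does not match the one the macro set assumed, columns in the formatted page no longer line up.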
<h2><a name='sect4' href='#toc4'>Notes
on Filter Types</a></h2>
<h3><a name='sect5' href='#toc5'>Roff</a></h3>
Some flavors of UNIX ship man pages without [tn]roff
source, making one’s laser printer little more than a laser-powered daisy
wheel. This filter tries to intuit the original [tn]roff directives, which
can then be recompiled by [tn]roff.
<h3><a name='sect6' href='#toc6'>TkMan</a></h3>
TkMan(1), a hypertext man page
browser, uses <i>PolyglotMan </i> to show man pages without the (usually) useless
headers and footers on each page. It also collects section and (optionally)
subsection heads for direct access from a pulldown menu. TkMan and Tcl/Tk,
the toolkit in which it’s written, are available via anonymous ftp from
<a href="ftp://ftp.smli.com/pub/tcl/"><i>ftp://ftp.smli.com/pub/tcl/ </i></a>
<h3><a name='sect7' href='#toc7'>Tk</a></h3>
This option outputs the text in a series of
Tcl lists consisting of text-tags pairs, where tag names roughly correspond
to HTML. This output can be inserted into a Tk text widget by doing an
<i> eval &lt;textwidget&gt; insert end &lt;text&gt; </i>. This format should be relatively easy
for other programs that want both the text and the tags to parse. See also
ASCII.
<h3><a name='sect8' href='#toc8'>Ascii</a></h3>
When printed on a line printer, man pages try to produce special
text effects by overstriking characters with themselves (to produce bold)
and underscores (underlining). Other text processing software, such as
text editors, searchers, and indexers, must counteract this. The ASCII
filter strips away this formatting. Piping nroff output through <i>col -b </i>
also strips away this formatting, but it leaves behind unsightly page
headers and footers. Also see Tk.
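<p>
The overstrike convention described above (a character, a backspace, then the character printed over it) can be undone with a single substitution. This Python sketch is illustrative only and uses a hypothetical function name; like <i>col -b</i>, it removes the text effects but does nothing about page headers and footers.
<pre>
```python
import re

# An overstrike pair is any character followed by a backspace (0x08);
# dropping the pair keeps only the character printed last.
_OVERSTRIKE = re.compile(".\x08")


def strip_overstrikes(text):
    """Remove nroff-style bold and underline overstrikes from text."""
    return _OVERSTRIKE.sub("", text)


bold = "N\x08NA\x08AM\x08ME\x08E"   # each letter struck over itself
underlined = "_\x08l_\x08s"         # underscore struck over by the letter
print(strip_overstrikes(bold))        # NAME
print(strip_overstrikes(underlined))  # ls
```
</pre>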
<h3><a name='sect9' href='#toc9'>Sections</a></h3>
Dumps section and (optionally)
subsection titles. This might be useful for another program that processes
man pages.
<h3><a name='sect10' href='#toc10'>HTML</a></h3>
With a simple extension to an HTTP server for Mosaic(1) or
other World Wide Web browser, <i>PolyglotMan </i> can produce high quality HTML
on the fly. Several such extensions and pointers to several others are
included in <i>PolyglotMan </i>’s <i>contrib </i> directory.
<h3><a name='sect11' href='#toc11'>Sgml</a></h3>
This is approaching the
Docbook DTD, but I’m hoping that someone with a real interest in this will
polish the tags generated. Try it to see how close the tags are now.
<h3><a name='sect12' href='#toc12'>MIME</a></h3>
MIME
(Multipurpose Internet Mail Extensions) as defined by RFC 1563, good for
consumption by MIME-aware e-mailers or as Emacs (>=19.29) enriched documents.
<h3><a name='sect13' href='#toc13'>LaTeX and LaTeX2e</a></h3>
Why not?
<h3><a name='sect14' href='#toc14'>Rtf</a></h3>
Use output on Mac or NeXT or whatever. Maybe
take random man pages and integrate them better with NeXT’s documentation
system. Maybe NeXT has its own man page macros that do this.
<h3><a name='sect15' href='#toc15'>PostScript
and FrameMaker</a></h3>
To produce PostScript, use <i>groff </i> or <i>psroff </i>. To produce
FrameMaker MIF, use FrameMaker’s builtin filter. In both cases you need
<i>[tn]roff </i> source, so if you only have a formatted version of the manual
page, use <i>PolyglotMan </i>’s roff filter first.
<h2><a name='sect16' href='#toc16'>Examples</a></h2>
To convert the <i>formatted
</i> man page named <i>ls.1 </i> back into [tn]roff source form: <p>
<i>rman -f roff /usr/local/man/cat1/ls.1
> /usr/local/man/man1/ls.1 </i> <br>
<p>
Long man pages are often compressed to conserve space (compression is
especially effective on formatted man pages, as many of the characters
are spaces). A long man page probably also has subsections, which
we try to separate out with -b (some macro sets don’t distinguish subsections well
enough for <i>PolyglotMan </i> to detect them). Let’s convert such a page to LaTeX format:
<br>
<p>
<i>pcat /usr/catman/a_man/cat1/automount.z | rman -b -n automount -s 1 -f latex
> automount.man </i> <br>
<p>
Alternatively, <i>man 1 automount | rman -b -n automount -s 1 -f latex > automount.man
</i> <br>
<p>
For HTML/Mosaic users, <i>PolyglotMan </i> can, without modification of the source
code, produce HTML links that point to other HTML man pages either pregenerated
or generated on the fly. First let’s assume pregenerated HTML versions of
man pages stored in <i>/usr/man/html </i>. Generate these one-by-one with the following
form: <br>
<i>rman -f html -r 'http:/usr/man/html/%s.%s.html'
/usr/man/cat1/ls.1 > /usr/man/html/ls.1.html
</i> <br>
<p>
If you’ve extended your HTML client to generate HTML on the fly you should
use something like: <br>
<i>rman -f html -r 'http:~/bin/man2html?%s:%s'
/usr/man/cat1/ls.1 </i> <br>
when generating HTML.
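<p>
The way the -r string combines with a page name and section can be sketched in Python, whose <i>%</i> operator follows the same <i>%s</i> convention as printf. This is a hedged illustration with a hypothetical function name, not PolyglotMan's code, and it assumes the template contains exactly two <i>%s</i> specifiers and no other percent signs.
<pre>
```python
def format_reference(template, name, section):
    """Return the URL for name(section), or None when links are disabled."""
    # A null string, "-" or "off" turns hyperlinks off entirely.
    if template in ("", "-", "off"):
        return None
    return template % (name, section)


print(format_reference("http:/usr/man/html/%s.%s.html", "ls", "1"))
print(format_reference("http:~/bin/man2html?%s:%s", "ls", "1"))
print(format_reference("off", "ls", "1"))
```
</pre>
The first two calls yield <i>http:/usr/man/html/ls.1.html</i> and <i>http:~/bin/man2html?ls:1</i>; the third yields no URL, which corresponds to the reference being set in italics only.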
<h2><a name='sect17' href='#toc17'>Bugs/Incompatibilities</a></h2>
<i>PolyglotMan </i> is not perfect
in all cases, but it usually does a good job, and in any case reduces
the problem of converting man pages to light editing. <p>
Tables in formatted
pages, especially H-P’s, aren’t handled very well. Be sure to pass in source
for the page to recognize tables. <p>
The man pager <i>woman</i>(1) applies its own
idea of formatting for man pages, which can confuse <i>PolyglotMan </i>. Bypass
<i> woman </i> by passing the formatted manual page text directly into <i>PolyglotMan
</i>. <p>
The [tn]roff output format uses \fB to turn on boldface. If your macro
set requires .B, you’ll have to postprocess the <i>PolyglotMan </i> output.
<h2><a name='sect18' href='#toc18'>See
Also</a></h2>
<i>tkman(1)</i>, <i>xman(1)</i>, <i>man(1)</i>, <i>man(7)</i> or <i>man(5)</i> depending on your
flavor of UNIX
<h2><a name='sect19' href='#toc19'>Author</a></h2>
PolyglotMan <br>
by Thomas A. Phelps ( <i>phelps@ACM.org </i>) <br>
developed at the <br>
University of California, Berkeley <br>
Computer Science Division <p>
Manual page last updated on $Date: 1998/07/13
09:47:28 $ (with text patch for Debian) <p>
<hr><p>
<a name='toc'><b>Table of Contents</b></a><p>
<ul>
<li><a name='toc0' href='#sect0'>Name</a></li>
<li><a name='toc1' href='#sect1'>Synopsis</a></li>
<li><a name='toc2' href='#sect2'>Description</a></li>
<li><a name='toc3' href='#sect3'>Options</a></li>
<li><a name='toc4' href='#sect4'>Notes on Filter Types</a></li>
<ul>
<li><a name='toc5' href='#sect5'>Roff</a></li>
<li><a name='toc6' href='#sect6'>TkMan</a></li>
<li><a name='toc7' href='#sect7'>Tk</a></li>
<li><a name='toc8' href='#sect8'>Ascii</a></li>
<li><a name='toc9' href='#sect9'>Sections</a></li>
<li><a name='toc10' href='#sect10'>HTML</a></li>
<li><a name='toc11' href='#sect11'>Sgml</a></li>
<li><a name='toc12' href='#sect12'>MIME</a></li>
<li><a name='toc13' href='#sect13'>LaTeX and LaTeX2e</a></li>
<li><a name='toc14' href='#sect14'>Rtf</a></li>
<li><a name='toc15' href='#sect15'>PostScript and FrameMaker</a></li>
</ul>
<li><a name='toc16' href='#sect16'>Examples</a></li>
<li><a name='toc17' href='#sect17'>Bugs/Incompatibilities</a></li>
<li><a name='toc18' href='#sect18'>See Also</a></li>
<li><a name='toc19' href='#sect19'>Author</a></li>
</ul>
</body>
</html>