WP2X(1) WP2X(1)
NAME
wp2x - A WordPerfect 5.0 to whatever converter
SYNOPSIS
wp2x [ -s ] [ -v ] [ -nblip ] configfile wpfile
DESCRIPTION
Wp2x is intended to convert simple files stored in
WordPerfect 5.1 format into any other document processing
language that uses plain text files. Examples include TeX,
LaTeX, troff, GML and HTML.
Wp2x reads a configuration file and a WordPerfect 5.1
input file, and uses the information in them to produce an
output file, which is sent to stdout. If the configuration
file cannot be found, a suffix of .cfg is appended.
The current directory is searched, as well as the lib
directory specified by the WP2X_DIR variable in the
Makefile (usually /usr/local/lib/wp2x) and the directories
specified by the environment variables PATH, DPATH, and
WP2XLIB.
Some codes are not translated because documents that
require these codes typically would require significant
hand-editing. Hence, there's no point in trying to emulate
something you're going to delete anyway. (Remember,
wp2x is not intended to be used as an automated conversion
program. Rather, it is intended to be used as a single
step in the document conversion process, which gets most
of the grunt work of conversion done and out of the
way, so that you can concentrate your efforts on converting
the trickier parts of the document. The object of the
game is to produce a readable conversion, rather than a
perfect conversion.)
As the program runs, a dot is printed to stderr for every
1024 characters converted. This can be suppressed with
the -s switch, and the interval between dots can be
changed with the -n switch.
OPTIONS
-s Suppresses all non-error output to stderr, including
the typeout banner, the progress dots, and
warnings about undefined expansions.
-nblip Every blip tokens, a dot is emitted to stderr,
unless the -s switch is given. The value blip must
appear immediately following the -n without an
intervening space. If no -n switch is supplied,
then a value of 1024 is assumed.
-v Prints the version number and the program usage.
USAGE
The configuration file controls how the file is converted
from WordPerfect 5.1 format. Each line of the configuration
file is of the form
identifier="list of codes"
where the list of codes is a string which will be placed
in the output stream whenever the corresponding WordPerfect
code is encountered. Standard C-style backslash-escape
sequences are recognized, as well as \xFF for hex
values. You do not have to backslash-protect a newline.
Some identifiers supply replaceable parameters, which can
be interpolated as follows:
%1 interpolate first parameter as a decimal integer.
%2 interpolate second parameter as a decimal integer.
%c interpolate first parameter as an ASCII character.
%\n interpolate a newline if the most-recently-output
character was not already a newline. (The \n can
be either the C-style escape sequence, or an actual
newline character.) Use this if the expansion must
take place at the beginning of a line. (For example,
troff control characters must appear as the
first character in the line in order to take
effect.) This sequence is meaningful only at the
beginning of the string; if it appears elsewhere,
it is flagged as erroneous.
%% interpolate a percent-sign.
A percent sign followed by any other character is considered
an error. It is also an error to interpolate a
parameter that is not applicable to the identifier being
defined. You may interpolate the parameters as many times
as you wish, and in whatever order (with the exception
of the %\n code).
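The interpolation rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described here, not the code wp2x itself uses. The function name expand and its arguments are made up for the example, and the template is assumed to have had its backslash escapes decoded already, so that `%\n' appears as a percent sign followed by a real newline.

```python
def expand(template, p1=None, p2=None, last_char="\n"):
    """Expand one template string (hypothetical helper, not part of wp2x).

    p1 and p2 are the integer parameters the identifier supplies, or None.
    last_char is the most recently output character.
    """
    out = []
    i = 0
    # %<newline> is honoured only at the very start of the string.
    if template.startswith("%\n"):
        if last_char != "\n":
            out.append("\n")
        i = 2
    while i < len(template):
        ch = template[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        code = template[i + 1:i + 2]
        if code == "%":
            out.append("%")
        elif code == "1" and p1 is not None:
            out.append(str(p1))        # first parameter, decimal
        elif code == "2" and p2 is not None:
            out.append(str(p2))        # second parameter, decimal
        elif code == "c" and p1 is not None:
            out.append(chr(p1))        # first parameter, as a character
        elif code == "\n":
            raise ValueError("`%\\n' not at start of expansion")
        else:
            raise ValueError("invalid escape `%" + code + "'")
        i += 2
    return "".join(out)


print(repr(expand("margins %1 to %2", 10, 70)))
print(repr(expand("%\n.ta %1", 15, last_char="x")))
```

Passing a parameter the identifier does not supply (here, leaving p1 or p2 as None) is rejected in the same way as an unknown escape, which mirrors the rule stated above.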
Here follows a list of the accepted identifiers. In the
discussion, `%1' represents the first parameter, and `%2'
the second. Remember that the character version of %1 is
available as `%c'.
BEGIN Expanded at the beginning of the file.
END Expanded at the end of the file.
COMMENT Expanded when wp2x needs to insert a comment
into the output. The comment is
passed as %s.
PageNo Insert current page number
RomanPage Set page number to %1, and set roman-numeral mode
ArabicPage Set page number to %1, and set arabic-numeral mode
Tab What to do when you see a tab character.
BeginTabs Emitted when tab settings are about to
change. The BeginTabs code should delete
all existing tabs and prepare for new tab
settings to start. All tab values are
given in columns measured from the physical
left edge of the paper (not from the left
margin).
SetTab Set a normal (left-justified) tabstop at
column %1.
SetTabCenter Set a centered tabstop at column %1.
SetTabRight Set a right-justified tabstop at column %1.
SetTabDecimal Set a decimal tab at column %1.
EndTabs Finish the setting of tabstops.
For example, if the WordPerfect file contains a code that
says `Set new tabstops as follows: Regular tab at column
15, a centered tab at column 40, a right-justified tab at
column 59, and a regular tab at column 60', then the
following expansions are made in succession:
BeginTabs
SetTab(15)
SetTabCenter(40)
SetTabRight(59)
SetTab(60)
EndTabs
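To make the sequence concrete, here is a small Python sketch that strings those expansions together. The expansion strings in the config table are hypothetical (loosely troff-flavoured) and are not taken from any shipped configuration file; expand_tabs is likewise an invented name.

```python
# Hypothetical expansion strings; a real configuration file may differ.
config = {
    "BeginTabs": ".ta",
    "SetTab": " %1",
    "SetTabCenter": " %1C",
    "SetTabRight": " %1R",
    "EndTabs": "\n",
}

# Which identifier is expanded for each kind of tabstop.
KIND_TO_IDENTIFIER = {
    "left": "SetTab",
    "center": "SetTabCenter",
    "right": "SetTabRight",
}


def expand_tabs(stops):
    """Expand BeginTabs, one SetTab* per stop, then EndTabs."""
    parts = [config["BeginTabs"]]
    for kind, column in stops:
        template = config[KIND_TO_IDENTIFIER[kind]]
        parts.append(template.replace("%1", str(column)))
    parts.append(config["EndTabs"])
    return "".join(parts)


stops = [("left", 15), ("center", 40), ("right", 59), ("left", 60)]
print(repr(expand_tabs(stops)))
```

Because BeginTabs and EndTabs bracket the individual stops, a configuration can collect all the stops onto a single output line, as this sketch does.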
HSpace Hard (nonbreakable) space.
HPg Hard page break.
CondEOP
Force a new page if fewer than %1 half-lines remain
on current page.
HRt Hard return.
SRt Soft return.
- Breakable hyphen.
-- Breakable hyphen, appearing at the end of a line.
= Non-breakable hyphen.
\- Discretionary hyphen.
\-- Discretionary hyphen, appearing at the end of a
line.
Marg Set left margin at %1 characters and right margin
at %2 characters.
TopMargin
Set top margin to %1 lines.
PageLength
Set page length to %1 lines.
SS Single spacing.
DS Double spacing.
1.5S One-and-a-half spacing.
TS Triple spacing.
LS Other line spacing. %1 is twice the desired
spacing. (For example, a request for 2.5-spacing sets
%1=5.)
LPI Set %1 lines per inch (%1 is either 6 or 8)
Bold Begin boldface
bold End boldface
Und Begin underline
und End underline
DoubleUnd
Begin double underline
doubleund
End double underline
Red Begin redline
red End redline
Strike Begin strikeout
strike End strikeout
Rev Begin reverse video
rev End reverse video
Outline
Begin outline text
outline
End outline text
Fine Begin fine font size
fine End fine font size
Over Begin overstrike font
over End overstrike font
Sup Begin superscript
sup End superscript
Sub Begin subscript
sub End subscript
Large Begin large font size
large End large font size
Small Begin small font size
small End small font size
VeryLarge
Begin very large font size
verylarge
End very large font size
ExtraLarge
Begin extra large font size
extralarge
End extra large font size
Italics
Begin an italics font
italics
End an italics font
Shadow Begin shadow font
shadow End shadow font
SmallCaps
Begin small capitals font (fixed width)
smallcaps
End small capitals font (fixed width)
UpHalfLine
Advance printer up 1/2 line
DownHalfLine
Advance printer down 1/2 line
AdvanceToHalfLine
Advance to absolute vertical position. %1 is what
WordPerfect thinks the current vertical page
position is, in half-lines. %2 is the desired
position, also in half-lines.
Indent Expanded when an "Indent" code appears.
indent Expanded at the end of an indented paragraph.
DIndent
Expanded when a "left-and-right-indent" code
appears.
dindent
Expanded at the end of a double indent.
MarginRelease
Margin release. %1 is the number of characters to
move left.
Center Center current line
center End centering
CenterHere
Center line around current column
centerhere
End centering
Align Begin alignment
align End alignment
AlignChar
Set alignment character
FlushRight
Begin flush right
flushright
End flush right
Math Begin math mode
math End math mode
MathCalc
Begin math calc mode
MathCalcColumn
Math calc column
SubTotal
Do subtotal
IsSubTotal
Subtotal entry
Total Do total
IsTotal
Total entry
GrandTotal
Do grand total
Col Begin column mode
col End column mode
Fn Expanded at the beginning of a footnote.
fn Expanded at the end of a footnote.
En Expanded at the beginning of an endnote.
en Expanded at the end of an endnote.
SetFn# Set the number for the next footnote to %1.
FNote# Footnote number.
ENote# Endnote number.
Figure#
Figure number.
TableMarker
Insert table of contents here
Hyph Enable hyphenation.
hyph Disable hyphenation.
Just Enable justification.
just Disable justification.
Wid Enable widow/orphan protection.
wid Disable widow/orphan protection.
HZone The hyphenation zone. %1 and %2 are the two magical
values that WordPerfect uses to control
hyphenation.
DAlign Set the decimal alignment character to that whose
ASCII value is %1. (`%c' is useful here.)
Header Begin header text
header End header text
Footer Begin footer text
footer End footer text
Supp Suppress page number/header/footer information for
one page. %1 argument is a bit field which
describes what sort of suppression is desired.
Here's what the bits mean:
1 = all
2 = page number
4 = page numbers moved to bottom
8 = all headers
16 = header a
32 = header b
64 = footer a
128 = footer b
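Since %1 is a bit field, an expansion for Supp generally has to be interpreted bit by bit. The following Python sketch decodes a value using the table above; decode_supp and SUPPRESS_BITS are names invented for this example.

```python
# Bit values and meanings, copied from the table above.
SUPPRESS_BITS = {
    1: "all",
    2: "page number",
    4: "page numbers moved to bottom",
    8: "all headers",
    16: "header a",
    32: "header b",
    64: "footer a",
    128: "footer b",
}


def decode_supp(value):
    """Return the meanings of the bits that are set in a Supp %1 value."""
    return [name for bit, name in SUPPRESS_BITS.items() if value & bit]


print(decode_supp(2 + 16))
```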
CtrPg Center page vertically
SetFont
Change pitch or font. %1 is the desired pitch.
(Negative means proportionally-spaced.) %2 is the
font number.
SetBin Select paper bin to %1 = 0, 1, ...
PN0 No page numbering.
PN1 Page number in top left.
PN2 Page number in top center.
PN3 Page number in top right.
PN4 Page number on top outside corners (even/odd).
PN5 Page number in lower left.
PN6 Page number in bottom center.
PN7 Page number in lower right.
PN8 Page number on bottom outside corners (even/odd).
If no expansion is supplied for an identifier, then nothing
is emitted to stdout, but a warning message is sent to
stderr. This warning message will appear at most once per
identifier, and it can be suppressed completely by the -s
option.
The special identifier typeout causes its replacement text
to be displayed on the screen every time the configuration
file is read. This is useful for identification messages,
or reminders to the user.
A special identifier is any character enclosed in single
quotation marks, which represents that character itself. For example,
'α'="{\\alpha}"
causes the string "{\alpha}" to be emitted when an α is
encountered. This could also have been written as
'\xE0'="{\\alpha}"
if the character α has ASCII value 0xE0 (which is true
for the IBM PC encoding).
If no definition exists for a particular special character,
it is transmitted undisturbed. If a special character
is encountered from the upper half of the ASCII character
set, and if it has no definition, then a warning
message is also emitted (which can be suppressed with
the -s option).
Lines beginning with the # character are comments.
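As a rough illustration of the file syntax described in this section, the Python sketch below parses one configuration line. It is a simplification and not the parser in wp2x: it assumes each definition fits on one line (the real format allows a string to contain literal newlines), it handles only a handful of backslash escapes, and it assumes \x is followed by exactly two hex digits. The names parse_line and decode are made up for the example.

```python
import re
import string

# identifier="list of codes", all on one line (a simplifying assumption).
LINE = re.compile(r'^\s*(\S+?)\s*=\s*"(.*)"\s*$')

# Only a few C-style escapes are handled in this sketch.
SIMPLE = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}


def decode(raw):
    """Decode backslash escapes, including \\xFF-style hex values."""
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        nxt = raw[i + 1:i + 2]
        if nxt == "x":
            digits = raw[i + 2:i + 4]
            if len(digits) != 2 or any(d not in string.hexdigits for d in digits):
                raise ValueError("Expecting a hex digit")
            out.append(chr(int(digits, 16)))
            i += 4
        elif nxt in SIMPLE:
            out.append(SIMPLE[nxt])
            i += 2
        else:
            raise ValueError("unsupported escape in this sketch")
    return "".join(out)


def parse_line(line):
    """Return (identifier, expansion), or None for comments and blank lines."""
    if line.startswith("#") or not line.strip():
        return None
    match = LINE.match(line)
    if match is None:
        raise ValueError("Identifier not followed by = sign")
    identifier, raw = match.groups()
    return identifier, decode(raw)


print(parse_line(r'Bold="{\\bf "'))
print(parse_line("# this line is a comment"))
```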
NOTES
This is based on an original WP 4.2 to anything translator.
The file format has changed a lot between 4.2 and
5.0. This translator no longer reads WP 4.2 files,
although it could be extended to do so.
The 5.0+ format starts with a standard file header. There
is a four byte magic number at the head of the file,
followed by various product and version information. All
WordPerfect Corporation utilities use this standard
header. See the WPproducts array in wp2x.c.
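A check for the magic number might look like the Python sketch below. This page does not give the actual byte values; the sketch assumes the commonly documented WordPerfect Corporation prefix of 0xFF followed by the letters "WPC", so verify it against wp2x.c before relying on it. The function name is invented for the example.

```python
# Assumption: the four byte magic number is 0xFF 'W' 'P' 'C'.
# This value is not stated in this manual page.
WPC_MAGIC = b"\xffWPC"


def looks_like_wordperfect(data):
    """Return True if the data starts with the assumed magic number."""
    return data[:4] == WPC_MAGIC


print(looks_like_wordperfect(b"\xffWPC" + bytes(12)))
print(looks_like_wordperfect(b"Dear Sir,"))
```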
Once the contents of the file have been located, there are
three kinds of codes: simple one byte controls (WP 4.2 had
only these kinds), fixed length controls, and variable
length controls. A large number of code types are
reserved for future use. If wp2x detects something it
doesn't understand, it can extract the length and skip
that code. There are a number of defined codes that are
unimplemented. Please see the code, specifically tokens.c,
where much of the input processing is done.
FILES
The sample configuration files in /usr/local/lib/wp2x give
you some sort of idea what a `production quality' configuration
file might look like. They are not intended to be
used as-is, but rather are meant to be modified to suit
your particular needs.
SEE ALSO
tex(1), latex(1), nroff(1), troff(1), WordPerfect
Developer's Toolkit, getopt(3).
DIAGNOSTICS
Ignoring byte [XX]
Indicates that an unimplemented single byte code
was ignored.
Ignoring fixed [XX]
Indicates that an unimplemented fixed length code
was ignored.
Ignoring variable [XX] sub [XX] length
Indicates that an unimplemented variable length
code was ignored, and gives its length.
Warning: Expected XX but received XX at pos: YYYY
Something is wrong in the input file at byte YYYY.
Warning: No expansion for XX (C)
A WP code for which no expansion was defined in the
config file was encountered.
Internal error: Invalid escape C
An error occurred while processing an expansion
escape (%x substitution). It was probably not a
recognized escape; check the config file.
Fixed Length block [XX] incorrectly terminated by [YY] as
pos Z
Something is wrong with the input file: a fixed
length block was screwed up.
Reserved code [XX] seen
Something that WPC defined as reserved was seen.
Check with WPC for new meaning.
Not a recognized file type.
The file did not start with the right WPC
magic number. Maybe this is a 4.2 file, or not a
WordPerfect file at all?
Error: Cannot open X (reason)
The file X could not be opened, for the indicated
reason.
Error: Expecting a hex digit
Inside a string, you typed the characters `\x', but
the next character was not a valid hex digit.
Error: string pool overflow
The configuration file contained too many strings.
Increase the value of POOL_SIZE and recompile.
Error: Unknown identifier X
The word X was encountered in the configuration
file when wp2x expected a token identifier like
`HRt'. Most likely, you either misspelled it, or
you got your quotation marks out of sync.
Error: Identifier not followed by = sign
After an identifier must come an equals-sign.
Error: Quotation mark expected
After the equals-sign must come a quotation mark.
Error: X: `%\n' not at start of expansion
The expansion for the identifier X contained the
indicated sequence of characters somewhere other
than the beginning of the string. The `%\n'
interpolation code is meaningful only at the beginning
of a string.
Error: X: invalid escape `%x'
The expansion for the identifier X contained an
invalid escape. Either you used `%1', `%2' or `%c'
when the identifier X does not supply that parameter,
or you meant for a genuine percent sign to be
output, in which case you should put `%%' in the
expansion.
Error: Invalid character identifier
Character identifiers can only be one character
long (after backslash interpretation).
Warning: Expected XX but received YY.
The program expected the next byte from the WP file
to be XX, but the byte YY was encountered instead.
This means either that your WP file is damaged, or
that the program is seriously confused. (Or both.)
The program will pretend that the byte in the file
was indeed XX, which may lead to synchronization
errors later on.
Warning: No expansion for X
The WP file contained the token X, but the configuration
file did not contain any expansion text for
it. A null expansion was assumed.
Warning: No expansion for XX (c)
The WP file contained the character c (hex code
XX), but the configuration file did not contain any
expansion text for it. The character was emitted
unaltered. Beware that this may give your text
formatter indigestion if it does not handle eight-
bit characters.
Warning: X code not supported
The file being converted uses a code which wp2x
does not know how to convert. A comment is placed
in the output file in its place. If you ever
encounter a `WPCorp reserved' or a `WPCorp undefined'
code, the author would appreciate hearing
from you.
Internal error: Invalid escape, %x
While processing text, wp2x noticed that you used
an invalid escape. Nothing is emitted as the
escape text. (The error is called internal because it
is supposed to be caught at the time the configuration
file is read.)
BUGS
Naive configuration files will fail if your WP file
doesn't nest its tags properly. A typical case is
[Center][B]Hello[center]
[Center]There[b][center]
to produce a centered boldface `Hello'. If you use the
naive encoding of
Center="\\centerline{"
center="}\n"
Bold="{\\bf "
bold="}"
then this will expand to
\centerline{{\bf Hello}
\centerline{There}}
WordPerfect has no clean concept of grouping; it lets you
change fonts at any time and lets those changes propagate
outside the current environment (with the exception of
headers, footers, footnotes, and endnotes).
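The mis-nesting is easy to reproduce. The Python sketch below applies the naive encoding to the example input by plain textual substitution, which is a simplification of what wp2x does with real WordPerfect codes; the bracketed code names and the function naive_convert are illustrative only.

```python
import re

# The naive encoding from the example above.
config = {
    "Center": "\\centerline{",
    "center": "}\n",
    "Bold": "{\\bf ",
    "bold": "}",
}

# Map the bracketed codes in the example input to identifiers.
CODE_TO_IDENTIFIER = {"Center": "Center", "center": "center", "B": "Bold", "b": "bold"}


def naive_convert(text):
    """Replace each [code] with its expansion, with no nesting checks."""
    def substitute(match):
        return config[CODE_TO_IDENTIFIER[match.group(1)]]
    return re.sub(r"\[(\w+)\]", substitute, text)


result = naive_convert("[Center][B]Hello[center][Center]There[b][center]")
print(result)

# The first output line opens two groups but closes only one.
first_line = result.split("\n")[0]
print(first_line.count("{"), first_line.count("}"))
```

The braces balance over the whole output, but not within each \centerline argument, which is what trips up TeX.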
Now sure, you could write complicated configuration
strings to try to handle this `properly', but it'd probably
not be worth the trouble. After all, the purpose is
not to perform a perfect conversion, but rather to produce
a readable conversion, which can then be massaged by hand
to produce a perfect manuscript.
Another potential problem is combined attributes, like
boldface underline. Under a naive configuration,
[B]Boldface [U]Underlined boldface[b] Underlined[u] normal.
comes out as
{\bf Boldface {\it Underlined boldface} Underlined\/} normal.
which is wrong for two reasons. One is the nesting problem
discussed above. The other is that TeX font
attributes do not combine.
Similar problems exist for other document preparation
systems. So be careful.
AUTHOR
Original author: Raymond Chen <raymond@math.berkeley.edu>
Current maintainer: Michael Richardson <mcr@ccs.carleton.ca>