1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456
|
{0 Using the Format module}
{1 Principles}
Line breaking is based on three concepts:
{ul {- {b boxes} : a box is a logical pretty-printing unit, which
defines a behaviour of the pretty-printing engine to display the
material inside the box.}
{- {b break hints}: a break hint is a directive to the
pretty-printing engine that proposes to break the line here, if it is
necessary to properly print the rest of the material.
Otherwise, the pretty-printing engine never break lines (except
"in case of emergency" to avoid very bad output).
In short, a break hint tells the pretty printer that a line break here
may be appropriate.}
{- {b indentation rules}:
When a line break occurs, the pretty-printing engines fixes the
indentation (or amount of leading spaces) of the new line using
indentation rules, as follows:
{ul {- A box can state the extra indentation of every new line opened in
its scope. This extra indentation is named
{b box breaking indentation}.}
{- A break hint can also set the additional indentation of the new line
it may fire. This extra indentation is named {b hint breaking
indentation}.}
{- If break hint [bh] fires a new line within box
[b], then the indentation of the new line is simply the sum of:
the current indentation of box [b]
+
the additional box breaking indentation, as defined by box [b]
+
the additional hint breaking indentation, as defined by break
hint [bh].}}}}
{1 Boxes}
There are 4 types of boxes. (The most often used is the "hov" box type, so skip
the rest at first reading).
- {b horizontal box} ({i h} box, as obtained by the
{!Format.open_hbox} procedure): within this box, break hints do not
lead to line breaks.
- {b vertical box} ({i v} box, as obtained by the
{!Format.open_vbox} procedure): within this box, every break hint lead
to a new line.
- {b vertical/horizontal box} ({i hv} box, as obtained by
the {!Format.open_hvbox} procedure): if it is possible, the entire box
is written on a single line; otherwise, every break hint within the box
leads to a new line.
- {b vertical or horizontal box} ({i hov} box, as obtained
by the {!Format.open_box} or {!Format.open_hovbox} procedures): within this
box, break hints are used to cut the line when there is no more room on the
line. There are two kinds of "hov" boxes, you can find the details below.
In first approximation, let me consider these two kinds of "hov" boxes as
equivalent and obtained by calling the {!Format.open_box} procedure.
Let me give an example. Suppose we can write 10 chars before
the right margin (that indicates no more room). We represent any
char as a [-] sign; characters [\[] and [\]]
indicates the opening and closing of a box and [b] stands
for a break hint given to the pretty-printing engine.
The output "--b--b--" is displayed like this (the [b] symbol
stands for the value of the break that is explained below):
Within a "h" box:
{[
--b--b--
]}
Within a "v" box:
{[
--b
--b
--
]}
Within a "hv" box:
If there is enough room to print the box on the line:
{[
--b--b--
]}
But "---b---b---" that cannot fit on the line is written
{[
---b
---b
---
]}
Within a "hov" box:
If there is enough room to print the box on the line:
{[
--b--b--
]}
But if "---b---b---" cannot fit on the line, it is written as
{[
---b---b
---
]}
The first break hint does not lead to a new line, since there is enough room on
the line. The second one leads to a new line since there is no more room to
print the material following it. If the room left on the line were even
shorter, the first break hint may lead to a new line and "---b---b---" is
written as:
{[
---b
---b
---
]}
{1 Printing spaces}
Break hints are also used to output spaces (if the line is not split when the
break is encountered, otherwise the new line indicates properly the separation
between printing items). You output a break hint using [print_break sp indent],
and this sp integer is used to print "sp" spaces. Thus [print_break sp ...] may
be thought as: print [sp] spaces or output a new line.
For instance, if b is [break 1 0] in the output "--b--b--", we get
within a "h" box:
{[
-- -- --
]}
within a "v" box:
{[
--
--
--
]}
within a "hv" box:
{[
-- -- --
]}
or, according to the remaining room on the line:
{[
--
--
--
]}
and similarly for "hov" boxes.
Generally speaking, a printing routine using "format", should not directly
output white spaces: the routine should use break hints instead. (For instance
[print_space ()] that is a convenient abbreviation for [print_break 1 0] and
outputs a single space or break the line.)
{1 Indentation of new lines}
The user gets 2 ways to fix the indentation of new lines:
{b When defining the box}: when you open a box, you can fix the indentation
added to each new line opened within that box.
For instance: [open_hovbox 1] opens a "hov" box with new lines indented 1 more
than the initial indentation of the box. With output "---\[--b--b--b--", we
get:
{[
---\[--b--b
--b--
]}
with open_hovbox 2, we get
{[
---\[--b--b
--b--
]}
Note: the \[ sign in the display is not visible on the screen, it is just there
to materialise the aperture of the pretty-printing box. Last "screen" stands
for:
{[
-----b--b
--b--
]}
{b When defining the break that makes the new line}. As said above, you output
a break hint using [print_break sp indent]. The [indent] integer is used to fix
the additional indentation of the new line. Namely, it is added to the default
indentation offset of the box where the break occurs.
For instance, if \[ stands for the opening of a "hov" box with 1 as extra
indentation (as obtained by [open_hovbox 1]), and b is [print_break 1 2], then
from output "---\[--b--b--b--", we get:
{[
---\[-- --
--
--
]}
{1 Refinement on "hov" boxes}
The "hov" box type is refined into two categories.
- {b the vertical or horizontal {i packing} box} (as obtained by the
{!Format.open_hovbox} procedure): break hints are used to cut the line when
there is no more room on the line; no new line occurs if there is enough room
on the line.
- {b vertical or horizontal {i structural} box} (as obtained by the
{!Format.open_box} procedure): similar to the "hov" packing box, the break
hints are used to cut the line when there is no more room on the line; in
addition, break hints that can show the box structure lead to new lines even if
there is enough room on the current line.
The difference between a packing and a structural "hov" box is shown by a
routine that closes boxes and parentheses at the end of printing: with packing
boxes, the closure of boxes and parentheses do not lead to new lines if there
is enough room on the line, whereas with structural boxes each break hint will
lead to a new line. For instance, when printing
"\[(---\[(----\[(---b)\]b)\]b)\]", where "b" is a break hint without extra
indentation ([print_cut ()]). If "\[" means opening of a packing "hov" box
({!Format.open_hovbox} 1), "\[(---\[(----\[(---b)\]b)\]b)\]" is printed as
follows:
{[
(---
(----
(---)))
]}
If we replace the packing boxes by structural boxes ({!Format.open_box} 1), each
break hint that precedes a closing parenthesis can show the boxes structure, if
it leads to a new line; hence "\[(---\[(----\[(---b)\]b)\]b)\]" is printed like
this:
{[
(---
(----
(---
)
)
)
]}
Each break hint increases indentation by one, both for opening and closing
parenthesis, because the parenthesis are just characters and don't have special
meaning for the formatter.
Notice that the closing parenthesis is not vertically aligned with the opening
parenthesis.
We can line up the opening and closing parenthesis vertically in two ways:
- by using a break hint "b" with negative indentation that matches the opening
indentation of the box `@;<0 -1>`, canceling it.
- by using an additional box, and making all boxes and break hints use
no indentation: "\[(\[---\[(\[----\[(\[---\]b)\]\]b)\]\]b)\]"
In both cases the output would be this:
{[
(---
(----
(---
)
)
)
]}
{1 Practical advice}
When writing a pretty-printing routine, follow these simple rules:
+ Boxes must be opened and closed consistently ([open_*] and
{!Format.close_box} must be nested like parentheses).
+ Never hesitate to open a box.
+ Output many break hints, otherwise the pretty-printer is in a bad situation
where it tries to do its best, which is always "worse than your bad".
+ Break hints with negative indentation should only be used when you are sure
that you don't exceed the indentation of the opening box (e.g. both part of
the same function and an exact match). Exceeding the indentation of the opening
box is not compositional.
+ Do not try to force spacing using explicit spaces in the character strings.
For each space you want in the output emit a break hint ([print_space ()]),
unless you explicitly don't want the line to be broken here. For instance,
imagine you want to pretty print an OCaml definition, more precisely a [let rec
ident = expression] value definition. You will probably treat the first three
spaces as "unbreakable spaces" and write them directly in the string constants
for keywords, and print ["let rec"] before the identifier, and similarly write
[=] to get an unbreakable space after the identifier; in contrast, the space
after the [=] sign is certainly a break hint, since breaking the line after [=]
is a usual (and elegant) way to indent the expression part of a definition. In
short, it is often necessary to print unbreakable spaces; however, most of the
time a space should be considered a break hint.
+ Do not try to force new lines, let the pretty-printer do it for you: that's
its only job. In particular, do not use {!Format.force_newline}: this
procedure effectively leads to a newline, but it also as the unfortunate side
effect to partially reinitialise the pretty-printing engine, so that the rest
of the printing material is noticeably messed up.
+ Never put newline characters directly in the strings to be printed: pretty
printing engine will consider this newline character as any other character
written on the current line and this will completely mess up the output.
Instead of new line characters use line break hints: if those break hints must
always result in new lines, it just means that the surrounding box must be a
vertical box!
+ End your main program by a [print_newline ()] call, that flushes the
pretty-printer tables (hence the output). (Note that the top-level loop of the
interactive system does it as well, just before a new input.)
{1 Printing to stdout: using printf}
The format module provides a general printing facility "a la" printf. In
addition to the usual conversion facility provided by printf, you can write
pretty-printing indications directly inside the format string (opening and
closing boxes, indicating breaking hints, etc).
Pretty-printing annotations are introduced by the [@] symbol, directly into the
string format. Almost any function of the [Format] module can be called from
within a [printf] format string. For instance
- "[@\[]" open a box (open_box 0). You may precise the type as an extra
argument. For instance [@\[<hov n>] is equivalent to [open_hovbox n].
- "[@\]]" close a box ([close_box ()]).
- "[@ ]" output a breakable space ([print_space ()]).
- "[@,]" output a break hint ([print_cut ()]).
- "[@;<n m>]" emit a "full" break hint ([print_break n m]).
- "[@.]" end the pretty-printing, closing all the boxes still opened
([print_newline ()]).
For instance
{v
printf "@\[<1>%s@ =@ %d@ %s@\]@." "Prix TTC" 100 "Euros";;
Prix TTC = 100 Euros
- : unit = ()
v}
{1 A concrete example}
Let me give a full example: the shortest non trivial example you could imagine,
that is the lambda calculus :)
Thus the problem is to pretty-print the values of a concrete data type that
models a language of expressions that defines functions and their applications
to arguments.
First, I give the abstract syntax of lambda-terms:
{v
type lambda =
| Lambda of string * lambda
| Var of string
| Apply of lambda * lambda
;;
v}
I use the format library to print the lambda-terms:
{v
open Format;;
let ident = print_string;;
let kwd = print_string;;
val ident : string -> unit = <fun>
val kwd : string -> unit = <fun>
let rec print_exp0 = function
| Var s -> ident s
| lam -> open_hovbox 1; kwd "("; print_lambda lam; kwd ")"; close_box ()
and print_app = function
| e -> open_hovbox 2; print_other_applications e; close_box ()
and print_other_applications f =
match f with
| Apply (f, arg) -> print_app f; print_space (); print_exp0 arg
| f -> print_exp0 f
and print_lambda = function
| Lambda (s, lam) ->
open_hovbox 1;
kwd "\\"; ident s; kwd "."; print_space(); print_lambda lam;
close_box()
| e -> print_app e;;
val print_app : lambda -> unit = <fun>
val print_other_applications : lambda -> unit = <fun>
val print_lambda : lambda -> unit = <fun>
v}
{2 Most general pretty-printing: using fprintf}
We use the [fprintf] function to write the most versatile version of the
pretty-printing functions for lambda-terms. Now, the functions get an extra
argument, namely a pretty-printing formatter (the ppf argument) where printing
will occur. This way the printing routines are more general, since they can
print on any formatter defined in the program (either printing to a file, or to
[stdout], to [stderr], or even to a string). Furthermore, the pretty-printing
functions are now compositional, since they may be used in conjunction with the
special [%a] conversion, that prints a [fprintf] argument with a user's supplied
function (these user's supplied functions also have a formatter as first
argument).
Using [fprintf], the lambda-terms printing routines can be written as follows:
{v
open Format;;
let ident ppf s = fprintf ppf "%s" s;;
let kwd ppf s = fprintf ppf "%s" s;;
val ident : Format.formatter -> string -> unit
val kwd : Format.formatter -> string -> unit
let rec pr_exp0 ppf = function
| Var s -> fprintf ppf "%a" ident s
| lam -> fprintf ppf "@\[<1>(%a)@\]" pr_lambda lam
and pr_app ppf = function
| e -> fprintf ppf "@\[<2>%a@\]" pr_other_applications e
and pr_other_applications ppf f =
match f with
| Apply (f, arg) -> fprintf ppf "%a@ %a" pr_app f pr_exp0 arg
| f -> pr_exp0 ppf f
and pr_lambda ppf = function
| Lambda (s, lam) ->
fprintf ppf "@\[<1>%a%a%a@ %a@\]" kwd "\\" ident s kwd "." pr_lambda lam
| e -> pr_app ppf e
;;
val pr_app : Format.formatter -> lambda -> unit
val pr_other_applications : Format.formatter -> lambda -> unit
val pr_lambda : Format.formatter -> lambda -> unit
v}
Given those general printing routines, procedures to print to [stdout] or
[stderr] is just a matter of partial application:
{v
let print_lambda = pr_lambda std_formatter;;
let eprint_lambda = pr_lambda err_formatter;;
val print_lambda : lambda -> unit
val eprint_lambda : lambda -> unit
v}
|