dwww Home | Manual pages | Find package

Format_tutorial(3o)              OCaml library             Format_tutorial(3o)

NAME
       Format_tutorial -  Principles

Module
       Module   Format_tutorial

Documentation
       Module Format_tutorial
        : sig end

   Principles
       Line breaking is based on three concepts:

       -boxes  :  a box is a logical pretty-printing unit, which defines a be-
       haviour of the pretty-printing engine to display  the  material  inside
       the box.

       -break hints: a break hint is a directive to the pretty-printing engine
       that proposes to break the line here, if it is  necessary  to  properly
       print  the rest of the material.  Otherwise, the pretty-printing engine
       never break lines (except "in case of emergency" to avoid very bad out-
       put).   In  short,  a  break  hint tells the pretty printer that a line
       break here may be appropriate.

       -indentation rules: When a line break occurs, the  pretty-printing  en-
       gines  fixes  the  indentation (or amount of leading spaces) of the new
       line using indentation rules, as follows:

       -A box can state the extra indentation of every new line opened in  its
       scope. This extra indentation is named box breaking indentation.

       -A  break  hint can also set the additional indentation of the new line
       it may fire. This extra indentation is named hint breaking indentation.

       - If break hint bh fires a new line within box b , then the indentation
       of  the new line is simply the sum of: the current indentation of box b
       + the additional box breaking indentation, as defined by box  b  +  the
       additional hint breaking indentation, as defined by break hint bh .

   Boxes
       There are 4 types of boxes. (The most often used is the "hov" box type,
       so skip the rest at first reading).

       -horizontal box (h box, as obtained by the open_hbox procedure): within
       this box, break hints do not lead to line breaks.

       -vertical  box  (v box, as obtained by the open_vbox procedure): within
       this box, every break hint lead to a new line.

       -vertical/horizontal box (hv box, as obtained by the open_hvbox  proce-
       dure):  if  it is possible, the entire box is written on a single line;
       otherwise, every break hint within the box leads to a new line.

       -vertical or horizontal box (hov box, as obtained by  the  open_box  or
       open_hovbox  procedures):  within this box, break hints are used to cut
       the line when there is no more room on the line.  There are  two  kinds
       of "hov" boxes, you can find the details below. In first approximation,
       let me consider these two kinds of "hov" boxes as  equivalent  and  ob-
       tained by calling the open_box procedure.

       Let  me give an example. Suppose we can write 10 chars before the right
       margin (that indicates no more room). We represent  any  char  as  a  -
       sign; characters [ and ] indicates the opening and closing of a box and
       b stands for a break hint given to the pretty-printing engine.

       The output "--b--b--" is displayed like this (the b symbol  stands  for
       the value of the break that is explained below):

       Within a "h" box:

       --b--b--

       Within a "v" box:

       --b
       --b
       --

       Within a "hv" box:

       If there is enough room to print the box on the line:

       --b--b--

       But "---b---b---" that cannot fit on the line is written

       ---b
       ---b
       ---

       Within a "hov" box:

       If there is enough room to print the box on the line:

       --b--b--

       But if "---b---b---" cannot fit on the line, it is written as

       ---b---b
       ---

       The first break hint does not lead to a new line, since there is enough
       room on the line. The second one leads to a new line since there is  no
       more  room  to print the material following it. If the room left on the
       line were even shorter, the first break hint may lead to a new line and
       "---b---b---" is written as:

       ---b
       ---b
       ---

   Printing spaces
       Break  hints  are  also used to output spaces (if the line is not split
       when the break is encountered, otherwise the new line  indicates  prop-
       erly  the  separation  between printing items). You output a break hint
       using print_break sp indent , and this sp integer is used to print "sp"
       spaces.  Thus print_break sp ...  may be thought as: print sp spaces or
       output a new line.

       For instance, if b is break 1 0 in the output "--b--b--", we get

       within a "h" box:

       -- -- --

       within a "v" box:

       --
       --
       --

       within a "hv" box:
       -- -- --

       or, according to the remaining room on the line:

       --
       --
       --

       and similarly for "hov" boxes.

       Generally speaking, a printing routine using "format", should  not  di-
       rectly output white spaces: the routine should use break hints instead.
       (For instance print_space () that  is  a  convenient  abbreviation  for
       print_break 1 0 and outputs a single space or break the line.)

   Indentation of new lines
       The user gets 2 ways to fix the indentation of new lines:

       When defining the box: when you open a box, you can fix the indentation
       added to each new line opened within that box.

       For instance: open_hovbox 1 opens a "hov" box with new lines indented 1
       more   than   the   initial   indentation   of  the  box.  With  output
       "---[--b--b--b--", we get:

       ---[--b--b
            --b--

       with open_hovbox 2, we get

       ---[--b--b
             --b--

       Note: the [ sign in the display is not visible on  the  screen,  it  is
       just there to materialise the aperture of the pretty-printing box. Last
       "screen" stands for:

       -----b--b
            --b--

       When defining the break that makes the new line.  As  said  above,  you
       output a break hint using print_break sp indent . The indent integer is
       used to fix the additional indentation of the new line. Namely,  it  is
       added  to the default indentation offset of the box where the break oc-
       curs.

       For instance, if [ stands for the opening of a "hov" box with 1 as  ex-
       tra indentation (as obtained by open_hovbox 1 ), and b is print_break 1
       2 , then from output "---[--b--b--b--", we get:

          ---[-- --
                --
                --

   Refinement on hov boxes
       The "hov" box type is refined into two categories.

       -the vertical or horizontal packing box (as obtained by the open_hovbox
       procedure):  break hints are used to cut the line when there is no more
       room on the line; no new line occurs if there is  enough  room  on  the
       line.

       -vertical  or  horizontal  structural  box (as obtained by the open_box
       procedure): similar to the "hov" packing box, the break hints are  used
       to  cut  the  line when there is no more room on the line; in addition,
       break hints that can show the box structure lead to new lines  even  if
       there is enough room on the current line.

       The difference between a packing and a structural "hov" box is shown by
       a routine that closes boxes and parentheses at  the  end  of  printing:
       with packing boxes, the closure of boxes and parentheses do not lead to
       new lines if there is enough room on the line, whereas with  structural
       boxes  each  break  hint  will  lead  to a new line. For instance, when
       printing "[(---[(----[(---b)]b)]b)]", where "b" is a break hint without
       extra  indentation  ( print_cut () ). If "[" means opening of a packing
       "hov" box ( open_hovbox ), "[(---[(----[(---b)]b)]b)]"  is  printed  as
       follows:

       (---
        (----
         (---)))

       If  we replace the packing boxes by structural boxes ( open_box ), each
       break hint that precedes a  closing  parenthesis  can  show  the  boxes
       structure, if it leads to a new line; hence "[(---[(----[(---b)]b)]b)]"
       is printed like this:

       (---
        (----
         (---
         )
        )
       )

   Practical advice
       When writing a pretty-printing routine, follow these simple rules:

       -Boxes must be opened and closed consistently (  open_*  and  close_box
       must be nested like parentheses).

       -Never hesitate to open a box.

       -Output many break hints, otherwise the pretty-printer is in a bad sit-
       uation where it tries to do its best, which is always "worse than  your
       bad".

       -Do  not  try  to  force spacing using explicit spaces in the character
       strings.  For each space you want in the output emit  a  break  hint  (
       print_space  () ), unless you explicitly don't want the line to be bro-
       ken here. For instance, imagine you want to pretty print an OCaml defi-
       nition, more precisely a let rec
       ident  = expression value definition. You will probably treat the first
       three spaces as "unbreakable spaces" and write  them  directly  in  the
       string  constants  for keywords, and print "let rec" before the identi-
       fier, and similarly write = to get an unbreakable space after the iden-
       tifier;  in  contrast,  the space after the = sign is certainly a break
       hint, since breaking the line after = is a usual (and elegant)  way  to
       indent the expression part of a definition.  In short, it is often nec-
       essary to print unbreakable spaces; however, most of the time  a  space
       should be considered a break hint.

       -Do  not  try to force new lines, let the pretty-printer do it for you:
       that's its only job.  In particular, do not use  force_newline  :  this
       procedure  effectively  leads to a newline, but it also as the unfortu-
       nate side effect to partially reinitialise the pretty-printing  engine,
       so that the rest of the printing material is noticeably messed up.

       -Never  put  newline  characters directly in the strings to be printed:
       pretty printing engine will consider  this  newline  character  as  any
       other  character  written  on the current line and this will completely
       mess up the output.  Instead of new  line  characters  use  line  break
       hints:  if  those  break hints must always result in new lines, it just
       means that the surrounding box must be a vertical box!

       -End your main program by a print_newline ()  call,  that  flushes  the
       pretty-printer  tables  (hence  the  output).  (Note that the top-level
       loop of the interactive system does it as well, just before a  new  in-
       put.)

   Printing to stdout: using printf
       The  format  module provides a general printing facility "a la" printf.
       In addition to the usual conversion facility provided  by  printf,  you
       can write pretty-printing indications directly inside the format string
       (opening and closing boxes, indicating breaking hints, etc).

       Pretty-printing annotations are introduced by the  @  symbol,  directly
       into the string format. Almost any function of the Format module can be
       called from within a printf format string. For instance

       -" @[ " open a box (open_box 0).  You may precise the type as an  extra
       argument. For instance @[<hov n> is equivalent to open_hovbox n .

       -" @] " close a box ( close_box () ).

       -" @ " output a breakable space ( print_space () ).

       -" @, " output a break hint ( print_cut () ).

       -" @;<n m> " emit a "full" break hint ( print_break n m ).

       -"  @.  " end the pretty-printing, closing all the boxes still opened (
       print_newline () ).

       For instance

       printf "@[<1>%s@ =@ %d@ %s@]@." "Prix TTC" 100 "Euros";; Prix TTC = 100
       Euros - : unit = ()

   A concrete example
       Let  me give a full example: the shortest non trivial example you could
       imagine, that is the lambda calculus :)

       Thus the problem is to pretty-print the values of a concrete data  type
       that  models a language of expressions that defines functions and their
       applications to arguments.

       First, I give the abstract syntax of lambda-terms:

       type lambda = | Lambda of string * lambda | Var of string  |  Apply  of
       lambda * lambda ;;

       I use the format library to print the lambda-terms:

       open  Format;;  let ident = print_string;; let kwd = print_string;; val
       ident : string -> unit = <fun> val kwd : string -> unit = <fun> let rec
       print_exp0  =  function | Var s ->  ident s | lam -> open_hovbox 1; kwd
       "("; print_lambda lam; kwd ")"; close_box () and print_app = function |
       e  ->  open_hovbox  2;  print_other_applications  e;  close_box  () and
       print_other_applications f = match f with | Apply (f, arg) -> print_app
       f;  print_space (); print_exp0 arg | f -> print_exp0 f and print_lambda
       = function | Lambda (s, lam) -> open_hovbox 1; kwd "\\"; ident  s;  kwd
       ".";  print_space(); print_lambda lam; close_box() | e -> print_app e;;
       val print_app : lambda -> unit = <fun> val  print_other_applications  :
       lambda -> unit = <fun> val print_lambda : lambda -> unit = <fun>

   Most general pretty-printing: using fprintf
       We  use the fprintf function to write the most versatile version of the
       pretty-printing functions for lambda-terms.  Now, the functions get  an
       extra  argument,  namely a pretty-printing formatter (the ppf argument)
       where printing will occur. This way the printing routines are more gen-
       eral, since they can print on any formatter defined in the program (ei-
       ther printing to a file, or to stdout ,  to  stderr  ,  or  even  to  a
       string).   Furthermore,  the pretty-printing functions are now composi-
       tional, since they may be used in conjunction with the special %a  con-
       version, that prints a fprintf argument with a user's supplied function
       (these user's supplied functions also have a formatter as  first  argu-
       ment).

       Using  fprintf  ,  the lambda-terms printing routines can be written as
       follows:

       open Format;; let ident ppf s = fprintf ppf "%s" s;; let kwd  ppf  s  =
       fprintf ppf "%s" s;; val ident : Format.formatter -> string -> unit val
       kwd : Format.formatter -> string -> unit let rec pr_exp0 ppf = function
       |  Var s -> fprintf ppf "%a" ident s | lam -> fprintf ppf "@[<1>(%a)@]"
       pr_lambda lam and pr_app ppf = function | e -> fprintf ppf  "@[<2>%a@]"
       pr_other_applications  e and pr_other_applications ppf f = match f with
       | Apply (f, arg) -> fprintf ppf "%a@ %a" pr_app f pr_exp0 arg  |  f  ->
       pr_exp0 ppf f and pr_lambda ppf = function | Lambda (s, lam) -> fprintf
       ppf "@[<1>%a%a%a@ %a@]" kwd "\\" ident s kwd "." pr_lambda lam |  e  ->
       pr_app  ppf  e  ;;  val pr_app : Format.formatter -> lambda -> unit val
       pr_other_applications  :  Format.formatter  ->  lambda  ->   unit   val
       pr_lambda : Format.formatter -> lambda -> unit

       Given those general printing routines, procedures to print to stdout or
       stderr is just a matter of partial application:

       let  print_lambda  =  pr_lambda  std_formatter;;  let  eprint_lambda  =
       pr_lambda  err_formatter;;  val  print_lambda  :  lambda  ->  unit  val
       eprint_lambda : lambda -> unit

OCamldoc                          2023-02-12               Format_tutorial(3o)

Generated by dwww version 1.15 on Sun Jun 23 03:39:45 CEST 2024.