
Stdlib.Scanf(3o)                 OCaml library                Stdlib.Scanf(3o)

NAME
       Stdlib.Scanf - Formatted input functions.

Module
       Module   Stdlib.Scanf

Documentation
       Module Scanf
        : (module Stdlib__Scanf)

   Introduction
   Functional input with format strings
       The module Scanf provides formatted input functions or scanners.

       The formatted input functions can read from any kind of input,
       including strings, files, or anything that can return characters. The
       most general source of characters is named a formatted input channel
       (or scanning buffer) and has type Scanf.Scanning.in_channel . The most
       general formatted input function reads from any scanning buffer and is
       named bscanf .

       Generally speaking, the formatted input functions have 3 arguments:

       -the first argument is a source of characters for the input,

       -the second argument is a format string that specifies  the  values  to
       read,

       -the  third argument is a receiver function that is applied to the val-
       ues read.

       Hence, a typical call to the formatted input function  Scanf.bscanf  is
       bscanf ic fmt f , where:

       - ic is a source of characters (typically a     formatted input channel
       with type Scanf.Scanning.in_channel ),

       - fmt is a format string (the same format  strings  as  those  used  to
       print material with module Printf or Format ),

       - f is a function that has as many arguments as the number of values to
       read in the input according to fmt .

   A simple example
       As suggested above, the expression bscanf ic "%d" f reads a decimal in-
       teger n from the source of characters ic and returns f n .

       For instance,

       -if  we use stdin as the source of characters ( Scanf.Scanning.stdin is
       the predefined formatted input channel that reads from standard input),

       -if we define the receiver f as let f x = x + 1 ,

       then bscanf Scanning.stdin "%d" f reads an integer n from the standard
       input and returns f n (that is n + 1 ). Thus, if we evaluate bscanf
       Scanning.stdin "%d" f , and then enter 41 at the keyboard, the result
       we get is 42 .
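
       A minimal sketch of the example above. It assumes Scanf.sscanf on the
       string "41" in place of standard input, so that it runs without any
       keyboard interaction; the name result is only illustrative.

```ocaml
(* The receiver from the text: adds one to the integer read. *)
let f x = x + 1

(* sscanf reads from a string instead of Scanning.stdin; the format
   string and the receiver are used exactly as with bscanf. *)
let result = Scanf.sscanf "41" "%d" f

let () = Printf.printf "%d\n" result
```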

   Formatted input as a functional feature
       The OCaml scanning facility is reminiscent of the corresponding C
       feature. However, it is also largely different, simpler, and yet more
       powerful:

       -the formatted input functions are higher-order functionals, and
       parameter passing is just regular function application, not the
       variable-assignment mechanism that is typical of formatted input in
       imperative languages;

       -the OCaml format strings also feature useful additions to easily
       define complex tokens;

       -as expected within a functional programming language, the formatted
       input functions also support polymorphism, in particular arbitrary
       interaction with polymorphic user-defined scanners.

       Furthermore, the OCaml formatted input facility is fully type-checked
       at compile time.

   Formatted input channel
       module Scanning : sig end

   Type of formatted input functions
       type ('a, 'b, 'c, 'd) scanner = ('a, Scanning.in_channel, 'b, 'c, 'a ->
       'd, 'd) format6 -> 'c

       The type of formatted input scanners: ('a, 'b, 'c, 'd) scanner is the
       type of a formatted input function that reads from some formatted input
       channel according to some format string; more precisely, if scan is
       some formatted input function, then scan ic fmt f applies f to all the
       arguments specified by format string fmt , when scan has read those
       arguments from the Scanf.Scanning.in_channel formatted input channel
       ic .

       For  instance, the Scanf.scanf function below has type ('a, 'b, 'c, 'd)
       scanner , since it is  a  formatted  input  function  that  reads  from
       Scanf.Scanning.stdin : scanf fmt f applies f to the arguments specified
       by fmt , reading those arguments from stdin as expected.

       If the format fmt has some %r indications, the corresponding  formatted
       input  functions  must be provided before receiver function f . For in-
       stance, if read_elem is an input function for values of type t  ,  then
       bscanf ic "%r;" read_elem f reads a value v of type t followed by a ';'
       character, and returns f v .

       Since 3.10.0
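
       As an illustration of the %r convention described above, here is a
       small sketch. The reader read_point and its "(x,y)" syntax are
       hypothetical, chosen only for this example.

```ocaml
(* Hypothetical element reader: reads a point written as "(x,y)".
   Its type is Scanf.Scanning.in_channel -> int * int. *)
let read_point ib = Scanf.bscanf ib "(%d,%d)" (fun x y -> (x, y))

(* %r consumes the next reader argument (read_point here), which is
   passed before the receiver function. *)
let point = Scanf.sscanf "(1,2);" "%r;" read_point (fun v -> v)
```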

       exception Scan_failure of string

       When the input cannot be read according to the format string
       specification, formatted input functions typically raise exception
       Scan_failure .

   The general formatted input function
       val bscanf : Scanning.in_channel -> ('a, 'b, 'c, 'd) scanner

       bscanf ic fmt r1  ...  rN  f  reads  characters  from  the  Scanf.Scan-
       ning.in_channel  formatted input channel ic and converts them to values
       according to format string fmt .  As a final step, receiver function  f
       is applied to the values read and gives the result of the bscanf call.

       For instance, if f is the function fun s i -> i + 1 , then Scanf.sscanf
       "x = 1" "%s = %i" f returns 2 .

       Arguments r1 to rN are user-defined input functions that read the argu-
       ment  corresponding  to  the  %r  conversions  specified  in the format
       string.
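
       The following sketch shows bscanf reading from a formatted input
       channel built over a string with Scanf.Scanning.from_string . The
       input "3 4" and the names first , second and sum are assumptions made
       for the example.

```ocaml
(* A formatted input channel over a string; successive bscanf calls
   continue from where the previous call stopped. *)
let ib = Scanf.Scanning.from_string "3 4"

(* The leading space in the format skips any whitespace that precedes
   the integer. *)
let first = Scanf.bscanf ib " %d" (fun n -> n)
let second = Scanf.bscanf ib " %d" (fun n -> n)

let sum = first + second
```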

   Format string description
       The format string is a character string which contains three  types  of
       objects:

       -plain characters, which are simply matched with the characters of the
       input (with a special case for space and line feed, see "The space
       character in format strings" below),

       -conversion specifications, each of which causes reading and conversion
       of one argument for the function f (see "Conversion specifications in
       format strings" below),

       -scanning indications to specify boundaries of tokens (see "Scanning
       indications in format strings" below).

   The space character in format strings
       As mentioned above, a plain character in  the  format  string  is  just
       matched  with  the next character of the input; however, two characters
       are special exceptions to this rule: the space character ( ' ' or ASCII
       code 32) and the line feed character ( '\n' or ASCII code 10).  A space
       does not match a single space character, but any amount of 'whitespace'
       in  the input. More precisely, a space inside the format string matches
       any number of tab, space, line feed  and  carriage  return  characters.
       Similarly,  a line feed character in the format string matches either a
       single line feed or a carriage return followed by a line feed.

       Since it matches any amount of whitespace, a space in the format string
       also matches no whitespace at all; hence, the call
       bscanf ib "Price = %d $" (fun p -> p) succeeds and returns 1 when
       reading an input with various whitespace in it, such as "Price = 1 $" ,
       "Price  =   1  $" , or even "Price=1$" .
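
       The whitespace rule can be checked with a short sketch. It uses
       Scanf.sscanf rather than bscanf so that each input is a plain string;
       the helper name price is an assumption.

```ocaml
(* Each space in the format matches zero or more whitespace
   characters in the input. *)
let price s = Scanf.sscanf s "Price = %d $" (fun p -> p)

(* The same format accepts all three spellings. *)
let results =
  List.map price [ "Price = 1 $"; "Price  =   1  $"; "Price=1$" ]
```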

   Conversion specifications in format strings
       Conversion specifications consist of the % character, followed by an
       optional flag, an optional field width, and one or two conversion
       characters.

       The conversion characters and their meanings are:

       - d : reads an optionally signed decimal integer ( [0-9]+ ).

       - i : reads an optionally signed integer (usual input conventions for
       decimal ( [0-9]+ ), hexadecimal ( 0x[0-9a-f]+ and 0X[0-9A-F]+ ), octal
       ( 0o[0-7]+ ), and binary ( 0b[0-1]+ ) notations are understood).

       - u : reads an unsigned decimal integer.

       - x or X : reads an unsigned hexadecimal integer ( [0-9a-fA-F]+ ).

       - o : reads an unsigned octal integer ( [0-7]+ ).

       - s : reads a string argument that spreads as much as possible, until
       one of the following bounding conditions holds:

       -a whitespace has been found (see "The space character in format
       strings" above),

       -a scanning indication (see "Scanning indications in format strings"
       below) has been encountered,

       -the end-of-input has been reached.

       Hence, this conversion always succeeds: it returns an empty string if
       a bounding condition holds when the scan begins.

       - S : reads a delimited string argument (delimiters and special escaped
       characters follow the lexical conventions of OCaml).

       - c : reads a single character. To test the current input character
       without reading it, specify a null field width, i.e. use specification
       %0c . Raises Invalid_argument if the field width specification is
       greater than 1.

       - C : reads a single delimited character (delimiters  and  special  es-
       caped characters follow the lexical conventions of OCaml).

       - f , e , E , g , G : reads an optionally signed floating-point number
       in decimal notation, in the style dddd.ddd e/E+-dd .

       - h , H : reads an optionally signed floating-point number in hexadeci-
       mal notation.

       -  F  :  reads a floating point number according to the lexical conven-
       tions of OCaml (hence the decimal point is mandatory  if  the  exponent
       part is not mentioned).

       - B : reads a boolean argument ( true or false ).

       -  b : reads a boolean argument (for backward compatibility; do not use
       in new programs).

       - ld , li , lu , lx , lX , lo : reads an int32 argument in the format
       specified by the second letter for regular integers.

       - nd , ni , nu , nx , nX , no : reads a nativeint argument in the
       format specified by the second letter for regular integers.

       - Ld , Li , Lu , Lx , LX , Lo : reads an int64 argument in the format
       specified by the second letter for regular integers.

       - [ range ] : reads characters that match one of the characters
       mentioned in the range of characters range (or not mentioned in it, if
       the range starts with ^ ). Reads a string that can be empty, if the next
       input character does not match the range. The set of characters from c1
       to  c2  (inclusively)  is  denoted  by c1-c2 .  Hence, %[0-9] returns a
       string representing a decimal number or an empty string if  no  decimal
       digit  is  found;  similarly, %[0-9a-f] returns a string of hexadecimal
       digits.  If a closing bracket appears in a range, it must occur as  the
       first  character  of  the  range  (or just after the ^ in case of range
       negation); hence []] matches a ] character and [^]] matches any charac-
       ter that is not ] .  Use %% and %@ to include a % or a @ in a range.

       -  r  : user-defined reader. Takes the next ri formatted input function
       and applies it to the scanning buffer ib to read the next argument. The
       input  function  ri  must therefore have type Scanning.in_channel -> 'a
       and the argument read has type 'a .

       - { fmt %} : reads a format string argument.  The  format  string  read
       must  have  the  same type as the format string specification fmt . For
       instance, "%{ %i %}" reads any format string that can read a  value  of
       type  int  ;  hence,  if  s is the string "fmt:\"number is %u\"" , then
       Scanf.sscanf s "fmt: %{%i%}" succeeds and  returns  the  format  string
       "number is %u" .

       -  (  fmt %) : scanning sub-format substitution.  Reads a format string
       rf in the input, then goes on scanning with rf instead of scanning with
       fmt  .   The  format  string  rf  must have the same type as the format
       string specification fmt that it replaces.  For instance,  "%(  %i  %)"
       reads  any  format string that can read a value of type int .  The con-
       version returns the format string read rf , and then a value read using
       rf  .  Hence, if s is the string "\"%4d\"1234.00" , then Scanf.sscanf s
       "%(%i%)" (fun fmt i -> fmt, i) evaluates to ("%4d", 1234) .   This  be-
       haviour  is  not mere format substitution, since the conversion returns
       the format string read as additional argument. If you need pure  format
       substitution,  use  special  flag _ to discard the extraneous argument:
       conversion %_( fmt %) reads a format string rf  and  then  behaves  the
       same  as format string rf .  Hence, if s is the string "\"%4d\"1234.00"
       , then Scanf.sscanf s "%_(%i%)" is simply  equivalent  to  Scanf.sscanf
       "1234.00" "%4d" .

       - l : returns the number of lines read so far.

       - n : returns the number of characters read so far.

       - N or L : returns the number of tokens read so far.

       - !  : matches the end of input condition.

       - % : matches one % character in the input.

       - @ : matches one @ character in the input.

       - , : does nothing.
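
       A few of the conversions listed above, exercised with Scanf.sscanf .
       This is only a sketch: the inputs and the binding names are
       assumptions, and id is a local identity receiver defined for brevity.

```ocaml
(* Identity receiver: returns the single value that was read. *)
let id x = x

(* Integer conversions. *)
let dec = Scanf.sscanf "-12" "%d" id      (* signed decimal *)
let any = Scanf.sscanf "0x1F" "%i" id     (* %i understands the 0x prefix *)
let hex = Scanf.sscanf "ff" "%x" id       (* hexadecimal digits, no prefix *)
let oct = Scanf.sscanf "17" "%o" id       (* octal digits, no prefix *)

(* String and character conversions. *)
let word = Scanf.sscanf "hello world" "%s" id   (* stops at the space *)
let quoted = Scanf.sscanf "\"a b\"" "%S" id     (* OCaml-style delimited string *)
let ch = Scanf.sscanf "z" "%c" id

(* Other conversions. *)
let flag = Scanf.sscanf "true" "%B" id
let real = Scanf.sscanf "3.5" "%f" id
let digits = Scanf.sscanf "123abc" "%[0-9]" id  (* range of characters *)
```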

       Following  the  %  character that introduces a conversion, there may be
       the special flag _ : the conversion that follows occurs as  usual,  but
       the  resulting  value is discarded.  For instance, if f is the function
       fun i -> i + 1 , and s is the string "x = 1" , then Scanf.sscanf s "%_s
       = %i" f returns 2 .
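
       A sketch of the _ flag, first with the example from the text and then
       with a hypothetical comma-separated record in which only the third
       field is wanted.

```ocaml
(* The example from the text: the string read by %_s is discarded, so
   the receiver takes a single argument. *)
let f i = i + 1
let r = Scanf.sscanf "x = 1" "%_s = %i" f

(* Hypothetical record "a,b,3": the first two fields are skipped with
   discarded range conversions. *)
let third = Scanf.sscanf "a,b,3" "%_[^,],%_[^,],%d" (fun n -> n)
```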

       The  field  width is composed of an optional integer literal indicating
       the maximal width of the token to read.  For instance, %6d reads an in-
       teger,  having at most 6 decimal digits; %4f reads a float with at most
       4 characters; and %8[\000-\255] returns the next 8 characters  (or  all
       the  characters  still available, if fewer than 8 characters are avail-
       able in the input).
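
       Field widths make it possible to split fixed-width input. The sketch
       below assumes a date written as eight digits ( YYYYMMDD ); the input
       and the name date are illustrative.

```ocaml
(* Each conversion reads at most the given number of characters. *)
let date =
  Scanf.sscanf "20240623" "%4d%2d%2d" (fun year month day -> (year, month, day))
```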

       Notes:

       -as mentioned above, a %s conversion always succeeds, even if there  is
       nothing to read in the input: in this case, it simply returns "" .

       -in addition to the relevant digits, '_' characters may appear inside
       numbers (this is reminiscent of the usual OCaml lexical conventions).
       If stricter scanning is desired, use the range conversion facility
       instead of the number conversions.

       -the scanf facility is not intended for heavy duty lexical analysis and
       parsing. If it appears not expressive enough for your needs, several
       alternatives exist: regular expressions (module Str ), stream parsers,
       ocamllex -generated lexers, ocamlyacc -generated parsers.
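
       The note about '_' characters can be illustrated as follows. The
       helper strict_int is a hypothetical name; it is one possible way to
       get stricter scanning with a range conversion, not the only one.

```ocaml
(* %d accepts '_' separators inside the number. *)
let lenient = Scanf.sscanf "1_000" "%d" (fun n -> n)

(* Stricter variant: only the characters 0-9 are accepted, and %!
   requires the end of input right after them. *)
let strict_int s =
  try Some (Scanf.sscanf s "%[0-9]%!" int_of_string) with
  | Scanf.Scan_failure _ | Failure _ -> None
```

       Here strict_int rejects "1_000" because the range stops at the '_'
       character and %! then fails to find the end of input.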

   Scanning indications in format strings
       Scanning indications appear just after the string conversions %s and %[
       range ] to delimit the end of the token. A scanning indication  is  in-
       troduced  by  a  @  character,  followed by some plain character c . It
       means that the string token should end just before the next matching  c
       (which  is skipped). If no c character is encountered, the string token
       spreads as much as possible. For instance, "%s@\t" reads a string up to
       the next tab character or to the end of input. If a @ character appears
       anywhere else in the format string, it is treated as a plain character.

       Notes:

       -As usual in format strings, % and @ characters must be  escaped  using
       %% and %@ ; this rule still holds within range specifications and scan-
       ning indications.  For instance, format "%s@%%" reads a  string  up  to
       the  next % character, and format "%s@%@" reads a string up to the next
       @ .

       -The scanning indications introduce slight differences in the syntax of
       Scanf  format  strings,  compared  to those used for the Printf module.
       However, the scanning indications are similar to those used in the For-
       mat  module;  hence,  when  producing  formatted  text to be scanned by
       Scanf.bscanf , it is wise to use printing  functions  from  the  Format
       module  (or, if you need to use functions from Printf , banish or care-
       fully double check the format strings that contain '@' characters).
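
       A short sketch of token delimiting, once with a scanning indication
       and once with a negated range. The inputs and binding names are
       assumptions made for the example.

```ocaml
(* "%s@:" reads a string up to the next ':' character, which is
   skipped; the second %s reads the rest of the input. *)
let key, value = Scanf.sscanf "name:value" "%s@:%s" (fun k v -> (k, v))

(* A negated range reads everything up to the next comma, spaces
   included; the comma is then matched as a plain character. *)
let person =
  Scanf.sscanf "John Smith,42" "%[^,],%d" (fun name age -> (name, age))
```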

   Exceptions during scanning
       Scanners may raise the following exceptions when the  input  cannot  be
       read according to the format string:

       -Scanf.Scan_failure is raised if the input does not match the format.

       -Failure is raised if a conversion to a number is not possible.

       -End_of_file is raised if the end of input is encountered while some
       more characters are needed to read the current conversion
       specification.

       -Invalid_argument is raised if the format string is invalid.

       Note:

       -as a consequence, scanning a  %s  conversion  never  raises  exception
       End_of_file  :  if  the end of input is reached the conversion succeeds
       and simply returns the characters read so far, or "" if none were  ever
       read.
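
       One way to handle these exceptions is sketched below. The type
       outcome and the function parse_int are hypothetical names introduced
       for the example.

```ocaml
(* Hypothetical result type, one case per exception handled. *)
type outcome = Value of int | No_match | Bad_number | Truncated

let parse_int s =
  try Value (Scanf.sscanf s "%d" (fun n -> n)) with
  | Scanf.Scan_failure _ -> No_match   (* input does not match the format *)
  | Failure _ -> Bad_number            (* numeric conversion failed *)
  | End_of_file -> Truncated           (* input ended before a digit was read *)
```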

   Specialised formatted input functions
       val sscanf : string -> ('a, 'b, 'c, 'd) scanner

       Same as Scanf.bscanf , but reads from the given string.

       val scanf : ('a, 'b, 'c, 'd) scanner

       Same  as  Scanf.bscanf  , but reads from the predefined formatted input
       channel Scanf.Scanning.stdin that is connected to stdin .

       val kscanf : Scanning.in_channel -> (Scanning.in_channel -> exn ->  'd)
       -> ('a, 'b, 'c, 'd) scanner

       Same  as  Scanf.bscanf  ,  but takes an additional function argument ef
       that is called in case of error: if the scanning process or  some  con-
       version  fails,  the  scanning function aborts and calls the error han-
       dling function ef with the formatted input channel  and  the  exception
       that aborted the scanning process as arguments.

       val  ksscanf : string -> (Scanning.in_channel -> exn -> 'd) -> ('a, 'b,
       'c, 'd) scanner

       Same as Scanf.kscanf but reads from the given string.

       Since 4.02.0
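
       The error handler of ksscanf can turn a scanning error into an
       ordinary value. A minimal sketch, assuming a hypothetical helper
       int_opt that returns an option:

```ocaml
(* The handler ignores the channel and the exception, and returns
   None; on success the receiver wraps the integer in Some. *)
let int_opt s =
  Scanf.ksscanf s (fun _ib _exn -> None) "%d" (fun n -> Some n)
```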

   Reading format strings from input
       val bscanf_format : Scanning.in_channel -> ('a, 'b,  'c,  'd,  'e,  'f)
       format6 -> (('a, 'b, 'c, 'd, 'e, 'f) format6 -> 'g) -> 'g

       bscanf_format  ic  fmt f reads a format string token from the formatted
       input channel ic , according to the given format string fmt ,  and  ap-
       plies f to the resulting format string value.

       Since 3.09.0

       Raises  Scan_failure  if the format string value read does not have the
       same type as fmt .

       val sscanf_format : string -> ('a, 'b, 'c, 'd, 'e, 'f) format6 -> (('a,
       'b, 'c, 'd, 'e, 'f) format6 -> 'g) -> 'g

       Same as Scanf.bscanf_format , but reads from the given string.

       Since 3.09.0

       val  format_from_string : string -> ('a, 'b, 'c, 'd, 'e, 'f) format6 ->
       ('a, 'b, 'c, 'd, 'e, 'f) format6

       format_from_string s fmt converts a string argument to a format string,
       according to the given format string fmt .

       Since 3.10.0

       Raises Scan_failure if s , considered as a format string, does not have
       the same type as fmt .
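
       A sketch of format_from_string , where the specification "%i" requires
       a format that handles exactly one integer. The names message and
       mismatch are illustrative.

```ocaml
(* "number is %u" has the same type as "%i", so the conversion
   succeeds and the result can be used with Printf.sprintf. *)
let message =
  Printf.sprintf (Scanf.format_from_string "number is %u" "%i") 3

(* "%s" expects a string argument, not an int, so the conversion is
   rejected with Scan_failure. *)
let mismatch =
  try
    ignore (Printf.sprintf (Scanf.format_from_string "%s" "%i") 3);
    false
  with Scanf.Scan_failure _ -> true
```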

       val unescaped : string -> string

       unescaped s returns a copy of s with escape sequences (according to the
       lexical conventions of OCaml) replaced by their corresponding special
       characters. More precisely, Scanf.unescaped has the following
       property: for every string s , Scanf.unescaped (String.escaped s) = s .

       Always returns a copy of the argument, even if there is no escape
       sequence in the argument.

       Since 4.00.0

       Raises Scan_failure if s is not properly escaped (i.e.  s  has  invalid
       escape  sequences or special characters that are not properly escaped).
       For instance, Scanf.unescaped "\"" will fail.
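
       The behaviour of unescaped can be checked with a few bindings; the
       inputs and names below are assumptions made for this sketch.

```ocaml
(* The four characters a, backslash, n, b become the three
   characters a, newline, b. *)
let decoded = Scanf.unescaped "a\\nb"

(* The round-trip property stated above, on one sample string. *)
let round_trip = Scanf.unescaped (String.escaped "tab\there") = "tab\there"

(* A lone double quote is not properly escaped. *)
let rejects_bare_quote =
  try
    ignore (Scanf.unescaped "\"");
    false
  with Scanf.Scan_failure _ -> true
```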

   Deprecated
       val fscanf : in_channel -> ('a, 'b, 'c, 'd) scanner

       Deprecated.

       Scanf.fscanf is error prone and deprecated since 4.03.0.

       This function violates the following invariant of the Scanf module:  To
       preserve  scanning  semantics,  all scanning functions defined in Scanf
       must read from a user defined Scanf.Scanning.in_channel formatted input
       channel.

       If you need to read from an in_channel input channel ic , simply define
       a Scanf.Scanning.in_channel formatted input channel as in let ib =
       Scanning.from_channel ic , then use Scanf.bscanf ib as usual.
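
       The replacement for fscanf suggested above can be sketched as
       follows. The function names read_two_ints and demo , and the use of a
       temporary file holding "10 20", are assumptions made so that the
       example is self-contained.

```ocaml
(* Wrap the in_channel in a formatted input channel, then scan with
   bscanf as usual. *)
let read_two_ints path =
  let ic = open_in path in
  let ib = Scanf.Scanning.from_channel ic in
  let result = Scanf.bscanf ib " %d %d" (fun a b -> (a, b)) in
  close_in ic;
  result

(* Writes a small temporary file, reads it back, and removes it. *)
let demo () =
  let path = Filename.temp_file "scanf_demo" ".txt" in
  let oc = open_out path in
  output_string oc "10 20\n";
  close_out oc;
  let result = read_two_ints path in
  Sys.remove path;
  result
```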

       val  kfscanf : in_channel -> (Scanning.in_channel -> exn -> 'd) -> ('a,
       'b, 'c, 'd) scanner

       Deprecated.

       Scanf.kfscanf is error prone and deprecated since 4.03.0.

OCamldoc                          2023-02-12                  Stdlib.Scanf(3o)
