Genlex(3o)                       OCaml library                      Genlex(3o)

NAME
       Genlex - A generic lexical analyzer.

Module
       Module   Genlex

Documentation
       Module Genlex
        : sig end

       A generic lexical analyzer.

       This  module implements a simple 'standard' lexical analyzer, presented
       as a function from character streams to token  streams.  It  implements
       roughly  the  lexical conventions of OCaml, but is parameterized by the
       set of keywords of your language.

       Example: a lexer suitable for a desk calculator is obtained by:

            let lexer = make_lexer ["+"; "-"; "*"; "/"; "let"; "="; "("; ")"]

       The associated parser would be a function from token streams to, for
       instance, int, and would have rules such as:

            let rec parse_expr = parser
              | [< n1 = parse_atom; n2 = parse_remainder n1 >] -> n2
            and parse_atom = parser
              | [< 'Int n >] -> n
              | [< 'Kwd "("; n = parse_expr; 'Kwd ")" >] -> n
            and parse_remainder n1 = parser
              | [< 'Kwd "+"; n2 = parse_expr >] -> n1 + n2
              | [< >] -> n1

       Note that the parser keyword and the associated stream notation are
       available only through camlp4 extensions. Sources that use them must
       therefore be preprocessed, e.g. by passing the "-pp" command-line
       switch to the compilers.
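
       Where camlp4 is not available, the same rules can be written as
       ordinary recursive functions. The following is only a sketch under
       stated assumptions, not part of Genlex: it declares a local token type
       with the two constructors the rules need, consumes a plain list
       instead of a stream, and introduces a hypothetical Parse_error
       exception for input that matches no rule.

```ocaml
(* Local stand-in for the Kwd and Int constructors of the Genlex token type,
   so that this sketch compiles without the Genlex module. *)
type token = Kwd of string | Int of int

(* Hypothetical exception, raised when no rule matches the input. *)
exception Parse_error

(* Each function takes the remaining tokens and returns the parsed value
   together with the tokens that are left over. *)
let rec parse_expr tokens =
  let n1, rest = parse_atom tokens in
  parse_remainder n1 rest

and parse_atom = function
  | Int n :: rest -> (n, rest)
  | Kwd "(" :: rest ->
    (match parse_expr rest with
     | n, Kwd ")" :: rest' -> (n, rest')
     | _ -> raise Parse_error)
  | _ -> raise Parse_error

and parse_remainder n1 = function
  | Kwd "+" :: rest ->
    let n2, rest' = parse_expr rest in
    (n1 + n2, rest')
  | rest -> (n1, rest)

let () =
  (* Tokens for the input "(1 + 2) + 3". *)
  let tokens = [Kwd "("; Int 1; Kwd "+"; Int 2; Kwd ")"; Kwd "+"; Int 3] in
  let value, rest = parse_expr tokens in
  assert (rest = []);
  Printf.printf "%d\n" value
```

       Returning the leftover tokens from each function plays the role that
       the implicit stream position plays in the camlp4 notation.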

       type token =
        | Kwd of string
        | Ident of string
        | Int of int
        | Float of float
        | String of string
        | Char of char

       The type of tokens. The lexical classes are: Int and Float for integer
       and floating-point numbers; String for string literals, enclosed in
       double quotes; Char for character literals, enclosed in single quotes;
       Ident for identifiers (either sequences of letters, digits, underscores
       and quotes, or sequences of 'operator characters' such as +, *, etc.);
       and Kwd for keywords (either identifiers or single 'special characters'
       such as (, }, etc.).
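
       As an illustration of these lexical classes, the sketch below lexes
       one lexeme of each non-keyword kind with an empty keyword list. It
       assumes an OCaml 4.x installation, where Stream and Genlex are part of
       the standard library (on OCaml 5 they are provided by the
       camlp-streams package); class_of and to_list are hypothetical helpers
       introduced only for this example.

```ocaml
(* Assumes OCaml 4.x, where Stream and Genlex ship with the standard
   library; on OCaml 5 they come from the camlp-streams package. *)
open Genlex

(* With an empty keyword list, no identifier is returned as a Kwd. *)
let lexer = make_lexer []

(* Hypothetical helper: name the lexical class of a token. *)
let class_of = function
  | Kwd _ -> "Kwd"
  | Ident _ -> "Ident"
  | Int _ -> "Int"
  | Float _ -> "Float"
  | String _ -> "String"
  | Char _ -> "Char"

(* Hypothetical helper: drain a token stream into a list. *)
let rec to_list stream =
  match Stream.peek stream with
  | None -> []
  | Some tok -> Stream.junk stream; tok :: to_list stream

(* An identifier containing a quote, an integer, a float, a string
   literal, a character literal, and a run of operator characters. *)
let sample = to_list (lexer (Stream.of_string "x' 42 3.5 \"hi\" 'c' ++"))

let () = List.iter (fun t -> print_endline (class_of t)) sample
```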

       val make_lexer : string list -> char Stream.t -> token Stream.t

       Construct the lexer function. The first argument is the list of
       keywords. An identifier s is returned as Kwd s if s belongs to this
       list, and as Ident s otherwise. A special character s is returned as
       Kwd s if s belongs to this list, and causes a lexical error (exception
       Stream.Error with the offending lexeme as its parameter) otherwise.
       Blanks and newlines are skipped. Comments delimited by (* and *) are
       skipped as well, and can be nested. A Stream.Failure exception is
       raised if the end of the stream is unexpectedly reached.
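
       The behaviour described above can be observed with a short program.
       This is a sketch that assumes OCaml 4.x, where Stream and Genlex are
       part of the standard library (on OCaml 5 they are provided by the
       camlp-streams package); tokens_of_string and is_lexical_error are
       hypothetical helpers, not part of the module.

```ocaml
(* Assumes OCaml 4.x, where Stream and Genlex ship with the standard
   library; on OCaml 5 they come from the camlp-streams package. *)
open Genlex

(* Keyword list of the desk-calculator lexer. *)
let lexer = make_lexer ["+"; "-"; "*"; "/"; "let"; "="; "("; ")"]

(* Hypothetical helper: lex a whole string into a list of tokens. *)
let tokens_of_string (s : string) : token list =
  let stream = lexer (Stream.of_string s) in
  let rec collect acc =
    match Stream.peek stream with
    | None -> List.rev acc
    | Some tok -> Stream.junk stream; collect (tok :: acc)
  in
  collect []

(* Hypothetical helper: true if lexing s raises Stream.Error. The message
   carried by the exception is deliberately not inspected. *)
let is_lexical_error (s : string) : bool =
  match tokens_of_string s with
  | _ -> false
  | exception Stream.Error _ -> true

let () =
  (* "let", "=", "(", "+" and ")" are in the keyword list; "x" is not.
     The trailing comment is skipped. *)
  tokens_of_string "let x = (1 + 2) (* comment *)"
  |> List.iter (function
       | Kwd s -> Printf.printf "Kwd %S\n" s
       | Ident s -> Printf.printf "Ident %S\n" s
       | Int n -> Printf.printf "Int %d\n" n
       | Float f -> Printf.printf "Float %g\n" f
       | String s -> Printf.printf "String %S\n" s
       | Char c -> Printf.printf "Char %C\n" c);
  (* "{" is a special character that is absent from the keyword list. *)
  Printf.printf "lexical error on \"{\": %b\n" (is_lexical_error "{")
```

       Since the token stream is produced lazily, a lexical error surfaces
       only when the offending token is requested, which is why the helper
       wraps the whole traversal rather than the call to the lexer.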

OCamldoc                          2023-02-12                        Genlex(3o)
