dwww Home | Manual pages | Find package

XS::Parse::Keyword(3pmUser Contributed Perl DocumentatiXS::Parse::Keyword(3pm)

NAME
       "XS::Parse::Keyword" - XS functions to assist in parsing keyword syntax

DESCRIPTION
       This module provides some XS functions to assist in writing syntax
       modules that provide new perl-visible syntax, primarily for authors of
       keyword plugins using the "PL_keyword_plugin" hook mechanism. It is
       unlikely to be of much use to anyone else; and highly unlikely to be
       any use when writing perl code using these. Unless you are writing a
       keyword plugin using XS, this module is not for you.

       This module is also currently experimental, and the design is still
       evolving and subject to change. Later versions may break ABI
       compatibility, requiring changes or at least a rebuild of any module
       that depends on it.

XS FUNCTIONS
   boot_xs_parse_keyword
          void boot_xs_parse_keyword(double ver);

       Call this function from your "BOOT" section in order to initialise the
       module and parsing hooks.

       ver should either be 0 or a decimal number for the module version
       requirement; e.g.

          boot_xs_parse_keyword(0.14);

   register_xs_parse_keyword
          void register_xs_parse_keyword(const char *keyword,
            const struct XSParseKeywordHooks *hooks, void *hookdata);

       This function installs a set of parsing hooks to be associated with the
       given keyword. Such a keyword will then be handled automatically by a
       keyword parser installed by "XS::Parse::Keyword" itself.

PARSE HOOKS
       The "XSParseKeywordHooks" structure provides the following hook stages,
       which are invoked in the given order.

   flags
       The following flags are defined:

       "XPK_FLAG_EXPR"
           The parse or build function is expected to return
           "KEYWORD_PLUGIN_EXPR".

       "XPK_FLAG_STMT"
           The parse or build function is expected to return
           "KEYWORD_PLUGIN_STMT".

           These two flags are largely for the benefit of giving static
           information at registration time to assist static parsing or other
           related tasks to know what kind of grammatical element this keyword
           will produce.

       "XPK_FLAG_AUTOSEMI"
           The syntax forms a complete statement, which should be followed by
           a statement separator semicolon (";"). This semicolon is optional
           at the end of a block.

           The semicolon, if present, will be consumed automatically.

   The "permit" Stage
          const char *permit_hintkey;
          bool (*permit) (pTHX_ void *hookdata);

       Called by the installed keyword parser hook which is used to handle
       keywords registered by "register_xs_parse_keyword".

       As a shortcut for the common case, the "permit_hintkey" may point to a
       string to look up from the hints hash. If the given key name is not
       found in the hints hash then the keyword is not permitted. If the key
       is present then the "permit" function is invoked as normal.

       If not rejected by a hint key that was not found in the hints hash, the
       function part of the stage is called next and should inspect whether
       the keyword is permitted at this time perhaps by inspecting other
       lexical clues, and return true only if the keyword is permitted.

       Both the string and the function are optional. Either or both may be
       present.  If neither is present then the keyword is always permitted -
       which is likely not what you wanted to do.

   The "check" Stage
          void (*check)(pTHX_ void *hookdata);

       Invoked once the keyword has been permitted. If present, this hook
       function can check the surrounding lexical context, state, or other
       information and throw an exception if it is unhappy that the keyword
       should apply in this position.

   The "parse" Stage
       This stage is invoked once the keyword has been checked, and actually
       parses the incoming text into an optree. It is implemented by calling
       the first of the following function pointers which is not NULL. The
       invoked function may optionally build an optree to represent the parsed
       syntax, and place it into the variable addressed by "out". If it does
       not, then a simple "OP_NULL" will be constructed in its place.

       "lex_read_space()" is called both before and after this stage is
       invoked, so in many simple cases the hook function itself does not need
       to bother with it.

          int (*parse)(pTHX_ OP **out, void *hookdata);

       If present, this should consume text from the parser buffer by invoking
       "lex_*" or "parse_*" functions and eventually return a
       "KEYWORD_PLUGIN_*" result value.

       This is the most generic and powerful of the options, but requires the
       most amount of implementation work.

          int (*build)(pTHX_ OP **out, XSParseKeywordPiece *args[], size_t nargs, void *hookdata);

       If "parse" is not present, this is called instead after parsing a
       sequence of arguments, of types given by the pieces field; which should
       be a zero- terminated array of piece types.

       This alternative is somewhat less generic and powerful than providing
       "parse" yourself, but involves much less parsing work and is shorter
       and easier to implement.

          int (*build1)(pTHX_ OP **out, XSParseKeywordPiece *arg0, void *hookdata);

       If neither "parse" nor "build" are present, this is called as a simpler
       variant of "build" when only a single argument is required. It takes
       its type from the "piece1" field instead.

PIECES AND PIECE TYPES
       When using the "build" or "build1" alternatives for the "parse" phase,
       the actual syntax is parsed automatically by this module, according to
       the specification given by the pieces or piece1 field. The result of
       that parsing step is placed into the args or arg0 parameter to the
       invoked function, using a "struct" type consisting of the following
       fields:

          typedef struct
             union {
                OP *op;
                CV *cv;
                SV *sv;
                int i;
                struct {
                   SV *name;
                   SV *value;
                } attr;
                PADOFFSET padix;
                struct XSParseInfixInfo *infix;
             };
             int line;
          } XSParseKeywordPiece;

       Which field of the anonymous union is set depends on the type of the
       piece.  The line field contains the line number of the source file
       where parsing of that piece began.

       Some piece types are "atomic", whose definition is self-contained.
       Others are structural, defined in terms of inner pieces. Together these
       form an entire tree-shaped definition of the syntax that the keyword
       expects to find.

       Atomic types generally provide exactly one argument into the list of
       args (with the exception of literal matches, which do not provide
       anything).  Structural types may provide an initial argument
       themselves, followed by a list of the values of each sub-piece they
       contained inside them. Thus, while the data structure defining the
       syntax shape is a tree, the argument values it parses into is passed as
       a flat array to the "build" function.

       Some structural types need to be able to determine whether or not
       syntax relating some optional part of them is present in the incoming
       source text. In this case, the pieces relating to those optional parts
       must support "probing".  This ability is also noted below.

       The type of each piece should be one of the following macro values.

   XPK_BLOCK
       atomic, can probe, emits op.

          XPK_BLOCK

       A brace-delimited block of code is expected, passed as an optree in the
       op field. This will be parsed as a block within the current function
       scope.

       This can be probed by checking for the presence of an open-brace ("{")
       character.

       Be careful defining grammars with this because an open-brace is also a
       valid character to start a term expression, for example. Given a choice
       between "XPK_BLOCK" and "XPK_TERMEXPR", either of them could try to
       consume such code as

          { 123, 456 }

   XPK_BLOCK_VOIDCTX, XPK_BLOCK_SCALARCTX, XPK_BLOCK_LISTCTX
       Variants of "XPK_BLOCK" which wrap a void, scalar or list-context scope
       around the block.

   XPK_PREFIXED_BLOCK
       structural, emits op.

          XPK_PREFIXED_BLOCK(pieces ...)

       Some pieces are expected, followed by a brace-delimited block of code,
       which is passed as an optree in the op field. The prefix pieces are
       parsed first, and their results are passed before the block itself.

       The entire sequence, including the prefix items, is contained within a
       pair of "block_start()" / "block_end()" calls. This permits the prefix
       pieces to introduce new items into the lexical scope of the block - for
       example by the use of "XPK_LEXVAR_MY".

       A call to "intro_my()" is automatically made at the end of the prefix
       pieces, before the block itself is parsed, ensuring any new lexical
       variables are now visible.

       In addition, the following extra piece types are recognised here:

       XPK_SETUP
              void setup(pTHX_ void *hookdata);

              XPK_SETUP(&setup)

           atomic, emits nothing.

           This piece type runs a function given by pointer. Typically this
           function may be used to introduce new lexical state into the
           parser, or in some other way have some side-effect on the parsing
           context of the block to be parsed.

   XPK_PREFIXED_BLOCK_ENTERLEAVE
       A variant of "XPK_PREFIXED_BLOCK" which additionally wraps the entire
       parsing operation, including the "block_start()", "block_end()" and any
       calls to "XPK_SETUP" functions, within a "ENTER"/"LEAVE" pair.

       This should not make a difference to the standard parser pieces
       provided here, but may be useful behaviour for the code in the setup
       function, especially if it wishes to modify parser state and use the
       savestack to ensure it is restored again when parsing has finished.

   XPK_ANONSUB
       atomic, emits cv.

       A brace-delimited block of code is expected, and assembled into the
       body of a new anonymous subroutine. This will be passed as a protosub
       CV in the cv field.

   XPK_STAGED_ANONSUB
          XPK_STAGED_ANONSUB(stages ...)

       structural, emits cv.

       A variant of "XPK_ANONSUB" which accepts additional function pointers
       to be invoked at various points during parsing and compilation. These
       can be used to interrupt the normal parsing in a manner similar to
       XS::Parse::Sublike, though currently somewhat less flexibly.

       The stages list may contain elements of the following types. Not every
       stage must be present, but any that are present must be in the
       following order. Multiple copies of each stage are permitted; they are
       invoked in the written order, with parser code happening inbetween.

       XPK_ANONSUB_PREPARE
              XPK_ANONSUB_PREPARE(&callback)

           atomic, emits nothing.

           Invokes the callback before "start_subparse()".

       XPK_ANONSUB_START
              XPK_ANONSUB_START(&callback)

           atomic, emits nothing.

           Invokes the callback after "block_start()" but before parsing the
           actual block contents.

       XPK_ANONSUB_END
              OP *op_wrapper_callback(pTHX_ OP *o, void *hookdata);

              XPK_ANONSUB_END(&op_wrapper_callback)

           atomic, emits nothing.

           Invokes the callback after parsing the block contents but before
           calling "block_end()". The callback may modify the optree if
           required and return a new one.

       XPK_ANONSUB_WRAP
              XPK_ANONSUB_WRAP(&op_wrapper_callback)

           atomic, emits nothing.

           Invokes the callback after "block_end()" but before passing the
           optree to "newATTRSUB()". The callback may modify the optree if
           required and return a new one.

   XPK_ARITHEXPR
       atomic, emits op.

          XPK_ARITHEXPR

       An arithmetic expression is expected, parsed using "parse_arithexpr()",
       and passed as an optree in the op field.

   XPK_ARITHEXPR_VOIDCTX, XPK_ARITHEXPR_SCALARCTX
       Variants of "XPK_ARITHEXPR" which puts the expression in void or scalar
       context.

   XPK_TERMEXPR
       atomic, emits op.

          XPK_TERMEXPR

       A term expression is expected, parsed using "parse_termexpr()", and
       passed as an optree in the op field.

   XPK_TERMEXPR_VOIDCTX, XPK_TERMEXPR_SCALARCTX
       Variants of "XPK_TERMEXPR" which puts the expression in void or scalar
       context.

   XPK_PREFIXED_TERMEXPR_ENTERLEAVE
          XPK_PREFIXED_TERMEXPR_ENTERLEAVE(pieces ...)

       A variant of "XPK_TERMEXPR" which expects a sequence pieces first
       before it parses a term expression, similar to how
       "XPK_PREFIXED_BLOCK_ENTERLEAVE" works. The entire operation is wrapped
       in an "ENTER"/"LEAVE" pair.

       This is intended just for use of "XPK_SETUP" pieces as prefixes. Any
       other pieces which actually parse real input are likely to cause
       overly-complex, subtle, or outright ambiguous grammars, and should be
       avoided.

   XPK_LISTEXPR
       atomic, emits op.

          XPK_LISTEXPR

       A list expression is expected, parsed using "parse_listexpr()", and
       passed as an optree in the op field.

   XPK_LISTEXPR_LISTCTX
       Variant of "XPK_LISTEXPR" which puts the expression in list context.

   XPK_IDENT, XPK_IDENT_OPT
       atomic, can probe, emits sv.

       A bareword identifier name is expected, and passed as an SV containing
       a PV in the sv field. An identifier is not permitted to contain a
       double colon ("::").

       The "_OPT"-suffixed version is optional; if no identifier is found then
       sv is set to "NULL".

   XPK_PACKAGENAME, XPK_PACKAGENAME_OPT
       atomic, can probe, emits sv.

       A bareword package name is expected, and passed as an SV containing a
       PV in the sv field. A package name is similar to an identifier, except
       it permits double colons in the middle.

       The "_OPT"-suffixed version is optional; if no package name is found
       then sv is set to "NULL".

   XPK_LEXVARNAME
       atomic, emits sv.

          XPK_LEXVARNAME(kind)

       A lexical variable name is expected, and passed as an SV containing a
       PV in the sv field. The "kind" argument specifies what kinds of
       variable are permitted, and should be a bitmask of one or more bits
       from "XPK_LEXVAR_SCALAR", "XPK_LEXVAR_ARRAY" and "XPK_LEXVAR_HASH". A
       convenient shortcut "XPK_LEXVAR_ANY" permits all three.

   XPK_ATTRIBUTES
       atomic, emits i followed by more args.

       A list of ":"-prefixed attributes is expected, in the same format as
       sub or variable attributes. An optional leading ":" indicates the
       presence of attributes, then one or more of them are parsed. Attributes
       may be optionally separated by additional ":"s, but this is not
       required.

       Each attribute is expected to be an identifier name, followed by an
       optional value wrapped in parentheses. Whitespace is NOT permitted
       between the name and value, as per standard Perl parsing rules.

          :attrname
          :attrname(value)

       The i field indicates how many attributes were found. That number of
       additional arguments are then passed, each containing two SVs in the
       attr.name and attr.value fields. This number may be zero.

       It is not an error for there to be no attributes present, or for the
       optional colon to be missing. In this case i will be set to zero.

   XPK_VSTRING, XPK_VSTRING_OPT
       atomic, can probe, emits sv.

       A version string is expected, of the form "v1.234" including the
       leading "v" character. It is passed as a version SV object in the sv
       field.

       The "_OPT"-suffixed version is optional; if no version string is found
       then sv is set to "NULL".

   XPK_LEXVAR
       atomic, emits padix.

          XPK_LEXVAR(kind)

       A lexical variable name is expected and looked up from the current pad.
       The resulting pad index is passed in the padix field. No error happens
       if the variable is not found; the value "NOT_IN_PAD" is passed instead.

       The "kind" argument specifies what kinds of variable are permitted, as
       per "XPK_LEXVARNAME".

   XPK_LEXVAR_MY
       atomic, emits padix.

          XPK_LEXVAR_MY(kind)

       A lexical variable name is expected, added to the current pad as if
       specified in a "my" expression, and passed as the pad index in the
       padix field.

       The "kind" argument specifies what kinds of variable are permitted, as
       per "XPK_LEXVARNAME".

   XPK_COMMA, XPK_COLON, XPK_EQUALS
       atomic, can probe, emits nothing.

       A literal character (",", ":" or "=") is expected. No argument value is
       passed.

   XPK_AUTOSEMI
       atomic, emits nothing.

       A literal semicolon (";") as a statement terminator is optionally
       expected.  If the next token is a closing brace to indicate the end of
       a block, then a semicolon is not required. If anything else is
       encountered an error will be raised.

       This piece type is the same as specifying the "XPK_FLAG_AUTOSEMI". It
       is useful to put at the end of a sequence that forms part of a choice
       of syntax, where some forms indicate a statement ending in a semicolon,
       whereas others may end in a full block that does not need one.

   XPK_INFIX_*
       atomic, can probe, emits infix.

       An infix operator as recognised by XS::Parse::Infix. The returned
       pointer points to a structure allocated by "XS::Parse::Infix"
       describing the operator.

       Various versions of the macro are provided, each using a different
       selection filter to choose certain available infix operators:

          XPK_INFIX_RELATION         # any relational operator
          XPK_INFIX_EQUALITY         # an equality operator like `==` or `eq`
          XPK_INFIX_MATCH_NOSMART    # any sort of "match"-like operator, except smartmatch
          XPK_INFIX_MATCH_SMART      # XPK_INFIX_MATCH_NOSMART plus smartmatch

   XPK_LITERAL
       atomic, can probe, emits nothing.

          XPK_LITERAL("literal")

       A literal string match is expected. No argument value is passed.

       This form should generally be avoided if at all possible, because it is
       very easy to abuse to make syntaxes which confuse humans and code tools
       alike.  Generally it is best reserved just for the first component of a
       "XPK_OPTIONAL" or "XPK_REPEATED" sequence, to provide a "secondary
       keyword" that such a repeated item can look out for.

   XPK_KEYWORD
       atomic, can probe, emits nothing.

          XPK_KEYWORD("keyword")

       A literal string match is expected. No argument value is passed.

       This is similar to "XPK_LITERAL" except that it additionally checks
       that the following character is not an identifier character. This
       ensures that the expected keyword-like behaviour is preserved. For
       example, given the input "keyword", the piece "XPK_LITERAL("key")"
       would match it, whereas "XPK_KEYWORD("key")" would not because of the
       subsequent "w" character.

   XPK_SEQUENCE
       structural, might support probe, emits nothing.

          XPK_SEQUENCE(pieces ...)

       A structural type which contains a number of pieces. This is normally
       equivalent to simply placing the pieces in sequence inside their own
       container, but it is useful inside "XPK_CHOICE" or "XPK_TAGGEDCHOICE".

       An "XPK_SEQUENCE" supports probe if its first contained piece does;
       i.e.  is transparent to probing.

   XPK_OPTIONAL
       structural, emits i.

          XPK_OPTIONAL(pieces ...)

       A structural type which may expects to find its contained pieces, or is
       happy not to. This will pass an argument whose i field contains either
       1 or 0, depending whether the contents were found. The first piece type
       within must support probe.

   XPK_REPEATED
       structural, emits i.

          XPK_REPEATED(pieces ...)

       A structural type which expects to find zero or more repeats of its
       contained pieces. This will pass an argument whose i field contains the
       count of the number of repeats it found. The first piece type within
       must support probe.

   XPK_CHOICE
       structural, can probe, emits i.

          XPK_CHOICE(options ...)

       A structural type which expects to find one of a number of alternative
       options. An ordered list of types is provided, all of which must
       support probe. This will pass an argument whose i field gives the index
       of the first choice that was accepted. The first option takes the value
       0.

       As each of the options is interpreted as an alternative, not a
       sequence, you should use "XPK_SEQUENCE" if a sequence of multiple items
       should be considered as a single alternative.

       It is not an error if no choice matches. At that point, the i field
       will be set to -1.

       If you require a failure message in this case, set the final choice to
       be of type "XPK_FAILURE". This will cause an error message to be
       printed instead.

          XPK_FAILURE("message string")

   XPK_TAGGEDCHOICE
       structural, can probe, emits i.

          XPK_TAGGEDCHOICE(choice, tag, ...)

       A structural type similar to "XPK_CHOICE", except that each choice type
       is followed by an element of type "XPK_TAG" which gives an integer. It
       is that integer value, rather than the positional index of the choice
       within the list, which is passed in the i field.

          XPK_TAG(value)

       As each of the options is interpreted as an alternative, not a
       sequence, you should use "XPK_SEQUENCE" if a sequence of multiple items
       should be considered as a single alternative.

   XPK_COMMALIST
       structural, might support probe, emits i.

          XPK_COMMALIST(pieces ...)

       A structural type which expects to find one or more repeats of its
       contained pieces, separated by literal comma (",") characters. This is
       somewhat similar to "XPK_REPEATED", except that it needs at least one
       copy, needs commas between its items, but does not require that the
       first contained piece support probe (the comma itself is sufficient to
       indicate a repeat).

       An "XPK_COMMALIST" supports probe if its first contained piece does;
       i.e.  is transparent to probing.

   XPK_PARENSCOPE
       structural, can probe, emits nothing.

          XPK_PARENSCOPE(pieces ...)

       A structural type which expects to find a sequence of pieces, all
       contained in parentheses as "( ... )". This will pass no extra
       arguments.

   XPK_ARGSCOPE
       structural, emits nothing.

          XPK_ARGSCOPE(pieces ...)

       A structural type similar to "XPK_PARENSCOPE", except that the
       parentheses themselves are optional; much like Perl's parsing of calls
       to known functions.

       If parentheses are encountered in the input, they will be consumed by
       this piece and it will behave identically to "XPK_PARENSCOPE". If there
       is no open parenthesis, this piece will behave like "XPK_SEQUENCE" and
       consume all the pieces inside it, without expecting a closing
       parenthesis.

   XPK_BRACKETSCOPE
       structural, can probe, emits nothing.

          XPK_BRACKETSCOPE(pieces ...)

       A structural type which expects to find a sequence of pieces, all
       contained in square brackets as "[ ... ]". This will pass no extra
       arguments.

   XPK_BRACESCOPE
       structural, can probe, emits nothing.

          XPK_BRACESCOPE(pieces ...)

       A structural type which expects to find a sequence of pieces, all
       contained in braces as "{ ... }". This will pass no extra arguments.

       Note that this is not necessary to use with "XPK_BLOCK" or
       "XPK_ANONSUB"; those will already consume a set of braces. This is
       intended for special constrained syntax that should not just accept an
       arbitrary block.

   XPK_CHEVRONSCOPE
       structural, can probe, emits nothing.

          XPK_CHEVRONSCOPE(pieces ...)

       A structural type which expects to find a sequence of pieces, all
       contained in angle brackets as "< ... >". This will pass no extra
       arguments.

       Remember that expressions like "a > b" are valid term expressions, so
       the contents of this scope shouldn't allow arbitrary expressions or the
       closing bracket will be ambiguous.

   XPK_PARENSCOPE_OPT, XPK_BRACKETSCOPE_OPT, XPK_BRACESCOPE_OPT,
       XPK_CHEVRONSCOPE_OPT
       structural, can probe, emits i.

          XPK_PARENSCOPE_OPT(pieces ...)
          XPK_BRACKETSCOPE_OPT(pieces ...)
          XPK_BRACESCOPE_OPT(pieces ...)
          XPK_CHEVERONSCOPE_OPT(pieces ...)

       Each of the four "XPK_...SCOPE" macros above has an optional variant,
       whose name is suffixed by "_OPT". These pass an argument whose i field
       is either true or false, indicating whether the scope was found,
       followed by the values from the scope itself.

       This is a convenient shortcut to nesting the scope within a
       "XPK_OPTIONAL" macro.

   XPK_..._pieces
          XPK_SEQUENCE_pieces(ptr)
          XPK_OPTIONAL_pieces(ptr)
          ...

       For each of the "XPK_..." macros that takes a variable-length list of
       pieces, there is a variant whose name ends with "..._pieces", taking a
       single pointer argument directly. This must point at a "const
       XSParseKeywordPieceType []" array whose final element is the zero
       element.

       Normally hand-written C code of a fixed grammar would be unlikely to
       use these forms, but they may be useful in dynamically-generated cases.

AUTHOR
       Paul Evans <leonerd@leonerd.org.uk>

perl v5.36.0                      2023-02-20           XS::Parse::Keyword(3pm)

Generated by dwww version 1.15 on Fri Jun 21 07:37:31 CEST 2024.