                            2   Lexical Elements


1/3 {AI05-0299-1} [The text of a program consists of the texts of one or more
compilations. The text of a compilation is a sequence of lexical elements,
each composed of characters; the rules of composition are given in this
clause. Pragmas, which provide certain information for the compiler, are also
described in this clause.]


2.1 Character Set


1/3 {AI95-00285-01} {AI95-00395-01} {AI05-0266-1} The character repertoire for
the text of an Ada program consists of the entire coding space described by
the ISO/IEC 10646:2011 Universal Multiple-Octet Coded Character Set. This
coding space is organized in planes, each plane comprising 65536 characters.

1.a/2       This paragraph was deleted.{AI95-00285-01}

1.b/2       This paragraph was deleted.{AI95-00285-01}

1.c/3       Discussion: {AI95-00285-01} {AI05-0266-1} It is our intent to
            follow the terminology of ISO/IEC 10646:2011 where appropriate,
            and to remain compatible with the character classifications
            defined in A.3, "Character Handling".


                                   Syntax

        Paragraphs 2 and 3 were deleted.

3.1/3   {AI95-00285-01} {AI95-00395-01} {AI05-0266-1} A character is defined
        by this International Standard for each cell in the coding space
        described by ISO/IEC 10646:2011, regardless of whether or not ISO/IEC
        10646:2011 allocates a character to that cell.


                              Static Semantics

4/3 {AI95-00285-01} {AI95-00395-01} {AI05-0079-1} {AI05-0262-1} {AI05-0266-1}
The coded representation for characters is implementation defined [(it need
not be a representation defined within ISO/IEC 10646:2011)]. A character whose
relative code point in its plane is 16#FFFE# or 16#FFFF# is not allowed
anywhere in the text of a program. The only characters allowed outside of
comments are those in categories other_format, format_effector, and
graphic_character.

4.a         Implementation defined: The coded representation for the text of
            an Ada program.

4.b/2       Ramification: {AI95-00285-01} Note that this rule doesn't really
            have much force, since the implementation can represent characters
            in the source in any way it sees fit. For example, an
            implementation could simply define that what seems to be an
            other_private_use character is actually a representation of the
            space character.

4.1/3 {AI95-00285-01} {AI05-0266-1} {AI05-0299-1} The semantics of an Ada
program whose text is not in Normalization Form KC (as defined by Clause 21 of
ISO/IEC 10646:2011) is implementation defined.

4.c/2       Implementation defined: The semantics of an Ada program whose text
            is not in Normalization Form KC.
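
Because the semantics of text that is not in Normalization Form KC are implementation defined, a tool may want to detect such text before compiling. The following Python sketch is one way to do that; the function name is hypothetical, and it assumes that the Unicode tables shipped with the Python interpreter are an acceptable stand-in for the normalization rules of ISO/IEC 10646.

```python
import unicodedata


def is_nfkc(text):
    """Return True if text is already in Normalization Form KC.

    Assumption: Python's bundled Unicode data is used; its version may
    differ from the edition of ISO/IEC 10646 the compiler follows.
    """
    return unicodedata.normalize("NFKC", text) == text
```

A front end could, for example, issue a warning when `is_nfkc` returns False for a source file rather than rejecting it, since the standard leaves the behavior to the implementation.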

5/3 {AI95-00285-01} {AI05-0266-1} {AI05-0299-1} The description of the
language definition in this International Standard uses the character
properties General Category, Simple Uppercase Mapping, Uppercase Mapping, and
Special Case Condition of the documents referenced by the note in Clause 1 of
ISO/IEC 10646:2011. The actual set of graphic symbols used by an
implementation for the visual representation of the text of an Ada program is
not specified.

6/3 {AI95-00285-01} {AI05-0266-1} Characters are categorized as follows:

6.a/3       Discussion: {AI05-0005-1} {AI05-0262-1} {AI05-0266-1} Our
            character classification considers that the cells not allocated in
            ISO/IEC 10646:2011 are graphic characters, except for those whose
            relative code point in their plane is 16#FFFE# or 16#FFFF#. This
            seems to provide the best compatibility with future versions of
            ISO/IEC 10646, as future characters can already be used in Ada
            character and string literals.

7/2             This paragraph was deleted.{AI95-00285-01}

8/2 {AI95-00285-01} letter_uppercase
                Any character whose General Category is defined to be
                "Letter, Uppercase".

9/2 {AI95-00285-01} letter_lowercase
                Any character whose General Category is defined to be
                "Letter, Lowercase".

9.a/1       This paragraph was deleted.{8652/0001} {AI95-00124-01}

9.1/2 {AI95-00285-01} letter_titlecase
                Any character whose General Category is defined to be
                "Letter, Titlecase".

9.2/2 {AI95-00285-01} letter_modifier
                Any character whose General Category is defined to be
                "Letter, Modifier".

9.3/2 {AI95-00285-01} letter_other
                Any character whose General Category is defined to be
                "Letter, Other".

9.4/2 {AI95-00285-01} mark_non_spacing
                Any character whose General Category is defined to be "Mark,
                Non-Spacing".

9.5/2 {AI95-00285-01} mark_spacing_combining
                Any character whose General Category is defined to be "Mark,
                Spacing Combining".

10/2 {AI95-00285-01} number_decimal
                Any character whose General Category is defined to be
                "Number, Decimal".

10.1/2 {AI95-00285-01} number_letter
                Any character whose General Category is defined to be
                "Number, Letter".

10.2/2 {AI95-00285-01} punctuation_connector
                Any character whose General Category is defined to be
                "Punctuation, Connector".

10.3/2 {AI95-00285-01} other_format
                Any character whose General Category is defined to be "Other,
                Format".

11/2 {AI95-00285-01} separator_space
                Any character whose General Category is defined to be
                "Separator, Space".

12/2 {AI95-00285-01} separator_line
                Any character whose General Category is defined to be
                "Separator, Line".

12.1/2 {AI95-00285-01} separator_paragraph
                Any character whose General Category is defined to be
                "Separator, Paragraph".

13/3 {AI95-00285-01} {AI05-0262-1} format_effector
                The characters whose code points are 16#09# (CHARACTER
                TABULATION), 16#0A# (LINE FEED), 16#0B# (LINE TABULATION),
                16#0C# (FORM FEED), 16#0D# (CARRIAGE RETURN), 16#85# (NEXT
                LINE), and the characters in categories separator_line and
                separator_paragraph.

13.a/2      Discussion: ISO/IEC 10646:2003 does not define the names of
            control characters, but rather refers to the names defined by
            ISO/IEC 6429:1992. These are the names that we use here.

13.1/2 {AI95-00285-01} other_control
                Any character whose General Category is defined to be "Other,
                Control", and which is not defined to be a format_effector.

13.2/2 {AI95-00285-01} other_private_use
                Any character whose General Category is defined to be "Other,
                Private Use".

13.3/2 {AI95-00285-01} other_surrogate
                Any character whose General Category is defined to be "Other,
                Surrogate".

14/3 {AI95-00285-01} {AI95-00395-01} {AI05-0262-1} graphic_character
                Any character that is not in the categories other_control,
                other_private_use, other_surrogate, format_effector, and whose
                relative code point in its plane is neither 16#FFFE# nor
                16#FFFF#.
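
The categorization above can be sketched in Python using the General Category values from the standard `unicodedata` module. This is an illustration only: all function and table names are hypothetical, and the Unicode version bundled with the interpreter is assumed to be an acceptable basis (an implementation is free to pick its own version). Note that a character can fall into more than one category, so the sketch returns a set.

```python
import unicodedata

# General Category abbreviation -> category name used in this subclause.
_GC_TO_ADA = {
    "Lu": "letter_uppercase", "Ll": "letter_lowercase",
    "Lt": "letter_titlecase", "Lm": "letter_modifier", "Lo": "letter_other",
    "Mn": "mark_non_spacing", "Mc": "mark_spacing_combining",
    "Nd": "number_decimal", "Nl": "number_letter",
    "Pc": "punctuation_connector", "Cf": "other_format",
    "Zs": "separator_space", "Zl": "separator_line",
    "Zp": "separator_paragraph",
    "Co": "other_private_use", "Cs": "other_surrogate",
}
# Code points named explicitly in the definition of format_effector.
_FORMAT_EFFECTOR_CODES = {0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x85}


def is_format_effector(ch):
    return (ord(ch) in _FORMAT_EFFECTOR_CODES
            or unicodedata.category(ch) in ("Zl", "Zp"))


def is_graphic_character(ch):
    # Cells at relative code point 16#FFFE# or 16#FFFF# are excluded.
    if (ord(ch) & 0xFFFF) in (0xFFFE, 0xFFFF) or is_format_effector(ch):
        return False
    # Unallocated cells (General Category "Cn") count as graphic characters.
    return unicodedata.category(ch) not in ("Cc", "Co", "Cs")


def ada_categories(ch):
    """Return the set of category names that apply to one character."""
    gc = unicodedata.category(ch)
    names = set()
    if gc in _GC_TO_ADA:
        names.add(_GC_TO_ADA[gc])
    if is_format_effector(ch):
        names.add("format_effector")
    elif gc == "Cc":
        names.add("other_control")
    if is_graphic_character(ch):
        names.add("graphic_character")
    return names


def allowed_outside_comments(ch):
    allowed = {"other_format", "format_effector", "graphic_character"}
    return bool(ada_categories(ch) & allowed)
```

With these definitions, `allowed_outside_comments` rejects characters in other_control, other_private_use, and other_surrogate, as well as the two excluded cells of each plane.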

14.a/2      This paragraph was deleted.

14.b/2      Discussion: {AI95-00285-01} We considered basing the definition of
            lexical elements on Annex A of ISO/IEC TR 10176 (4th edition),
            which lists the characters which should be supported in
            identifiers for all programming languages, but we finally decided
            against this option. Note that it is not our intent to diverge
            from ISO/IEC TR 10176, except to the extent that ISO/IEC TR 10176
            itself diverges from ISO/IEC 10646:2003 (which is the case at the
            time of this writing [January 2005]).

14.c/2      More precisely, we intend to align strictly with ISO/IEC
            10646:2003. It must be noted that ISO/IEC TR 10176 is a Technical
            Report while ISO/IEC 10646:2003 is a Standard. If one has to make
            a choice, one should conform with the Standard rather than with
            the Technical Report. And, it turns out that one must make a
            choice because there are important differences between the two:

14.d/2        * ISO/IEC TR 10176 is still based on ISO/IEC 10646:2000 while
                ISO/IEC 10646:2003 has already been published for a year. We
                cannot afford to delay the adoption of our amendment until
                ISO/IEC TR 10176 has been revised.

14.e/2        * There are considerable differences between the two editions of
                ISO/IEC 10646, notably in supporting characters beyond the BMP
                (this might be significant for some languages, e.g. Korean).

14.f/2        * ISO/IEC TR 10176 does not define case conversion tables, which
                are essential for a case-insensitive language like Ada. To get
                case conversion tables, we would have to reference either
                ISO/IEC 10646:2003 or Unicode, or we would have to invent our
                own.

14.g/2      For the purpose of defining the lexical elements of the language,
            we need character properties like categorization, as well as case
            conversion tables. These are mentioned in ISO/IEC 10646:2003 as
            useful for implementations, with a reference to Unicode.
            Machine-readable tables are available on the web at URLs:

14.h/2          http://www.unicode.org/Public/4.0-Update/UnicodeData-4.0.0.txt
                http://www.unicode.org/Public/4.0-Update/CaseFolding-4.0.0.txt

14.i/2      with an explanatory document found at URL:

14.j/2          http://www.unicode.org/Public/4.0-Update/UCD-4.0.0.html

14.k/2      The actual text of the standard only makes specific references to
            the corresponding clauses of ISO/IEC 10646:2003, not to Unicode.

15/3 {AI95-00285-01} {AI05-0266-1} The following names are used when referring
to certain characters (the first name is that given in ISO/IEC 10646:2011):

15.a/3      Discussion: {AI95-00285-01} {AI05-0266-1} This table serves to
            show the correspondence between ISO/IEC 10646:2011 names and the
            graphic symbols (glyphs) used in this International Standard.
            These are the characters that play a special role in the syntax of
            Ada.

          graphic symbol             name

          "                          quotation mark
          #                          number sign
          &                          ampersand
          '                          apostrophe, tick
          (                          left parenthesis
          )                          right parenthesis
          *                          asterisk, multiply
          +                          plus sign
          ,                          comma
          -                          hyphen-minus, minus
          .                          full stop, dot, point
          :                          colon
          ;                          semicolon
          <                          less-than sign
          =                          equals sign
          >                          greater-than sign
          _                          low line, underline
          |                          vertical line
          /                          solidus, divide
          !                          exclamation point
          %                          percent sign



                         Implementation Requirements

16/3 {AI05-0286-1} An Ada implementation shall accept Ada source code in UTF-8
encoding, with or without a BOM (see A.4.11), where every character is
represented by its code point. The character pair CARRIAGE RETURN/LINE FEED
(code points 16#0D# 16#0A#) signifies a single end of line (see 2.2); every
other occurrence of a format_effector other than the character whose code
point position is 16#09# (CHARACTER TABULATION) also signifies a single end of
line.

16.a/3      Reason: {AI05-0079-1} {AI05-0286-1} This is simply requiring that
            an Ada implementation be able to directly process the ACATS, which
            is provided in the described format. Note that files that only
            contain characters with code points in the first 128 (which is the
            majority of the ACATS) are represented in the same way in both
            UTF-8 and in "plain" string format. The ACATS includes a BOM in
            files that have any characters with code points greater than 127.
            Note that the BOM contains characters not legal in Ada source
            code, so an implementation can use that to automatically
            distinguish between files formatted as plain Latin-1 strings and
            UTF-8 with BOM.

16.b/3      We allow line endings to be both represented as the pair CR LF (as
            in Windows and the ACATS), and as single format_effector
            characters (usually LF, as in Linux), in order that files created
            by standard tools on most operating systems will meet the standard
            format. We specify how many line endings each represent so that
            compilers use the same line numbering for standard source files.
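
The line-numbering rule of paragraph 16/3 can be sketched as follows in Python. The function names are hypothetical, and the sketch assumes the source has arrived as UTF-8 bytes; the `utf-8-sig` codec is used so that a leading BOM, if present, is dropped.

```python
import unicodedata


def decode_source(data):
    """Decode UTF-8 source bytes, with or without a leading BOM."""
    return data.decode("utf-8-sig")


def ends_line(ch):
    # Every format_effector except CHARACTER TABULATION (16#09#).
    return (ch in "\n\x0b\x0c\r\x85"
            or unicodedata.category(ch) in ("Zl", "Zp"))


def split_source_lines(text):
    """Split text into lines; CR LF counts as a single end of line."""
    lines, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\r" and text[i + 1:i + 2] == "\n":
            lines.append("".join(current))
            current = []
            i += 2          # consume the CR LF pair as one end of line
        elif ends_line(ch):
            lines.append("".join(current))
            current = []
            i += 1          # any other format_effector ends one line
        else:
            current.append(ch)
            i += 1
    if current:
        lines.append("".join(current))
    return lines
```

Under this sketch a LINE TABULATION immediately followed by a FORM FEED yields an empty line between them, which is what "every other occurrence ... signifies a single end of line" implies.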

16.c/3      This requirement increases portability by having a format that is
            accepted by all Ada compilers. Note that implementations can
            support other source representations, including structured
            representations like a parse tree.


                         Implementation Permissions

17/3 {AI95-00285-01} {AI05-0266-1} The categories defined above, as well as
case mapping and folding, may be based on an implementation-defined version of
ISO/IEC 10646 (2003 edition or later).

17.b/3      Ramification: The exact categories, case mapping, and case folding
            chosen affect identifiers, the result of '[[Wide_]Wide_]Image,
            and packages Wide_Characters.Handling and
            Wide_Wide_Characters.Handling.

17.c/3      Discussion: This permission allows implementations to upgrade to
            using a newer character set standard whenever that makes sense,
            rather than having to wait for the next Ada Standard. But the
            character set standard used cannot be older than ISO/IEC
            10646:2003 (which is essentially similar to Unicode 4.0).

        NOTES

18/2    1  {AI95-00285-01} The characters in categories other_control,
        other_private_use, and other_surrogate are only allowed in comments.

19.a/3      This paragraph was deleted.{AI05-0286-1}


                            Extensions to Ada 83

19.b        Ada 95 allows 8-bit and 16-bit characters, as well as
            implementation-specified character sets.


                         Wording Changes from Ada 83

19.c/3      {AI95-00285-01} {AI05-0299-1} The syntax rules in this subclause
            are modified to remove the emphasis on basic characters vs.
            others. (In this day and age, there is no need to point out that
            you can write programs without using (for example) lower case
            letters.) In particular, character (representing all characters
            usable outside comments) is added, and basic_graphic_character,
            other_special_character, and basic_character are removed.
            Special_character is expanded to include Ada 83's
            other_special_character, as well as new 8-bit characters not
            present in Ada 83. Ada 2005 removes special_character altogether;
            we want to stick to ISO/IEC 10646:2003 character classifications.
            Note that the term "basic letter" is used in A.3,
            "Character Handling" to refer to letters without diacritical marks.

19.d/2      {AI95-00285-01} Character names now come from ISO/IEC 10646:2003.

19.e/2      This paragraph was deleted.{AI95-00285-01}


                            Extensions to Ada 95

19.f/2      {AI95-00285-01} {AI95-00395-01} Program text can use most
            characters defined by ISO/IEC 10646:2003. This subclause has been
            rewritten to use the categories defined in that Standard. This
            should ease programming in languages other than English.


                        Inconsistencies With Ada 2005

19.g/3      {AI05-0299-1} {AI05-0266-1} An implementation is allowed (but not
            required) to use a newer character set standard to determine the
            categories, case mapping, and case folding. Doing so will change
            the results of attributes '[[Wide_]Wide_]Image and the packages
            [Wide_]Wide_Characters.Handling in the case of a few rarely used
            characters. (This also could make some identifiers illegal, for
            characters that are no longer classified as letters.) This is
            unlikely to be a problem in practice. Moreover, truly portable Ada
            2012 programs should avoid using in these contexts any characters
            that would have different classifications in any character set
            standards issued since 10646:2003 (since the compiler can use any
            such standard as the basis for its classifications).


                        Wording Changes from Ada 2005

19.h/3      {AI05-0079-1} Correction: Clarified that only characters in the
            categories defined here are allowed in the source of an Ada
            program. This was clear in Ada 95, but Amendment 1 dropped the
            wording instead of correcting it.

19.i/3      {AI05-0286-1} A standard source representation is defined that all
            compilers are expected to process. Since this is the same format
            as the ACATS, it seems unlikely that there are any implementations
            that don't meet this requirement. Moreover, other representations
            are still permitted, and the "impossible or impractical" loophole
            (see 1.1.3) can be invoked for any implementations that cannot
            directly process the ACATS.


2.2 Lexical Elements, Separators, and Delimiters



                              Static Semantics

1   The text of a program consists of the texts of one or more compilations.
The text of each compilation is a sequence of separate lexical elements. Each
lexical element is formed from a sequence of characters, and is either a
delimiter, an identifier, a reserved word, a numeric_literal, a
character_literal, a string_literal, or a comment. The meaning of a program
depends only on the particular sequences of lexical elements that form its
compilations, excluding comments.

2/3 {AI95-00285-01} {AI05-0262-1} The text of a compilation is divided into
lines. In general, the representation for an end of line is implementation
defined. However, a sequence of one or more format_effectors other than the
character whose code point is 16#09# (CHARACTER TABULATION) signifies at least
one end of line.

2.a         Implementation defined: The representation for an end of line.

3/2 {AI95-00285-01} [In some cases an explicit separator is required to
separate adjacent lexical elements.] A separator is any of a separator_space,
a format_effector, or the end of a line, as follows:

4/2   * {AI95-00285-01} A separator_space is a separator except within a
        comment, a string_literal, or a character_literal.

5/3   * {AI95-00285-01} {AI05-0262-1} The character whose code point is 16#09#
        (CHARACTER TABULATION) is a separator except within a comment.

6     * The end of a line is always a separator.

7   One or more separators are allowed between any two adjacent lexical
elements, before the first of each compilation, or after the last. At least
one separator is required between an identifier, a reserved word, or a
numeric_literal and an adjacent identifier, reserved word, or
numeric_literal.

7.1/3 {AI05-0079-1} One or more other_format characters are allowed anywhere
that a separator is[; any such characters have no effect on the meaning of an
Ada program].

8/2 {AI95-00285-01} A delimiter is either one of the following characters:

9       &    '    (    )    *    +    ,    -    .    /    :    ;    <    =    >    |

10  or one of the following compound delimiters each composed of two adjacent
special characters

11      =>    ..    **    :=    /=    >=    <=    <<    >>    <>

12  Each of the special characters listed for single character delimiters is a
single delimiter except if this character is used as a character of a compound
delimiter, or as a character of a comment, string_literal, character_literal,
or numeric_literal.

13  The following names are used when referring to compound delimiters:

          delimiter                  name

          =>                         arrow
          ..                         double dot
          **                         double star, exponentiate
          :=                         assignment (pronounced: "becomes")
          /=                         inequality (pronounced: "not equal")
          >=                         greater than or equal
          <=                         less than or equal
          <<                         left label bracket
          >>                         right label bracket
          <>                         box
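
A scanner typically recognizes delimiters by trying the compound forms before the single-character ones. The Python sketch below illustrates that idea; the names are hypothetical, and it deliberately ignores context, so a caller is assumed to have already excluded comments, string_literals, character_literals, and numeric_literals as paragraph 12 requires.

```python
COMPOUND_DELIMITERS = ("=>", "..", "**", ":=", "/=", ">=", "<=", "<<", ">>", "<>")
SINGLE_DELIMITERS = "&'()*+,-./:;<=>|"


def scan_delimiter(text, pos):
    """Return the delimiter starting at text[pos], or None if there is none.

    Compound delimiters are tried first so that ":=" is not reported
    as ":" followed by "=".
    """
    pair = text[pos:pos + 2]
    if pair in COMPOUND_DELIMITERS:
        return pair
    if pos < len(text) and text[pos] in SINGLE_DELIMITERS:
        return text[pos]
    return None
```

For example, `scan_delimiter("X := 1;", 2)` yields the compound delimiter `:=`, while `scan_delimiter("A<B", 1)` yields the single delimiter `<`.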

                         Implementation Requirements

14  An implementation shall support lines of at least 200 characters in
length, not counting any characters used to signify the end of a line. An
implementation shall support lexical elements of at least 200 characters in
length. The maximum supported line length and lexical element length are
implementation defined.

14.a        Implementation defined: Maximum supported line length and lexical
            element length.

14.b        Discussion: From URG recommendation.


                         Wording Changes from Ada 95

14.c/3      {AI95-00285-01} {AI05-0299-1} The wording was updated to use the
            new character categories defined in the preceding subclause.


                           Extensions to Ada 2005

14.d/3      {AI05-0079-1} Correction: Clarified that other_format characters
            are allowed anywhere that separators are allowed. This was
            intended in Ada 2005, but didn't actually make it into the
            wording.


2.3 Identifiers


1   Identifiers are used as names.


                                   Syntax

2/2     {AI95-00285-01} {AI95-00395-01} identifier ::= 
           identifier_start {identifier_start | identifier_extend}

3/2     {AI95-00285-01} {AI95-00395-01} identifier_start ::= 
             letter_uppercase
           | letter_lowercase
           | letter_titlecase
           | letter_modifier
           | letter_other
           | number_letter

3.1/3   {AI95-00285-01} {AI95-00395-01} {AI05-0091-1} identifier_extend
         ::= 
             mark_non_spacing
           | mark_spacing_combining
           | number_decimal
           | punctuation_connector

4/3     {AI95-00395-01} {AI05-0091-1} An identifier shall not contain two
        consecutive characters in category punctuation_connector, or end with
        a character in that category.

4.a/3       Reason: This rule was stated in the syntax in Ada 95, but that has
            gotten too complex in Ada 2005.
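
The syntax rules and the additional restriction on punctuation_connector can be checked together, as in the following Python sketch. The function name is hypothetical, General Category values come from the interpreter's bundled Unicode data (an assumption), and the check covers syntax only: it does not test whether the sequence is a reserved word.

```python
import unicodedata

# General Category values making up identifier_start and identifier_extend.
_START = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
_EXTEND = {"Mn", "Mc", "Nd", "Pc"}


def is_identifier_syntax(text):
    """Check the identifier syntax, ignoring the reserved-word rule."""
    if not text or unicodedata.category(text[0]) not in _START:
        return False
    previous_was_connector = False
    for ch in text[1:]:
        gc = unicodedata.category(ch)
        if gc not in _START and gc not in _EXTEND:
            return False
        is_connector = gc == "Pc"
        if is_connector and previous_was_connector:
            return False    # two consecutive punctuation_connectors
        previous_was_connector = is_connector
    return not previous_was_connector   # must not end with a connector
```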


                              Static Semantics

5/3 {AI95-00285-01} {AI05-0091-1} {AI05-0227-1} {AI05-0266-1} {AI05-0299-1}
Two identifiers are considered the same if they consist of the same sequence
of characters after applying locale-independent simple case folding, as
defined by documents referenced in the note in Clause 1 of ISO/IEC 10646:2011.

5.a/3       Discussion: {AI05-0227-1} Simple case folding is a mapping to
            lower case, so this is matching the defining (lower case) version
            of a reserved word. We could have mentioned case folding of the
            reserved words, but as that is an identity function, it would have
            no effect.

5.a.1/3     {AI05-0227-1} The "documents referenced" means Unicode. Note that
            simple case folding is supposed to be compatible between Unicode
            versions, so the Unicode version used doesn't matter.

5.3/3 {AI95-00395-01} {AI05-0091-1} {AI05-0227-1} After applying simple case
folding, an identifier shall not be identical to a reserved word.

5.b/3       Implementation Note: We match the reserved words after applying
            case folding so that the rules for identifiers and reserved words
            are the same. Since a compiler usually will lexically process
            identifiers and reserved words the same way (often with the same
            code), this will prevent a lot of headaches.

5.c/3       Ramification: {AI05-0227-1} The rules for reserved words differ in
            one way: they define case conversion on letters rather than
            sequences. This means that it is possible that there exist some
            unusual sequences that are neither identifiers nor reserved words.
            We are not aware of any such sequences so long as we use simple
            case folding (as opposed to full case folding), but we have
            defined the rules in case any are introduced in future character
            set standards. This originally was a problem when converting to
            upper case: "ıf" and "acceß" have upper case conversions of
            "IF" and "ACCESS" respectively. We would not want these to be treated
            as reserved words. But neither of these cases exist when using
            simple case folding.
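
The equivalence and reserved-word rules can be illustrated with a small Python sketch. Python's standard library offers only full case folding (`str.casefold`), so the sketch approximates simple case folding one character at a time: it uses the full folding when that is a single character, otherwise the lower-case form if that is a single character, otherwise the character itself. This approximation, all function names, and the reserved-word subset are assumptions made for illustration; a real implementation would use the simple folding data from the Unicode tables.

```python
def simple_fold_approx(ch):
    """Approximate simple case folding for one character (assumption)."""
    folded = ch.casefold()          # full case folding
    if len(folded) == 1:
        return folded
    lowered = ch.lower()
    return lowered if len(lowered) == 1 else ch


def fold_identifier(text):
    return "".join(simple_fold_approx(ch) for ch in text)


def same_identifier(a, b):
    """Two identifiers are the same if their folded forms are equal."""
    return fold_identifier(a) == fold_identifier(b)


# Illustrative subset only; the full list of reserved words is longer.
RESERVED_SUBSET = {"access", "begin", "end", "if", "is", "loop"}


def clashes_with_reserved_word(text):
    return fold_identifier(text) in RESERVED_SUBSET
```

Under this sketch the sequences whose upper-case conversions are "IF" and "ACCESS" (spelled with LATIN SMALL LETTER DOTLESS I and LATIN SMALL LETTER SHARP S) do not fold to reserved words, which matches the Ramification above.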


                         Implementation Permissions

6   In a nonstandard mode, an implementation may support other upper/lower
case equivalence rules for identifiers[, to accommodate local conventions].

6.a/3       Discussion: {AI95-00285-01} {AI05-0227-1} For instance, in most
            languages, the simple case folded equivalent of LATIN CAPITAL
            LETTER I (an upper case letter without a dot above) is LATIN SMALL
            LETTER I (a lower case letter with a dot above). In Turkish,
            though, LATIN CAPITAL LETTER I and LATIN CAPITAL LETTER I WITH DOT
            ABOVE are two distinct letters, so the case folded equivalent of
            LATIN CAPITAL LETTER I is LATIN SMALL LETTER DOTLESS I, and the
            case folded equivalent of LATIN CAPITAL LETTER I WITH DOT ABOVE is
            LATIN SMALL LETTER I. Take for instance the following identifier
            (which is the name of a city on the Tigris river in Eastern
            Anatolia):

6.b/3           DİYARBAKIR -- The first i is dotted, the second isn't.

6.c/3       A Turkish reader would expect that the above identifier is
            equivalent to:

6.d/3           diyarbakır

6.d.1/3     However, locale-independent simple case folding (and thus Ada)
            maps this to:

6.d.2/3         dİyarbakir

6.e/3       which is different from any of the following identifiers:

6.f/2           diyarbakir
                diyarbakır
                dıyarbakir
                dıyarbakır

6.f.1/3     including the "correct" matching identifier for Turkish. Upper
            case conversion (used in '[Wide_]Wide_Image) introduces additional
            problems.

6.g/3       An implementation targeting the Turkish market is allowed (in
            fact, expected) to provide a nonstandard mode where case folding
            is appropriate for Turkish.

6.j/2       Lithuanian and Azeri are two other languages that present similar
            idiosyncrasies.

        NOTES

6.1/2   2  {AI95-00285-01} Identifiers differing only in the use of
        corresponding upper and lower case letters are considered the same.


                                  Examples

7   Examples of identifiers:

8/2     {AI95-00433-01} Count      X    Get_Symbol   Ethelyn   Marion
        Snobol_4   X1   Page_Count   Store_Next_Item
        Πλάτων      -- Plato
        Чайковский  -- Tchaikovsky
        θ  φ        -- Angles


                         Wording Changes from Ada 83

8.a         We no longer include reserved words as identifiers. This is not a
            language change. In Ada 83, identifier included reserved words.
            However, this complicated several other rules (for example,
            regarding implementation-defined attributes and pragmas, etc.). We
            now explicitly allow certain reserved words for attribute
            designators, to make up for the loss.

8.b         Ramification: Because syntax rules are relevant to overload
            resolution, it means that if it looks like a reserved word, it is
            not an identifier. As a side effect, implementations cannot use
            reserved words as implementation-defined attributes or pragma
            names.


                            Extensions to Ada 95

8.c/2       {AI95-00285-01} An identifier can use any letter defined by
            ISO/IEC 10646:2003, along with several other categories. This should
            ease programming in languages other than English.


                       Incompatibilities With Ada 2005

8.d/3       {AI05-0091-1} Correction: other_format characters were removed
            from identifiers as the Unicode recommendations have changed. This
            change can only affect programs written for the original Ada 2005,
            so there should be few such programs.

8.e/3       {AI05-0227-1} Correction: We now specify simple case folding
            rather than full case folding. That potentially could change
            identifier equivalence, although it is more likely that
            identifiers that are considered the same in original Ada 2005 will
            now be considered different. This change was made because the
            original Ada 2005 definition was incompatible (and even
            inconsistent in unusual cases) with the Ada 95 identifier
            equivalence rules. As such, the Ada 2005 rules were rarely fully
            implemented, and in any case, only Ada 2005 identifiers containing
            wide characters could be affected.


2.4 Numeric Literals


1   There are two kinds of numeric_literals, real literals and integer
literals. A real literal is a numeric_literal that includes a point; an
integer literal is a numeric_literal without a point.


                                   Syntax

2       numeric_literal ::= decimal_literal | based_literal

        NOTES

3       3  The type of an integer literal is universal_integer. The type of a
        real literal is universal_real.


2.4.1 Decimal Literals


1   A decimal_literal is a numeric_literal in the conventional decimal
notation (that is, the base is ten).


                                   Syntax

2       decimal_literal ::= numeral [.numeral] [exponent]

3       numeral ::= digit {[underline] digit}

4       exponent ::= E [+] numeral | E - numeral

4.1/2   {AI95-00285-01} digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

5       An exponent for an integer literal shall not have a minus sign.

5.a         Ramification: Although this rule is in this subclause, it applies
            also to the next subclause.


                              Static Semantics

6   An underline character in a numeric_literal does not affect its meaning.
The letter E of an exponent can be written either in lower case or in upper
case, with the same meaning.

6.a         Ramification: Although these rules are in this subclause, they
            apply also to the next subclause.

7   An exponent indicates the power of ten by which the value of the
decimal_literal without the exponent is to be multiplied to obtain the value
of the decimal_literal with the exponent.


                                  Examples

8   Examples of decimal literals:

9       12        0      1E6    123_456    --  integer literals
        
        12.0      0.0    0.456  3.14159_26 --  real literals
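
            Illustration: The following sketch is not part of the Standard.
            It restates paragraphs 5 through 7 as checks on named numbers,
            assuming assertions are enabled (for instance with GNAT's -gnata
            switch). The name Decimal_Literal_Demo and the constant names are
            hypothetical.

```ada
procedure Decimal_Literal_Demo is
   Million : constant := 1E6;          --  integer literal: universal_integer
   Grouped : constant := 1_000_000;    --  underlines do not affect the value
   Lower_E : constant := 1e6;          --  lower-case e means the same as E
   Scaled  : constant := 1.5E+3;       --  real literal: 1.5 * 10**3
   Quarter : constant := 25.0E-2;      --  a minus exponent needs a real literal
   --  Illegal : constant := 25E-2;    --  integer literal with a minus exponent
begin
   pragma Assert (Million = Grouped and Grouped = Lower_E);
   pragma Assert (Scaled = 1500.0);
   pragma Assert (Quarter = 0.25);
   null;
end Decimal_Literal_Demo;
```

            All three conditions are static expressions, so a compiler may
            well evaluate them at compile time.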


                         Wording Changes from Ada 83

9.a         We have changed the syntactic category name integer to be
            numeral. We got this idea from ACID. It avoids the confusion
            between this and integers. (Other places don't offer similar
            confusions. For example, a string_literal is different from a
            string.)


2.4.2 Based Literals


1   [ A based_literal is a numeric_literal expressed in a form that specifies
the base explicitly.]


                                   Syntax

2       based_literal ::= 
           base # based_numeral [.based_numeral] # [exponent]

3       base ::= numeral

4       based_numeral ::= 
           extended_digit {[underline] extended_digit}

5       extended_digit ::= digit | A | B | C | D | E | F


                               Legality Rules

6   The base (the numeric value of the decimal numeral preceding the first #)
shall be at least two and at most sixteen. The extended_digits A through F
represent the digits ten through fifteen, respectively. The value of each
extended_digit of a based_literal shall be less than the base.


                              Static Semantics

7   The conventional meaning of based notation is assumed. An exponent
indicates the power of the base by which the value of the based_literal
without the exponent is to be multiplied to obtain the value of the
based_literal with the exponent. The base and the exponent, if any, are in
decimal notation.

8   The extended_digits A through F can be written either in lower case or in
upper case, with the same meaning.


                                  Examples

9   Examples of based literals:

10      2#1111_1111#  16#FF#       016#0ff#   --  integer literals of value 255
        16#E#E1       2#1110_0000#            --  integer literals of value 224
        16#F.FF#E+2   2#1.1111_1111_1110#E11  --  real literals of value 4095.0
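
            Illustration: The following sketch is not part of the Standard.
            It checks the values claimed in the examples above and shows, in
            comments, two literals that the Legality Rules would reject. It
            assumes assertions are enabled; the name Based_Literal_Demo and
            the constant names are hypothetical.

```ada
procedure Based_Literal_Demo is
   Bin      : constant := 2#1111_1111#;   --  255
   Hex      : constant := 16#FF#;         --  255
   Padded   : constant := 016#0ff#;       --  leading zeros, lower-case digits
   Shifted  : constant := 16#E#E1;        --  14 * 16**1 = 224
   Real_Hex : constant := 16#F.FF#E+2;    --  (15 + 255/256) * 16**2 = 4095.0
   --  Illegal : constant := 8#9#;        --  digit 9 is not less than base 8
   --  Illegal : constant := 17#10#;      --  base exceeds sixteen
begin
   pragma Assert (Bin = 255 and Hex = 255 and Padded = 255);
   pragma Assert (Shifted = 224 and Shifted = 2#1110_0000#);
   pragma Assert (Real_Hex = 4095.0);
   pragma Assert (Real_Hex = 2#1.1111_1111_1110#E11);
   null;
end Based_Literal_Demo;
```

            Note that the exponent is always written in decimal and scales by
            a power of the base, not by a power of ten.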


                         Wording Changes from Ada 83

10.a        The rule about which letters are allowed is now encoded in BNF, as
            suggested by Mike Woodger. This is clearly more readable.


2.5 Character Literals


1   [A character_literal is formed by enclosing a graphic character between
two apostrophe characters.]


                                   Syntax

2       character_literal ::= 'graphic_character'

        NOTES

3       4  A character_literal is an enumeration literal of a character type.
        See 3.5.2.
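
            Illustration: The following sketch is not part of the Standard.
            It shows that the same character_literal can denote literals of
            different character types, with the expected type selecting the
            interpretation. It assumes assertions are enabled; the name
            Character_Literal_Demo and the object names are hypothetical.

```ada
procedure Character_Literal_Demo is
   --  A character type is an enumeration type that has character_literals
   --  among its literals (this declaration mirrors an example in 3.5.2).
   type Roman_Digit is ('I', 'V', 'X', 'L', 'C', 'D', 'M');

   Fifty      : constant Roman_Digit := 'L';   --  resolved as Roman_Digit
   Letter_L   : constant Character   := 'L';   --  resolved as Character
   Apostrophe : constant Character   := ''';   --  the apostrophe itself
begin
   pragma Assert (Roman_Digit'Pos (Fifty) = 3);
   pragma Assert (Character'Pos (Letter_L) = 76);
   pragma Assert (Character'Pos (Apostrophe) = 39);
   null;
end Character_Literal_Demo;
```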


                                  Examples

4   Examples of character literals:

5/2     {AI95-00433-01} 'A'     '*'     '''     ' '
        'L'     'Л'     'Λ'    -- Various els.
        '∞'     'א'            -- Big numbers - infinity and aleph.


                         Wording Changes from Ada 83

5.a/3       {AI05-0299-1} The definitions of the values of literals are in
            Clauses 3 and 4, rather than here, since they require knowledge
            of types.


2.6 String Literals


1   [A string_literal is formed by a sequence of graphic characters (possibly
none) enclosed between two quotation marks used as string brackets. They are
used to represent operator_symbols (see 6.1), values of a string type (see
4.2), and array subaggregates (see 4.3.3). ]


                                   Syntax

2       string_literal ::= "{string_element}"

3       string_element ::= "" | non_quotation_mark_graphic_character

4       A string_element is either a pair of quotation marks (""), or a single
        graphic_character other than a quotation mark.


                              Static Semantics

5   The sequence of characters of a string_literal is formed from the sequence
of string_elements between the bracketing quotation marks, in the given order,
with a string_element that is "" becoming a single quotation mark in the
sequence of characters, and any other string_element being reproduced in the
sequence.

6   A null string literal is a string_literal with no string_elements between
the quotation marks.

        NOTES

7       5  An end of line cannot appear in a string_literal.

7.1/2   6  {AI95-00285-01} No transformation is performed on the sequence of
        characters of a string_literal.
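
            Illustration: The following sketch is not part of the Standard.
            It checks how doubled quotation marks contribute to the sequence
            of characters, and uses a string_literal as an operator_symbol.
            It assumes assertions are enabled; the name String_Literal_Demo,
            the object names, and the user-defined "+" are hypothetical.

```ada
procedure String_Literal_Demo is
   Empty  : constant String := "";             --  a null string literal
   Quote  : constant String := """";           --  one quotation mark
   Quoted : constant String := "say ""hi""";   --  the 8 characters: say "hi"

   --  A string_literal also serves as an operator_symbol (see 6.1).
   function "+" (Left, Right : String) return String is
   begin
      return Left & Right;
   end "+";
begin
   pragma Assert (Empty'Length = 0);
   pragma Assert (Quote'Length = 1 and then Quote (Quote'First) = '"');
   pragma Assert (Quoted'Length = 8);
   pragma Assert (Quote + Quote = """""");     --  two quotation marks
   null;
end String_Literal_Demo;
```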


                                  Examples

8   Examples of string literals:

9/2     {AI95-00433-01} "Message of the day:"
        
        ""                    --  a null string literal
        " "   "A"   """"      --  three string literals of length 1
        
        "Characters such as $, %, and } are allowed in string literals"
        "Archimedes said ""Εύρηκα"""
        "Volume of cylinder (πr²h) = "


                         Wording Changes from Ada 83

9.a         The wording has been changed to be strictly lexical. No mention is
            made of string or character values, since string_literals are also
            used to represent operator_symbols, which don't have a defined
            value.

9.b         The syntax is described differently.


                         Wording Changes from Ada 95

9.c/2       {AI95-00285-01} We explicitly say that the characters of a
            string_literal should be used as is. In particular, no
            normalization or folding should be performed on a string_literal.


2.7 Comments


1   A comment starts with two adjacent hyphens and extends up to the end of
the line.


                                   Syntax

2       comment ::= --{non_end_of_line_character}

3       A comment may appear on any line of a program.


                              Static Semantics

4   The presence or absence of comments has no influence on whether a program
is legal or illegal. Furthermore, comments do not influence the meaning of a
program; their sole purpose is the enlightenment of the human reader.


                                  Examples

5   Examples of comments:

6       --  the last sentence above echoes the Algol 68 report 
        
        end;  --  processing of Line is complete 
        
        --  a long comment may be split onto
        --  two or more consecutive lines   
        
        ----------------  the first two hyphens start the comment  


2.8 Pragmas


1   A pragma is a compiler directive. There are language-defined pragmas that
give instructions for optimization, listing control, etc. An implementation
may support additional (implementation-defined) pragmas.


                         Language Design Principles

1.a/3       {AI05-0100-1} {AI05-0163-1} In general, if all pragmas are treated
            as unrecognized pragmas, the program should remain both
            syntactically and semantically legal. There are a few exceptions
            to this general principle (for example, pragma Import can
            eliminate the need for a completion), but the principle remains,
            and is strictly true at the syntactic level. Certainly any
            implementation-defined pragmas should obey this principle both
            syntactically and semantically, so that if the pragmas are not
            recognized by some other implementation, the program will remain
            legal.


                                   Syntax

2       pragma ::= 
           pragma identifier [(pragma_argument_association
           {, pragma_argument_association})];

3/3     {AI05-0290-1} pragma_argument_association ::= 
             [pragma_argument_identifier =>] name
           | [pragma_argument_identifier =>] expression
           | pragma_argument_aspect_mark =>  name
           | pragma_argument_aspect_mark =>  expression

4/3     {AI05-0290-1} In a pragma, any pragma_argument_associations without a
        pragma_argument_identifier or pragma_argument_aspect_mark shall
        precede any associations with a pragma_argument_identifier or
        pragma_argument_aspect_mark.

5       Pragmas are only allowed at the following places in a program:

6         * After a semicolon delimiter, but not within a formal_part or
            discriminant_part.

7/3       * {AI05-0100-1} {AI05-0163-1} At any place where the syntax rules
            allow a construct defined by a syntactic category whose name ends
            with "declaration", "item", "statement", "clause", or
            "alternative", or one of the syntactic categories variant or
            exception_handler; but not in place of such a construct if the
            construct is required, or is part of a list that is required to
            have at least one such construct.

7.1/3     * {AI05-0163-1} In place of a statement in a
            sequence_of_statements.

7.2/3     * {AI05-0100-1} At any place where a compilation_unit is allowed.

8       Additional syntax rules and placement restrictions exist for specific
        pragmas.

8.a         Discussion: The above rule is written in text, rather than in BNF;
            the syntactic category pragma is not used in any BNF syntax rule.

8.b         Ramification: A pragma is allowed where a
            generic_formal_parameter_declaration is allowed.
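
            Illustration: The following sketch is not part of the Standard.
            It places language-defined pragmas at several of the positions
            listed in paragraphs 6 through 7.2/3, and shows one illegal
            placement as a comment. The name Pragma_Placement_Demo is
            hypothetical, and the choice of pragmas is arbitrary.

```ada
pragma Page;                        --  7.2/3: where a compilation_unit is allowed
procedure Pragma_Placement_Demo is
   pragma Optimize (Time);          --  7/3: where a declaration is allowed
   Count : Natural := 0;
   pragma List (Off);               --  6: after a semicolon delimiter
   pragma List (On);
   --  Illegal per 6 (a pragma within a formal_part):
   --  procedure P (X : Integer; pragma Page; Y : Integer);
begin
   Count := Count + 1;
   pragma Assert (Count = 1);       --  7/3: where a statement is allowed
   if Count > 1 then
      pragma Page;                  --  7.1/3: in place of the only statement
   end if;
end Pragma_Placement_Demo;
```

            The last placement relies on the Ada 2012 rule in 7.1/3; earlier
            versions of the language would need a null statement there.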

9   The name of a pragma is the identifier following the reserved word pragma.
The name or expression of a pragma_argument_association is a pragma argument.

9.a/2       To be honest: {AI95-00284-02} For compatibility with Ada 83, the
            name of a pragma may also be "interface", which is not an
            identifier (because it is a reserved word). See J.12.

10/3 {AI05-0272-1} An identifier specific to a pragma is an identifier or
reserved word that is used in a pragma argument with special meaning for that
pragma.

10.a        To be honest: Whenever the syntax rules for a given pragma allow
            "identifier" as an argument of the pragma, that identifier is an
            identifier specific to that pragma.

10.b/3      {AI05-0272-1} In a few cases, a reserved word is allowed as "an
            identifier specific to a pragma". Even in these cases, the syntax
            still is written as identifier (the reserved word(s) are not
            shown). For example, the restriction No_Use_Of_Attribute (see
            13.12.1) allows the reserved words which can be attribute
            designators, but the syntax for a restriction does not include
            these reserved words.


                              Static Semantics

11  If an implementation does not recognize the name of a pragma, then it has
no effect on the semantics of the program. Inside such a pragma, the only
rules that apply are the Syntax Rules.

11.a        To be honest: This rule takes precedence over any other rules that
            imply otherwise.

11.b        Ramification: Note well: this rule applies only to pragmas whose
            name is not recognized. If anything else is wrong with a pragma
            (at compile time), the pragma is illegal. This is true whether the
            pragma is language defined or implementation defined.

11.c        For example, an expression in an unrecognized pragma does not
            cause freezing, even though the rules in 13.14, "Freezing Rules"
            say it does; the above rule overrules those other rules. On the
            other hand, an expression in a recognized pragma causes freezing,
            even if this makes something illegal.

11.d        For another example, an expression that would be ambiguous is not
            illegal if it is inside an unrecognized pragma.

11.e        Note, however, that implementations have to recognize pragma
            Inline(Foo) and freeze things accordingly, even if they choose to
            never do inlining.

11.f        Obviously, the contradiction needs to be resolved one way or the
            other. There are two reasons for resolving it this way. First,
            the implementation is simple - the compiler can just ignore the
            pragma altogether. Second, the interpretation of constructs
            appearing inside implementation-defined pragmas is implementation
            defined. For example: "pragma Mumble(X);". If the current
            implementation has never heard of Mumble, then it doesn't know
            whether X is a name, an expression, or an identifier specific to
            the pragma Mumble.

11.g        To be honest: The syntax of individual pragmas overrides the
            general syntax for pragma.

11.h        Ramification: Thus, an identifier specific to a pragma is not a
            name, syntactically; if it were, the visibility rules would be
            invoked, which is not what we want.

11.i/3      {AI05-0229-1} This also implies that named associations do not
            allow one to give the arguments in an arbitrary order - the order
            given in the syntax rule for each individual pragma must be
            obeyed. However, it is generally possible to leave out earlier
            arguments when later ones are given; for example, this is allowed
            by the syntax rule for pragma Import (see J.15.5,
            "Interfacing Pragmas"). As for subprogram calls, positional
            notation precedes named notation.

11.j        Note that Ada 83 had no pragmas for which the order of named
            associations mattered, since there was never more than one
            argument that allowed named associations.

11.k        To be honest: The interpretation of the arguments of
            implementation-defined pragmas is implementation defined. However,
            the syntax rules have to be obeyed.
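
            Illustration: The following sketch is not part of the Standard.
            It assumes an implementation that does not recognize a pragma
            named Mumble (the name is borrowed from 11.f; No_Such_Entity and
            Unrecognized_Pragma_Demo are likewise hypothetical). Under
            paragraph 11, both pragmas should draw a warning and otherwise
            have no effect, even though the second argument denotes nothing.

```ada
procedure Unrecognized_Pragma_Demo is
   X : Integer := 0;
   pragma Mumble (X);                --  unrecognized name: no semantic effect
   pragma Mumble (No_Such_Entity);   --  not illegal: only Syntax Rules apply
begin
   X := X + 1;
   pragma Assert (X = 1);            --  recognized pragma: normal rules apply
end Unrecognized_Pragma_Demo;
```

            An implementation mode that treats unrecognized pragmas as errors
            (see 13.a) would reject this unit instead.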


                              Dynamic Semantics

12  Any pragma that appears at the place of an executable construct is
executed. Unless otherwise specified for a particular pragma, this execution
consists of the evaluation of each evaluable pragma argument in an arbitrary
order.

12.a        Ramification: For a pragma that appears at the place of an
            elaborable construct, execution is elaboration.

12.b        An identifier specific to a pragma is neither a name nor an
            expression - such identifiers are not evaluated (unless an
            implementation defines them to be evaluated in the case of an
            implementation-defined pragma).

12.c        The "unless otherwise specified" part allows us (and
            implementations) to make exceptions, so a pragma can contain an
            expression that is not evaluated. Note that pragmas in
            type_definitions may contain expressions that depend on
            discriminants.

12.d        When we wish to define a pragma with some run-time effect, we
            usually make sure that it appears in an executable context;
            otherwise, special rules are needed to define the run-time effect
            and when it happens.


                         Implementation Requirements

13  The implementation shall give a warning message for an unrecognized pragma
name.

13.a        Ramification: An implementation is also allowed to have modes in
            which a warning message is suppressed, or in which the presence of
            an unrecognized pragma is a compile-time error.


                         Implementation Permissions

14  An implementation may provide implementation-defined pragmas; the name of
an implementation-defined pragma shall differ from those of the
language-defined pragmas.

14.a        Implementation defined: Implementation-defined pragmas.

14.b        Ramification: The semantics of implementation-defined pragmas, and
            any associated rules (such as restrictions on their placement or
            arguments), are, of course, implementation defined.
            Implementation-defined pragmas may have run-time effects.

15  An implementation may ignore an unrecognized pragma even if it violates
some of the Syntax Rules, if detecting the syntax error is too complex.

15.a        Reason: Many compilers use extra post-parsing checks to enforce
            the syntax rules, since the Ada syntax rules are not LR(k) (for
            any k). (The grammar is ambiguous, in fact.) This paragraph allows
            them to ignore an unrecognized pragma, without having to perform
            such post-parsing checks.


                            Implementation Advice

16/3 {AI05-0163-1} Normally, implementation-defined pragmas should have no
semantic effect for error-free programs; that is, if the
implementation-defined pragmas in a working program are replaced with
unrecognized pragmas, the program should still be legal, and should still have
the same semantics.

16.a.1/2    Implementation Advice: Implementation-defined pragmas should have
            no semantic effect for error-free programs.

16.a        Ramification: Note that "semantics" is not the same as "effect;"
            as explained in 1.1.3, the semantics defines a set of possible
            effects.

16.b        Note that adding a pragma to a program might cause an error
            (either at compile time or at run time). On the other hand, if the
            language-specified semantics for a feature are in part
            implementation defined, it makes sense to support pragmas that
            control the feature, and that have real semantics; thus, this
            paragraph is merely a recommendation.

17  Normally, an implementation should not define pragmas that can make an
illegal program legal, except as follows:

18/3   * {AI05-0229-1} A pragma used to complete a declaration;

18.a/3      Discussion: {AI05-0229-1} There are no language-defined pragmas
            which can be completions; pragma Import was defined this way in
            Ada 95 and Ada 2005, but in Ada 2012 pragma Import just sets
            aspect Import which disallows having any completion.

19    * A pragma used to configure the environment by adding, removing, or
        replacing library_items.

19.a.1/2    Implementation Advice: Implementation-defined pragmas should not
            make an illegal program legal, unless they complete a declaration
            or configure the library_items in an environment.

19.a        Ramification: For example, it is OK to support Interface,
            System_Name, Storage_Unit, and Memory_Size pragmas for upward
            compatibility reasons, even though all of these pragmas can make
            an illegal program legal. (The latter three can affect legality in
            a rather subtle way: They affect the value of named numbers in
            System, and can therefore affect the legality in cases where
            static expressions are required.)

19.b        On the other hand, adding implementation-defined pragmas to a
            legal program can make it illegal. For example, a common kind of
            implementation-defined pragma is one that promises some property
            that allows more efficient code to be generated. If the promise is
            a lie, it is best if the user gets an error message.


                        Incompatibilities With Ada 83

19.c        In Ada 83, "bad" pragmas are ignored. In Ada 95, they are illegal,
            except in the case where the name of the pragma itself is not
            recognized by the implementation.


                            Extensions to Ada 83

19.d        Implementation-defined pragmas may affect the legality of a
            program.


                         Wording Changes from Ada 83

19.e        Implementation-defined pragmas may affect the run-time semantics
            of the program. This was always true in Ada 83 (since it was not
            explicitly forbidden by RM83), but it was not clear, because there
            was no definition of "executing" or "elaborating" a pragma.


                           Extensions to Ada 2005

19.f/3      {AI05-0163-1} Correction: Allow pragmas in place of a statement,
            even if there are no other statements in a
            sequence_of_statements.

19.g/3      {AI05-0272-1} Identifiers specific to a pragma can be reserved
            words.

19.h/3      {AI05-0290-1} Pragma arguments can be identified with
            aspect_marks; this allows identifier'Class in this context. As
            usual, this is only allowed if specifically allowed by a
            particular pragma.


                        Wording Changes from Ada 2005

19.i/3      {AI05-0100-1} Correction: Clarified where pragmas are (and are
            not) allowed.


                                   Syntax

20      The forms of List, Page, and Optimize pragmas are as follows:

21        pragma List(identifier);

22        pragma Page;

23        pragma Optimize(identifier);

24      [Other pragmas are defined throughout this International Standard, and
        are summarized in Annex L.]

24.a        Ramification: The language-defined pragmas are supported by every
            implementation, although "supporting" some of them (for example,
            Inline) requires nothing more than checking the arguments, since
            they act only as advice to the implementation.


                              Static Semantics

25  A pragma List takes one of the identifiers On or Off as the single
argument. This pragma is allowed anywhere a pragma is allowed. It specifies
that listing of the compilation is to be continued or suspended until a List
pragma with the opposite argument is given within the same compilation. The
pragma itself is always listed if the compiler is producing a listing.

26  A pragma Page is allowed anywhere a pragma is allowed. It specifies that
the program text which follows the pragma should start on a new page (if the
compiler is currently producing a listing).

27  A pragma Optimize takes one of the identifiers Time, Space, or Off as the
single argument. This pragma is allowed anywhere a pragma is allowed, and it
applies until the end of the immediately enclosing declarative region, or for
a pragma at the place of a compilation_unit, to the end of the compilation. It
gives advice to the implementation as to whether time or space is the primary
optimization criterion, or that optional optimizations should be turned off.
[It is implementation defined how this advice is followed.]

27.a        Implementation defined: Effect of pragma Optimize.

27.b        Discussion: For example, a compiler might use Time vs. Space to
            control whether generic instantiations are implemented with a
            macro-expansion model, versus a shared-generic-body model.

27.c        We don't define what constitutes an "optimization" - in fact, it
            cannot be formally defined in the context of Ada. One compiler
            might call something an optional optimization, whereas another
            compiler might consider that same thing to be a normal part of
            code generation. Thus, the programmer cannot rely on this pragma
            having any particular portable effect on the generated code. Some
            compilers might even ignore the pragma altogether.


                                  Examples

28  Examples of pragmas:

29/3    {AI95-00433-01} {AI05-0229-1}
        pragma List(Off); -- turn off listing generation
        pragma Optimize(Off); -- turn off optional optimizations
        pragma Pure(Rational_Numbers); -- set categorization for package
        pragma Assert(Exists(File_Name),
                      Message => "Nonexistent file"); -- assert file exists
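
            Illustration: The following sketch is not part of the Standard.
            It combines the List, Page, and Optimize pragmas in one unit to
            show the regions over which each applies. How (and whether) the
            Optimize advice is followed is implementation defined. The names
            Listing_Control_Demo, Sum_To_Ten, Total, and Hidden are
            hypothetical, and assertions are assumed to be enabled.

```ada
procedure Listing_Control_Demo is
   pragma Optimize (Space);     --  advice for this declarative region
   Total : Natural := 0;

   pragma List (Off);           --  suspend the listing from here on
   Hidden : constant Natural := 42;
   pragma List (On);            --  resume the listing

   procedure Sum_To_Ten is
      pragma Optimize (Time);   --  applies only within Sum_To_Ten
   begin
      for I in 1 .. 10 loop
         Total := Total + I;
      end loop;
   end Sum_To_Ten;
begin
   Sum_To_Ten;
   pragma Page;                 --  start a new page, if a listing is produced
   pragma Assert (Total = 55 and Hidden = 42);
end Listing_Control_Demo;
```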


                            Extensions to Ada 83

29.a        The Optimize pragma now allows the identifier Off to request that
            normal optimization be turned off.

29.b        An Optimize pragma may appear anywhere pragmas are allowed.


                         Wording Changes from Ada 83

29.c        We now describe the pragmas Page, List, and Optimize here, to act
            as examples, and to remove the normative material from Annex L,
            "Language-Defined Pragmas", so it can be entirely an informative
            annex.


                         Wording Changes from Ada 95

29.d/2      {AI95-00433-01} Updated the example of named pragma parameters,
            because the second parameter of pragma Suppress is obsolescent.


                        Wording Changes from Ada 2005

29.e/3      {AI05-0229-1} Updated the example of pragmas, because both
            pragmas Inline and Import are obsolescent.


2.9 Reserved Words



                                   Syntax

1/1     This paragraph was deleted.

2/3     {AI95-00284-02} {AI95-00395-01} {AI05-0091-1} The following are the
        reserved words. Within a program, some or all of the letters of a
        reserved word may be in upper case.

2.a         Discussion: Reserved words have special meaning in the syntax. In
            addition, certain reserved words are used as attribute names.

2.b         The syntactic category identifier no longer allows reserved words.
            We have added the few reserved words that are legal explicitly to
            the syntax for attribute_reference. Allowing identifier to include
            reserved words has been a source of confusion for some users, and
            differs from the way they are treated in the C and Pascal language
            definitions.

            abort
            abs
            abstract
            accept
            access
            aliased
            all
            and
            array
            at

            begin
            body

            case
            constant

            declare
            delay
            delta
            digits
            do

            else
            elsif
            end
            entry
            exception
            exit

            for
            function

            generic
            goto

            if
            in
            interface
            is

            limited
            loop

            mod

            new
            not
            null

            of
            or
            others
            out
            overriding

            package
            pragma
            private
            procedure
            protected

            raise
            range
            record
            rem
            renames
            requeue
            return
            reverse

            select
            separate
            some
            subtype
            synchronized

            tagged
            task
            terminate
            then
            type

            until
            use

            when
            while
            with

            xor

        NOTES

3       7  The reserved words appear in lower case boldface in this
        International Standard, except when used in the designator of an
        attribute (see 4.1.4). Lower case boldface is also used for a reserved
        word in a string_literal used as an operator_symbol. This is merely a
        convention - programs may be written in whatever typeface is desired
        and available.
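
            Illustration: The following sketch is not part of the Standard.
            It writes reserved words in mixed case, uses two of them as
            attribute designators, and shows in comments two declarations
            that became illegal when words were reserved. It assumes
            assertions are enabled; the names Reserved_Word_Demo, Int_Ptr,
            Value, and Ptr are hypothetical.

```ada
PROCEDURE Reserved_Word_Demo IS
   TYPE Int_Ptr IS ACCESS ALL Integer;
   Value : ALIASED Integer := 3;
   Ptr   : CONSTANT Int_Ptr := Value'Access;   --  designator Access
   --  Some      : Integer;   --  illegal: "some" is reserved in Ada 2012
   --  Interface : Integer;   --  illegal: "interface" is reserved since Ada 2005
BeGiN
   pragma Assert (Ptr.ALL = 3);
   pragma Assert (Float'Digits > 0);           --  designator Digits
   NULL;
End Reserved_Word_Demo;
```

            The unusual casing is deliberate; it has no effect on the meaning
            of the unit.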


                        Incompatibilities With Ada 83

3.a         The following words are not reserved in Ada 83, but are reserved
            in Ada 95: abstract, aliased, protected, requeue, tagged, until.


                         Wording Changes from Ada 83

3.b/3       {AI05-0299-1} The subclause entitled "Allowed Replacements of
            Characters" has been moved to Annex J, "Obsolescent Features".


                        Incompatibilities With Ada 95

3.c/2       {AI95-00284-02} The following words are not reserved in Ada 95,
            but are reserved in Ada 2005: interface, overriding, synchronized.
            A special allowance is made for pragma Interface (see J.12). Uses
            of these words as identifiers will need to be changed, but we do
            not expect them to be common.


                         Wording Changes from Ada 95

3.d/2       {AI95-00395-01} The definition of upper case equivalence has been
            modified to allow identifiers using all of the characters of ISO
            10646. This change has no effect on the character sequences that
            are reserved words, but does make some unusual sequences of
            characters illegal.


                       Incompatibilities With Ada 2005

3.e/3       {AI05-0091-1} Correction: Removed other_format characters from
            reserved words in order to be compatible with the latest Unicode
            recommendations. This change can only affect programs written for
            original Ada 2005, and there is little reason to put other_format
            characters into reserved words in the first place, so there should
            be very few such programs.

3.f/3       {AI05-0176-1} The following word is not reserved in Ada 2005, but
            is reserved in Ada 2012: some. Uses of this word as an identifier
            will need to be changed, but we do not expect them to be common.