/usr/share/doc/ada-reference-manual-2005/aarm2005.txt/aa-02.TXT

Section 2: Lexical Elements

1 [The text of a program consists of the texts of one or more compilations.
The text of a compilation is a sequence of lexical elements, each composed of
characters; the rules of composition are given in this section. Pragmas, which
provide certain information for the compiler, are also described in this
section.]

2.1 Character Set

1/2 {AI95-00285-01} {AI95-00395-01} {character set} The character repertoire
for the text of an Ada program consists of the entire coding space described
by the ISO/IEC 10646:2003 Universal Multiple-Octet Coded Character Set. This
coding space is organized in planes, each plane comprising 65536
characters.{plane (character)} {character plane}

1.a/2 This paragraph was deleted.{AI95-00285-01}

1.b/2 This paragraph was deleted.{AI95-00285-01}

1.c/2 Discussion: {AI95-00285-01} It is our intent to follow the
terminology of ISO/IEC 10646:2003 where appropriate, and to remain
compatible with the character classifications defined in A.3,
"Character Handling".

Syntax

Paragraphs 2 and 3 were deleted.

3.1/2 {AI95-00285-01} {AI95-00395-01} A character is defined by this
International Standard for each cell in the coding space described by
ISO/IEC 10646:2003, regardless of whether or not ISO/IEC 10646:2003
allocates a character to that cell.

Static Semantics

4/2 {AI95-00285-01} {AI95-00395-01} The coded representation for characters is
implementation defined [(it need not be a representation defined within
ISO/IEC 10646:2003)]. A character whose relative code position in its plane is
16#FFFE# or 16#FFFF# is not allowed anywhere in the text of a program.

4.a Implementation defined: The coded representation for the text of
an Ada program.

4.b/2 Ramification: {AI95-00285-01} Note that this rule doesn't really
have much force, since the implementation can represent characters
in the source in any way it sees fit. For example, an
implementation could simply define that what seems to be an
other_private_use character is actually a representation of the
space character.

4.1/2 {AI95-00285-01} The semantics of an Ada program whose text is not in
Normalization Form KC (as defined by section 24 of ISO/IEC 10646:2003) is
implementation defined.

4.c/2 Implementation defined: The semantics of an Ada program whose text
is not in Normalization Form KC.
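
Illustration (not part of this International Standard): since 4.1/2
leaves the semantics of text that is not in Normalization Form KC
implementation defined, a tool might want to detect such text. The
following is a minimal sketch in Python, assuming the standard
unicodedata module is an acceptable stand-in for the normalization
data; its tables follow a later Unicode version than ISO/IEC
10646:2003, so results for recently added characters may differ. The
name is_nfkc is hypothetical.

```python
import unicodedata


def is_nfkc(text):
    """Return True if NFKC normalization leaves the text unchanged.

    Assumption: Python's bundled Unicode tables approximate the
    normalization data referenced by the Standard.
    """
    return unicodedata.normalize("NFKC", text) == text
```

An implementation remains free to accept, reject, or normalize text
for which such a check fails.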

5/2 {AI95-00285-01} The description of the language definition in this
International Standard uses the character properties General Category, Simple
Uppercase Mapping, Uppercase Mapping, and Special Case Condition of the
documents referenced by the note in section 1 of ISO/IEC 10646:2003. The
actual set of graphic symbols used by an implementation for the visual
representation of the text of an Ada program is not specified. {unspecified
[partial]}

6/2 {AI95-00285-01} Characters are categorized as follows:

6.a/2 Discussion: Our character classification considers that the cells
not allocated in ISO/IEC 10646:2003 are graphic characters, except
for those whose relative code position in their plane is 16#FFFE#
or 16#FFFF#. This seems to provide the best compatibility with
future versions of ISO/IEC 10646, as future characters can
already be used in Ada character and string literals.

7/2 This paragraph was deleted.{AI95-00285-01}

8/2 {AI95-00285-01} {letter_uppercase} letter_uppercase
Any character whose General Category is defined to be
"Letter, Uppercase".

9/2 {AI95-00285-01} {letter_lowercase} letter_lowercase
Any character whose General Category is defined to be
"Letter, Lowercase".

9.a/1 This paragraph was deleted.{8652/0001} {AI95-00124-01}

9.1/2 {AI95-00285-01} {letter_titlecase} letter_titlecase
Any character whose General Category is defined to be
"Letter, Titlecase".

9.2/2 {AI95-00285-01} {letter_modifier} letter_modifier
Any character whose General Category is defined to be
"Letter, Modifier".

9.3/2 {AI95-00285-01} {letter_other} letter_other
Any character whose General Category is defined to be
"Letter, Other".

9.4/2 {AI95-00285-01} {mark_non_spacing} mark_non_spacing
Any character whose General Category is defined to be "Mark,
Non-Spacing".

9.5/2 {AI95-00285-01} {mark_spacing_combining} mark_spacing_combining
Any character whose General Category is defined to be "Mark,
Spacing Combining".

10/2 {AI95-00285-01} {number_decimal} number_decimal
Any character whose General Category is defined to be
"Number, Decimal".

10.1/2 {AI95-00285-01} {number_letter} number_letter
Any character whose General Category is defined to be
"Number, Letter".

10.2/2 {AI95-00285-01} {punctuation_connector} punctuation_connector
Any character whose General Category is defined to be
"Punctuation, Connector".

10.3/2 {AI95-00285-01} {other_format} other_format
Any character whose General Category is defined to be "Other,
Format".

11/2 {AI95-00285-01} {separator_space} separator_space
Any character whose General Category is defined to be
"Separator, Space".

12/2 {AI95-00285-01} {separator_line} separator_line
Any character whose General Category is defined to be
"Separator, Line".

12.1/2 {AI95-00285-01} {separator_paragraph} separator_paragraph
Any character whose General Category is defined to be
"Separator, Paragraph".

13/2 {AI95-00285-01} {format_effector} format_effector
The characters whose code positions are 16#09# (CHARACTER
TABULATION), 16#0A# (LINE FEED), 16#0B# (LINE TABULATION),
16#0C# (FORM FEED), 16#0D# (CARRIAGE RETURN), 16#85# (NEXT
LINE), and the characters in categories separator_line and
separator_paragraph. {control character: See also format_effector}

13.a/2 Discussion: ISO/IEC 10646:2003 does not define the names of
control characters, but rather refers to the names defined by
ISO/IEC 6429:1992. These are the names that we use here.

13.1/2 {AI95-00285-01} {other_control} other_control
Any character whose General Category is defined to be "Other,
Control", and which is not defined to be a format_effector.

13.2/2 {AI95-00285-01} {other_private_use} other_private_use
Any character whose General Category is defined to be "Other,
Private Use".

13.3/2 {AI95-00285-01} {other_surrogate} other_surrogate
Any character whose General Category is defined to be "Other,
Surrogate".

14/2 {AI95-00285-01} {AI95-00395-01} {graphic_character} graphic_character
Any character that is not in the categories other_control,
other_private_use, other_surrogate, format_effector, and whose
relative code position in its plane is neither 16#FFFE# nor
16#FFFF#.
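
Illustration (not part of this International Standard): the
categories above map directly onto the two-letter General Category
codes of the Unicode character database. The following is a minimal
sketch in Python, assuming that the standard unicodedata module (which
tracks a later Unicode version than ISO/IEC 10646:2003) is close
enough for experimentation. The names GC_TO_ADA, ada_category,
is_format_effector, and is_graphic_character are hypothetical.

```python
import unicodedata

# Code positions named explicitly in the format_effector definition.
FORMAT_EFFECTOR_CODES = {0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x85}

# General Category code -> category name used in this clause.
GC_TO_ADA = {
    "Lu": "letter_uppercase", "Ll": "letter_lowercase",
    "Lt": "letter_titlecase", "Lm": "letter_modifier",
    "Lo": "letter_other", "Mn": "mark_non_spacing",
    "Mc": "mark_spacing_combining", "Nd": "number_decimal",
    "Nl": "number_letter", "Pc": "punctuation_connector",
    "Cf": "other_format", "Zs": "separator_space",
    "Zl": "separator_line", "Zp": "separator_paragraph",
    "Cc": "other_control", "Co": "other_private_use",
    "Cs": "other_surrogate",
}


def is_format_effector(ch):
    """The six listed control characters, plus separator_line/paragraph."""
    return (ord(ch) in FORMAT_EFFECTOR_CODES
            or unicodedata.category(ch) in ("Zl", "Zp"))


def ada_category(ch):
    """Name one category for ch, or None if this clause names none.

    The six listed control characters are reported as format_effector,
    because other_control excludes them.
    """
    if ord(ch) in FORMAT_EFFECTOR_CODES:
        return "format_effector"
    return GC_TO_ADA.get(unicodedata.category(ch))


def is_graphic_character(ch):
    """Apply the exclusions in the graphic_character definition."""
    if (ord(ch) & 0xFFFF) in (0xFFFE, 0xFFFF):
        return False  # relative code position 16#FFFE# or 16#FFFF#
    if is_format_effector(ch):
        return False
    return unicodedata.category(ch) not in ("Cc", "Co", "Cs")
```

Note that is_graphic_character accepts unallocated cells, in line
with the classification described in the Discussion under 6/2.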

14.a/2 This paragraph was deleted.

14.b/2 Discussion: {AI95-00285-01} We considered basing the definition of
lexical elements on Annex A of ISO/IEC TR 10176 (4th edition),
which lists the characters which should be supported in
identifiers for all programming languages, but we finally decided
against this option. Note that it is not our intent to diverge
from ISO/IEC TR 10176, except to the extent that ISO/IEC TR 10176
itself diverges from ISO/IEC 10646:2003 (which is the case at the
time of this writing [January 2005]).

14.c/2 More precisely, we intend to align strictly with ISO/IEC
10646:2003. It must be noted that ISO/IEC TR 10176 is a Technical
Report while ISO/IEC 10646:2003 is a Standard. If one has to make
a choice, one should conform with the Standard rather than with
the Technical Report. And, it turns out that one must make a
choice because there are important differences between the two:

14.d/2 * ISO/IEC TR 10176 is still based on ISO/IEC 10646:2000 while
ISO/IEC 10646:2003 has already been published for a year. We
cannot afford to delay the adoption of our amendment until
ISO/IEC TR 10176 has been revised.

14.e/2 * There are considerable differences between the two editions of
ISO/IEC 10646, notably in supporting characters beyond the BMP
(this might be significant for some languages, e.g. Korean).

14.f/2 * ISO/IEC TR 10176 does not define case conversion tables, which
are essential for a case-insensitive language like Ada. To get
case conversion tables, we would have to reference either
ISO/IEC 10646:2003 or Unicode, or we would have to invent our
own.

14.g/2 For the purpose of defining the lexical elements of the language,
we need character properties like categorization, as well as case
conversion tables. These are mentioned in ISO/IEC 10646:2003 as
useful for implementations, with a reference to Unicode.
Machine-readable tables are available on the web at URLs:

14.h/2 http://www.unicode.org/Public/4.0-Update/UnicodeData-4.0.0.txt
http://www.unicode.org/Public/4.0-Update/CaseFolding-4.0.0.txt

14.i/2 with an explanatory document found at URL:

14.j/2 http://www.unicode.org/Public/4.0-Update/UCD-4.0.0.html

14.k/2 The actual text of the standard only makes specific references to
the corresponding clauses of ISO/IEC 10646:2003, not to Unicode.

15/2 {AI95-00285-01} The following names are used when referring to certain
characters (the first name is that given in ISO/IEC 10646:2003):
{quotation mark} {number sign} {ampersand} {apostrophe} {tick}
{left parenthesis} {right parenthesis} {asterisk} {multiply} {plus sign}
{comma} {hyphen-minus} {minus} {full stop} {dot} {point} {solidus} {divide}
{colon} {semicolon} {less-than sign} {equals sign} {greater-than sign}
{low line} {underline} {vertical line} {exclamation point} {percent sign}

15.a/2 Discussion: {AI95-00285-01} {graphic symbols} {glyphs} This table
serves to show the correspondence between ISO/IEC 10646:2003 names
and the graphic symbols (glyphs) used in this International
Standard. These are the characters that play a special role in the
syntax of Ada.

graphic symbol    name

"                 quotation mark
#                 number sign
&                 ampersand
'                 apostrophe, tick
(                 left parenthesis
)                 right parenthesis
*                 asterisk, multiply
+                 plus sign
,                 comma
-                 hyphen-minus, minus
.                 full stop, dot, point
:                 colon
;                 semicolon
<                 less-than sign
=                 equals sign
>                 greater-than sign
_                 low line, underline
|                 vertical line
/                 solidus, divide
!                 exclamation point
%                 percent sign

Implementation Permissions

16/2 This paragraph was deleted.{AI95-00285-01}

NOTES

17/2 1 {AI95-00285-01} The characters in categories other_control,
other_private_use, and other_surrogate are only allowed in comments.

18 2 The language does not specify the source representation of
programs.

18.a/2 Discussion: Any source representation is valid so long as the
implementer can produce an (information-preserving) algorithm for
translating both directions between the representation and the
standard character set. (For example, every character in the
standard character set has to be representable, even if the output
devices attached to a given computer cannot print all of those
characters properly.) From a practical point of view, every
implementer will have to provide some way to process the ACATS. It
is the intent to allow source representations, such as parse
trees, that are not even linear sequences of characters. It is
also the intent to allow different fonts: reserved words might be
in bold face, and that should be irrelevant to the semantics.

Extensions to Ada 83

18.b {extensions to Ada 83} Ada 95 allows 8-bit and 16-bit characters,
as well as implementation-specified character sets.

Wording Changes from Ada 83

18.c/2 {AI95-00285-01} The syntax rules in this clause are modified to
remove the emphasis on basic characters vs. others. (In this day
and age, there is no need to point out that you can write programs
without using (for example) lower case letters.) In particular,
character (representing all characters usable outside comments) is
added, and basic_graphic_character, other_special_character, and
basic_character are removed. Special_character is expanded to
include Ada 83's other_special_character, as well as new 8-bit
characters not present in Ada 83. Ada 2005 removes
special_character altogether; we want to stick to ISO/IEC
10646:2003 character classifications. Note that the term "basic
letter" is used in A.3, "Character Handling" to refer to letters
without diacritical marks.

18.d/2 {AI95-00285-01} Character names now come from ISO/IEC 10646:2003.

18.e/2 This paragraph was deleted.{AI95-00285-01}

Extensions to Ada 95

18.f/2 {AI95-00285-01} {AI95-00395-01} {extensions to Ada 95} Program
text can use most characters defined by ISO/IEC 10646:2003. This
clause has been rewritten to use the categories defined in that
Standard. This should ease programming in languages other than
English.

2.2 Lexical Elements, Separators, and Delimiters

Static Semantics

1 {text of a program} The text of a program consists of the texts of one or
more compilations. {lexical element} {token: See lexical element} The text of
each compilation is a sequence of separate lexical elements. Each lexical
element is formed from a sequence of characters, and is either a delimiter, an
identifier, a reserved word, a numeric_literal, a character_literal, a
string_literal, or a comment. The meaning of a program depends only on the
particular sequences of lexical elements that form its compilations, excluding
comments.

2/2 {AI95-00285-01} The text of a compilation is divided into {line} lines.
{end of a line} In general, the representation for an end of line is
implementation defined. However, a sequence of one or more format_effectors
other than the character whose code position is 16#09# (CHARACTER TABULATION)
signifies at least one end of line.

2.a Implementation defined: The representation for an end of line.
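
Illustration (not part of this International Standard): the
representation of an end of line is implementation defined, so the
following Python sketch shows only one possible choice. It assumes
that every run of format_effectors other than CHARACTER TABULATION is
treated as exactly one end of line (the rule requires at least one),
which means that blank lines are collapsed. The name split_lines is
hypothetical.

```python
import re

# Format effectors other than CHARACTER TABULATION (16#09#): LINE FEED,
# LINE TABULATION, FORM FEED, CARRIAGE RETURN, NEXT LINE, and the only
# separator_line / separator_paragraph characters, U+2028 and U+2029.
_END_OF_LINE_RUN = re.compile("[\n\x0b\x0c\r\x85\u2028\u2029]+")


def split_lines(text):
    """Split program text into lines, one end of line per run.

    Assumption: collapsing a run into a single end of line is merely
    one permitted reading of "at least one end of line".
    """
    return _END_OF_LINE_RUN.split(text)
```

Under this choice a CARRIAGE RETURN followed by a LINE FEED ends one
line, and a tab never ends a line.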

3/2 {AI95-00285-01} {separator} [In some cases an explicit separator is
required to separate adjacent lexical elements.] A separator is any of a
separator_space, a format_effector, or the end of a line, as follows:

4/2 * {AI95-00285-01} A separator_space is a separator except within a
comment, a string_literal, or a character_literal.

5/2 * {AI95-00285-01} The character whose code position is 16#09# (CHARACTER
TABULATION) is a separator except within a comment.

6 * The end of a line is always a separator.

7 One or more separators are allowed between any two adjacent lexical
elements, before the first of each compilation, or after the last. At least
one separator is required between an identifier, a reserved word, or a
numeric_literal and an adjacent identifier, reserved word, or
numeric_literal.

8/2 {AI95-00285-01} {delimiter} A delimiter is either one of the following
characters:

9 & ' ( ) * + , - . / : ; < = > |

10 {compound delimiter} or one of the following compound delimiters each
composed of two adjacent special characters

11 => .. ** := /= >= <= << >> <>

12 Each of the special characters listed for single character delimiters is a
single delimiter except if this character is used as a character of a compound
delimiter, or as a character of a comment, string_literal, character_literal,
or numeric_literal.

13 The following names are used when referring to compound delimiters:

delimiter    name

=>           arrow
..           double dot
**           double star, exponentiate
:=           assignment (pronounced: "becomes")
/=           inequality (pronounced: "not equal")
>=           greater than or equal
<=           less than or equal
<<           left label bracket
>>           right label bracket
<>           box
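
Illustration (not part of this International Standard): a scanner
usually recognizes a compound delimiter by trying the two-character
forms before the single-character ones. The following is a minimal
sketch in Python, assuming the input contains no comments,
string_literals, character_literals, or numeric_literals (inside
which these characters are not delimiters). The name scan_delimiters
is hypothetical.

```python
SINGLE_DELIMITERS = set("&'()*+,-./:;<=>|")
COMPOUND_DELIMITERS = {"=>", "..", "**", ":=", "/=",
                       ">=", "<=", "<<", ">>", "<>"}


def scan_delimiters(text):
    """Return the delimiters found in text, in order of appearance.

    Assumption: text holds no comments or literals, so every special
    character seen here really is (part of) a delimiter.
    """
    found = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in COMPOUND_DELIMITERS:
            found.append(pair)  # prefer the two-character form
            i += 2
        elif text[i] in SINGLE_DELIMITERS:
            found.append(text[i])
            i += 1
        else:
            i += 1  # identifier, numeral, or separator character
    return found
```

A real lexer also has to decide from context whether an apostrophe
starts a character_literal, which this sketch does not attempt.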

Implementation Requirements

14 An implementation shall support lines of at least 200 characters in
length, not counting any characters used to signify the end of a line. An
implementation shall support lexical elements of at least 200 characters in
length. The maximum supported line length and lexical element length are
implementation defined.

14.a Implementation defined: Maximum supported line length and lexical
element length.

14.b Discussion: From URG recommendation.

Wording Changes from Ada 95

14.c/2 {AI95-00285-01} The wording was updated to use the new character
categories defined in the preceding clause.

2.3 Identifiers

1 Identifiers are used as names.

Syntax

2/2 {AI95-00285-01} {AI95-00395-01} identifier ::=
identifier_start {identifier_start | identifier_extend}

3/2 {AI95-00285-01} {AI95-00395-01} identifier_start ::=
letter_uppercase
| letter_lowercase
| letter_titlecase
| letter_modifier
| letter_other
| number_letter

3.1/2 {AI95-00285-01} {AI95-00395-01} identifier_extend ::=
mark_non_spacing
| mark_spacing_combining
| number_decimal
| punctuation_connector
| other_format

4/2 {AI95-00395-01} After eliminating the characters in category
other_format, an identifier shall not contain two consecutive
characters in category punctuation_connector, or end with a character
in that category.

4.a/2 Reason: This rule was stated in the syntax in Ada 95, but that has
gotten too complex in Ada 2005. Since other_format characters
usually do not display, we do not want to count them as separating
two underscores.
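
Illustration (not part of this International Standard): the syntax
rules for identifier together with rule 4/2 can be checked
mechanically. The following is a minimal sketch in Python, assuming
the General Category values of the standard unicodedata module (which
follow a later Unicode version than ISO/IEC 10646:2003). It does not
check for reserved words. The name is_identifier_syntax is
hypothetical.

```python
import unicodedata

# General Category codes allowed by identifier_start / identifier_extend.
IDENTIFIER_START = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
IDENTIFIER_EXTEND = {"Mn", "Mc", "Nd", "Pc", "Cf"}


def is_identifier_syntax(text):
    """Check the identifier syntax and the rule on punctuation_connector."""
    if not text or unicodedata.category(text[0]) not in IDENTIFIER_START:
        return False
    allowed = IDENTIFIER_START | IDENTIFIER_EXTEND
    if any(unicodedata.category(c) not in allowed for c in text[1:]):
        return False
    # Eliminate other_format characters before looking at connectors.
    cats = [unicodedata.category(c) for c in text
            if unicodedata.category(c) != "Cf"]
    if cats[-1] == "Pc":
        return False  # ends with a punctuation_connector
    return all(not (a == "Pc" and b == "Pc")
               for a, b in zip(cats, cats[1:]))
```

Because other_format characters are dropped first, an invisible
character placed between two underscores does not make the identifier
legal.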

Static Semantics

5/2 {AI95-00285-01} Two identifiers are considered the same if they consist of
the same sequence of characters after applying the following transformations
(in this order):

5.1/2 * {AI95-00285-01} The characters in category other_format are
eliminated.

5.2/2 * {AI95-00285-01} {AI95-00395-01} The remaining sequence of characters
is converted to upper case. {case insensitive}

5.3/2 {AI95-00395-01} After applying these transformations, an identifier
shall not be identical to a reserved word (in upper case).

5.b/2 Implementation Note: We match the reserved words after doing these
transformations so that the rules for identifiers and reserved
words are the same. (This allows other_format characters, which
usually don't display, in a reserved word without changing it to
an identifier.) Since a compiler usually will lexically process
identifiers and reserved words the same way (often with the same
code), this will prevent a lot of headaches.

5.c/2 Ramification: The rules for reserved words differ in one way: they
define case conversion on letters rather than sequences. This
means that some unusual sequences are neither identifiers nor
reserved words. For instance, "ıf" (spelled with LATIN SMALL
LETTER DOTLESS I) and "acceß" have upper case
conversions of "IF" and "ACCESS" respectively. These are not
identifiers, because the transformed values are identical to a
reserved word. But they are not reserved words, either, because
the original values do not match any reserved word as defined or
with any number of characters of the reserved word in upper case.
Thus, these odd constructions are just illegal, and should not
appear in the source of a program.
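
Illustration (not part of this International Standard): the two
transformations of 5.1/2 and 5.2/2 yield a canonical form that can be
compared directly. The following is a minimal sketch in Python,
assuming that str.upper (a locale-independent full uppercase mapping
based on Python's bundled Unicode tables) approximates the mapping
intended by the Standard. RESERVED_EXCERPT holds only two reserved
words for the sake of the example; all names are hypothetical.

```python
import unicodedata

# Two-word excerpt of the reserved words, for illustration only.
RESERVED_EXCERPT = {"IF", "ACCESS"}


def canonical_identifier(text):
    """Drop other_format characters, then convert to upper case."""
    kept = "".join(c for c in text if unicodedata.category(c) != "Cf")
    return kept.upper()


def same_identifier(left, right):
    """Two identifiers are the same if their canonical forms match."""
    return canonical_identifier(left) == canonical_identifier(right)


def clashes_with_reserved_word(text):
    """True if the canonical form is identical to a reserved word."""
    return canonical_identifier(text) in RESERVED_EXCERPT
```

With this sketch both odd spellings mentioned in the Ramification
above clash with a reserved word after conversion, although neither
is written as one.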

Implementation Permissions

6 In a nonstandard mode, an implementation may support other upper/lower
case equivalence rules for identifiers[, to accommodate local conventions].

6.a/2 Discussion: {AI95-00285-01} For instance, in most languages, the
uppercase equivalent of LATIN SMALL LETTER I (a lower case letter
with a dot above) is LATIN CAPITAL LETTER I (an upper case letter
without a dot above). In Turkish, though, LATIN SMALL LETTER I and
LATIN SMALL LETTER DOTLESS I are two distinct letters, so the
upper case equivalent of LATIN SMALL LETTER I is LATIN CAPITAL
LETTER I WITH DOT ABOVE, and the upper case equivalent of LATIN
SMALL LETTER DOTLESS I is LATIN CAPITAL LETTER I. Take for
instance the following identifier (which is the name of a city on
the Tigris river in Eastern Anatolia):

6.b/2 diyarbakır -- The first i is dotted, the second isn't.

6.c/2 Locale-independent conversion to upper case results in:

6.d/2 DIYARBAKIR -- Both Is are dotless.

6.e/2 This means that the four following sequences of characters
represent the same identifier, even though for a speaker of
Turkish they would probably be considered distinct words:

6.f/2 diyarbakir
diyarbakır
dıyarbakir
dıyarbakır

6.g/2 An implementation targeting the Turkish market is allowed (in
fact, expected) to provide a nonstandard mode where case folding
is appropriate for Turkish. This would cause the original
identifier to be converted to:

6.h/2 DİYARBAKIR -- The first I is dotted, the second isn't.

6.i/2 and the four sequences of characters shown above would represent
four distinct identifiers.

6.j/2 Lithuanian and Azeri are two other languages that present similar
idiosyncrasies.
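
Illustration (not part of this International Standard): the effect
described in this Discussion can be reproduced with a few lines of
Python. The sketch assumes that str.upper stands in for the
locale-independent conversion, and that turkish_upper is one possible
shape for the nonstandard mode; both function names and the list
SPELLINGS are hypothetical.

```python
DOTLESS_I = "\u0131"     # LATIN SMALL LETTER DOTLESS I
DOTTED_CAP_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE

# The four spellings: each i may be dotted or dotless.
SPELLINGS = [
    "diyarbakir",
    "diyarbak" + DOTLESS_I + "r",
    "d" + DOTLESS_I + "yarbakir",
    "d" + DOTLESS_I + "yarbak" + DOTLESS_I + "r",
]


def standard_upper(text):
    """Locale-independent conversion: both kinds of i become I."""
    return text.upper()


def turkish_upper(text):
    """Hypothetical Turkish mode: dotted i keeps its dot when raised."""
    raised = text.replace("i", DOTTED_CAP_I).replace(DOTLESS_I, "I")
    return raised.upper()
```

Applying standard_upper to SPELLINGS gives a single result, whereas
turkish_upper gives four different ones, matching the text above.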

NOTES

6.1/2 3 {AI95-00285-01} Identifiers differing only in the use of
corresponding upper and lower case letters are considered the same.

Examples

7 Examples of identifiers:

8/2 {AI95-00433-01} Count X Get_Symbol Ethelyn Marion
Snobol_4 X1 Page_Count Store_Next_Item
Πλάτων -- Plato
Чайковский -- Tchaikovsky
θ φ -- Angles

Wording Changes from Ada 83

8.a We no longer include reserved words as identifiers. This is not a
language change. In Ada 83, identifier included reserved words.
However, this complicated several other rules (for example,
regarding implementation-defined attributes and pragmas, etc.). We
now explicitly allow certain reserved words for attribute
designators, to make up for the loss.

8.b Ramification: Because syntax rules are relevant to overload
resolution, it means that if it looks like a reserved word, it is
not an identifier. As a side effect, implementations cannot use
reserved words as implementation-defined attributes or pragma
names.

Extensions to Ada 95

8.c/2 {AI95-00285-01} {extensions to Ada 95} An identifier can use any
letter defined by ISO/IEC 10646:2003, along with several other
categories. This should ease programming in languages other than
English.

2.4 Numeric Literals

1 {literal (numeric)} There are two kinds of numeric_literals, real literals
and integer literals. {real literal} A real literal is a numeric_literal that
includes a point; {integer literal} an integer literal is a numeric_literal
without a point.

Syntax

2 numeric_literal ::= decimal_literal | based_literal

NOTES

3 4 The type of an integer literal is universal_integer. The type of a
real literal is universal_real.

2.4.1 Decimal Literals

1 {literal (decimal)} A decimal_literal is a numeric_literal in the
conventional decimal notation (that is, the base is ten).

Syntax

2 decimal_literal ::= numeral [.numeral] [exponent]

3 numeral ::= digit {[underline] digit}

4 exponent ::= E [+] numeral | E - numeral

4.1/2 {AI95-00285-01} digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

5 An exponent for an integer literal shall not have a minus sign.

5.a Ramification: Although this rule is in this subclause, it applies
also to the next subclause.

Static Semantics

6 An underline character in a numeric_literal does not affect its meaning.
The letter E of an exponent can be written either in lower case or in upper
case, with the same meaning.

6.a Ramification: Although these rules are in this subclause, they
apply also to the next subclause.

7 An exponent indicates the power of ten by which the value of the
decimal_literal without the exponent is to be multiplied to obtain the value
of the decimal_literal with the exponent.

Examples

8 Examples of decimal literals:

9 12 0 1E6 123_456 -- integer literals

12.0 0.0 0.456 3.14159_26 -- real literals
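
Illustration (not part of this International Standard): the syntax
and the exponent rule for decimal literals are small enough to check
with a regular expression. The following is a minimal sketch in
Python, assuming that exact values are wanted (int for integer
literals, fractions.Fraction for real literals). The name
decimal_literal_value is hypothetical.

```python
import re
from fractions import Fraction

NUMERAL = r"[0-9](?:_?[0-9])*"
DECIMAL_LITERAL = re.compile(
    rf"({NUMERAL})(?:\.({NUMERAL}))?(?:[eE]([+-]?)({NUMERAL}))?"
)


def decimal_literal_value(text):
    """Return the exact value of a decimal_literal.

    Raises ValueError for text that is not a legal decimal_literal.
    """
    match = DECIMAL_LITERAL.fullmatch(text)
    if match is None:
        raise ValueError("not a decimal_literal: " + text)
    whole, frac, sign, exp = match.groups()
    whole = whole.replace("_", "")  # underlines do not affect the value
    exponent = int(exp.replace("_", "")) if exp else 0
    if frac is None:
        if sign == "-":
            raise ValueError("integer literal with a negative exponent")
        return int(whole) * 10 ** exponent
    frac = frac.replace("_", "")
    if sign == "-":
        exponent = -exponent
    mantissa = Fraction(int(whole + frac), 10 ** len(frac))
    return mantissa * Fraction(10) ** exponent
```

The universal types of the literals are not modelled here; the sketch
only computes the mathematical value.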

Wording Changes from Ada 83

9.a We have changed the syntactic category name integer to be
numeral. We got this idea from ACID. It avoids the confusion
between this and integers. (Other places don't offer similar
confusions. For example, a string_literal is different from a
string.)

2.4.2 Based Literals

1 [ {literal (based)} {binary literal} {base 2 literal} {binary (literal)}
{octal literal} {base 8 literal} {octal (literal)} {hexadecimal literal}
{base 16 literal} {hexadecimal (literal)} A based_literal is a
numeric_literal expressed in a form that specifies the base explicitly.]

Syntax

2 based_literal ::=
base # based_numeral [.based_numeral] # [exponent]

3 base ::= numeral

4 based_numeral ::=
extended_digit {[underline] extended_digit}

5 extended_digit ::= digit | A | B | C | D | E | F

Legality Rules

6 {base} The base (the numeric value of the decimal numeral preceding the
first #) shall be at least two and at most sixteen. The extended_digits A
through F represent the digits ten through fifteen, respectively. The value of
each extended_digit of a based_literal shall be less than the base.

Static Semantics

7 The conventional meaning of based notation is assumed. An exponent
indicates the power of the base by which the value of the based_literal
without the exponent is to be multiplied to obtain the value of the
based_literal with the exponent. The base and the exponent, if any, are in
decimal notation.

8 The extended_digits A through F can be written either in lower case or in
upper case, with the same meaning.

Examples

9 Examples of based literals:

10 2#1111_1111# 16#FF# 016#0ff# -- integer literals of value 255
16#E#E1 2#1110_0000# -- integer literals of value 224
16#F.FF#E+2 2#1.1111_1111_1110#E11 -- real literals of value 4095.0
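
Illustration (not part of this International Standard): the values
claimed in the examples above can be recomputed from the rules of
this subclause. The following is a minimal sketch in Python, assuming
that exact values are wanted (int for integer literals,
fractions.Fraction for real literals). The name based_literal_value
is hypothetical.

```python
import re
from fractions import Fraction

NUMERAL = r"[0-9](?:_?[0-9])*"
BASED_NUMERAL = r"[0-9A-Fa-f](?:_?[0-9A-Fa-f])*"
BASED_LITERAL = re.compile(
    rf"({NUMERAL})#({BASED_NUMERAL})(?:\.({BASED_NUMERAL}))?#"
    rf"(?:[eE]([+-]?)({NUMERAL}))?"
)


def based_literal_value(text):
    """Return the exact value of a based_literal.

    Raises ValueError for text that is not a legal based_literal.
    """
    match = BASED_LITERAL.fullmatch(text)
    if match is None:
        raise ValueError("not a based_literal: " + text)
    base_text, whole, frac, sign, exp = match.groups()
    base = int(base_text.replace("_", ""))  # the base is in decimal
    if not 2 <= base <= 16:
        raise ValueError("base shall be at least 2 and at most 16")
    exponent = int(exp.replace("_", "")) if exp else 0  # also decimal
    whole = whole.replace("_", "")
    # int(..., base) rejects any extended_digit not less than the base.
    if frac is None:
        if sign == "-":
            raise ValueError("integer literal with a negative exponent")
        return int(whole, base) * base ** exponent
    frac = frac.replace("_", "")
    if sign == "-":
        exponent = -exponent
    mantissa = Fraction(int(whole + frac, base), base ** len(frac))
    return mantissa * Fraction(base) ** exponent
```

The exponent scales by a power of the base, not of ten, which is why
16#E#E1 denotes 224.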

Wording Changes from Ada 83

10.a The rule about which letters are allowed is now encoded in BNF, as
suggested by Mike Woodger. This is clearly more readable.

2.5 Character Literals

1 [A character_literal is formed by enclosing a graphic character between
two apostrophe characters.]

Syntax

2 character_literal ::= 'graphic_character'

NOTES

3 5 A character_literal is an enumeration literal of a character type.
See 3.5.2.

Examples

4 Examples of character literals:

5/2 {AI95-00433-01} 'A' '*' ''' ' '
'L' 'Л' 'Λ' -- Various els.
'∞' 'א' -- Big numbers - infinity and aleph.

Wording Changes from Ada 83

5.a The definitions of the values of literals are in Sections 3 and 4,
rather than here, since it requires knowledge of types.

2.6 String Literals

1 [A string_literal is formed by a sequence of graphic characters (possibly
none) enclosed between two quotation marks used as string brackets. They are
used to represent operator_symbols (see 6.1), values of a string type (see
4.2), and array subaggregates (see 4.3.3).
{quoted string: See string_literal} ]

Syntax

2 string_literal ::= "{string_element}"

3 string_element ::= "" | non_quotation_mark_graphic_character

4 A string_element is either a pair of quotation marks (""), or a single
graphic_character other than a quotation mark.

Static Semantics

5 {sequence of characters (of a string_literal)} The sequence of characters
of a string_literal is formed from the sequence of string_elements between the
bracketing quotation marks, in the given order, with a string_element that is
"" becoming a single quotation mark in the sequence of characters, and any
other string_element being reproduced in the sequence.

6 {null string literal} A null string literal is a string_literal with no
string_elements between the quotation marks.
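
Illustration (not part of this International Standard): forming the
sequence of characters of a string_literal amounts to removing the
bracketing quotation marks and halving each doubled quotation mark.
The following is a minimal sketch in Python, assuming the caller has
already isolated the literal; it does not verify that every element
is a graphic_character. The name string_literal_characters is
hypothetical.

```python
def string_literal_characters(literal):
    """Return the sequence of characters of a string_literal.

    Raises ValueError if the brackets are missing or a quotation mark
    inside the literal is not doubled.
    """
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("missing bracketing quotation marks")
    body = literal[1:-1]
    chars = []
    i = 0
    while i < len(body):
        if body[i] == '"':
            if body[i + 1:i + 2] != '"':
                raise ValueError("lone quotation mark in string_literal")
            chars.append('"')  # a "" pair becomes one quotation mark
            i += 2
        else:
            chars.append(body[i])  # reproduced as is, no folding
            i += 1
    return "".join(chars)
```

A null string literal yields an empty sequence, and no case folding
or normalization is applied to the characters.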

NOTES

7 6 An end of line cannot appear in a string_literal.

7.1/2 7 {AI95-00285-01} No transformation is performed on the sequence of
characters of a string_literal.

Examples

8 Examples of string literals:

9/2 {AI95-00433-01} "Message of the day:"

"" -- a null string literal
" " "A" """" -- three string literals of length 1

"Characters such as $, %, and } are allowed in string literals"
"Archimedes said ""Εύρηκα"""
"Volume of cylinder (πr²h) = "

Wording Changes from Ada 83

9.a The wording has been changed to be strictly lexical. No mention is
made of string or character values, since string_literals are also
used to represent operator_symbols, which don't have a defined
value.

9.b The syntax is described differently.

Wording Changes from Ada 95

9.c/2 {AI95-00285-01} We explicitly say that the characters of a
string_literal should be used as is. In particular, no
normalization or folding should be performed on a string_literal.

2.7 Comments

1 A comment starts with two adjacent hyphens and extends up to the end of
the line.

Syntax

2 comment ::= --{non_end_of_line_character}

3 A comment may appear on any line of a program.

Static Semantics

4 The presence or absence of comments has no influence on whether a program
is legal or illegal. Furthermore, comments do not influence the meaning of a
program; their sole purpose is the enlightenment of the human reader.

Examples

5 Examples of comments:

6 -- the last sentence above echoes the Algol 68 report

end; -- processing of Line is complete

-- a long comment may be split onto
-- two or more consecutive lines

---------------- the first two hyphens start the comment

2.8 Pragmas

1 {Pragma} [Glossary Entry]A pragma is a compiler directive. There are
language-defined pragmas that give instructions for optimization, listing
control, etc. An implementation may support additional
(implementation-defined) pragmas.

Syntax

2 pragma ::=
pragma identifier [(pragma_argument_association
{, pragma_argument_association})];

3 pragma_argument_association ::=
[pragma_argument_identifier =>] name
| [pragma_argument_identifier =>] expression

4 In a pragma, any pragma_argument_associations without a
pragma_argument_identifier shall precede any associations with a
pragma_argument_identifier.
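
Illustration (not part of this International Standard): the ordering
rule above is easy to check once the associations of a pragma have
been parsed. The following is a minimal sketch in Python, assuming
each association is represented as a pair whose first component is
the pragma_argument_identifier, or None when there is none. This
representation and the name positional_before_named are hypothetical.

```python
def positional_before_named(associations):
    """Check that unnamed associations precede all named ones.

    associations is a sequence of (identifier_or_None, argument) pairs
    in source order.
    """
    seen_named = False
    for identifier, _argument in associations:
        if identifier is None:
            if seen_named:
                return False  # positional after a named association
        else:
            seen_named = True
    return True
```

The check covers only this general rule; the syntax of an individual
pragma may impose further restrictions on its arguments.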

5 Pragmas are only allowed at the following places in a program:

6 * After a semicolon delimiter, but not within a formal_part or
discriminant_part.

7 * At any place where the syntax rules allow a construct defined by a
syntactic category whose name ends with "declaration",
"statement", "clause", or "alternative", or one of the syntactic
categories variant or exception_handler; but not in place of such
a construct. Also at any place where a compilation_unit would be
allowed.

8 Additional syntax rules and placement restrictions exist for specific
pragmas.

8.a Discussion: The above rule is written in text, rather than in BNF;
the syntactic category pragma is not used in any BNF syntax rule.

8.b Ramification: A pragma is allowed where a
generic_formal_parameter_declaration is allowed.

9 {name (of a pragma)} {pragma name} The name of a pragma is the identifier
following the reserved word pragma. {pragma argument} {argument of a pragma}
The name or expression of a pragma_argument_association is a pragma argument.

9.a/2 To be honest: {AI95-00284-02} For compatibility with Ada 83, the
name of a pragma may also be "interface", which is not an
identifier (because it is a reserved word). See J.12.

10 {identifier specific to a pragma} {pragma, identifier specific to} An
identifier specific to a pragma is an identifier that is used in a pragma
argument with special meaning for that pragma.

10.a To be honest: Whenever the syntax rules for a given pragma allow
"identifier" as an argument of the pragma, that identifier is an
identifier specific to that pragma.

Static Semantics

11 If an implementation does not recognize the name of a pragma, then it has
no effect on the semantics of the program. Inside such a pragma, the only
rules that apply are the Syntax Rules.

11.a To be honest: This rule takes precedence over any other rules that
imply otherwise.

11.b Ramification: Note well: this rule applies only to pragmas whose
name is not recognized. If anything else is wrong with a pragma
(at compile time), the pragma is illegal. This is true whether the
pragma is language defined or implementation defined.

11.c For example, an expression in an unrecognized pragma does not
cause freezing, even though the rules in 13.14, "Freezing Rules"
say it does; the above rule overrules those other rules. On the
other hand, an expression in a recognized pragma causes freezing,
even if this makes something illegal.

11.d For another example, an expression that would be ambiguous is not
illegal if it is inside an unrecognized pragma.

11.e Note, however, that implementations have to recognize pragma
Inline(Foo) and freeze things accordingly, even if they choose to
never do inlining.

11.f Obviously, the contradiction needs to be resolved one way or the
other. The reasons for resolving it this way are:

* The implementation is simple - the compiler can just ignore the
pragma altogether.

* The interpretation of constructs appearing inside
implementation-defined pragmas is implementation defined. For
example: "pragma Mumble(X);". If the current implementation has
never heard of Mumble, then it doesn't know whether X is a name,
an expression, or an identifier specific to the pragma Mumble.
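
As an illustration of paragraph 11 and the "pragma Mumble(X);" case
above, the following minimal sketch places an unrecognized pragma in a
declarative part. It assumes that the compiler at hand does not
recognize the name Mumble; the procedure name and the variable X are
chosen only for this illustration. Under that assumption the pragma
should produce a warning (see paragraph 13) and otherwise leave the
program's semantics unchanged.

```ada
--  Sketch only: Mumble is assumed NOT to be recognized by the compiler.
procedure Unrecognized_Pragma_Demo is
   X : Integer := 0;
   pragma Mumble (X);  --  unrecognized name: only the Syntax Rules apply
begin
   X := X + 1;         --  behaves as if the pragma were absent
end Unrecognized_Pragma_Demo;
```

Because the name is not recognized, the compiler does not need to decide
whether X is a name, an expression, or an identifier specific to the
pragma.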

11.g To be honest: The syntax of individual pragmas overrides the
general syntax for pragma.

11.h Ramification: Thus, an identifier specific to a pragma is not a
name, syntactically; if it were, the visibility rules would be
invoked, which is not what we want.

11.i This also implies that named associations do not allow one to give
the arguments in an arbitrary order - the order given in the
syntax rule for each individual pragma must be obeyed. However, it
is generally possible to leave out earlier arguments when later
ones are given; for example, this is allowed by the syntax rule
for pragma Import (see B.1, "Interfacing Pragmas"). As for
subprogram calls, positional notation precedes named notation.

11.j Note that Ada 83 had no pragmas for which the order of named
associations mattered, since there was never more than one
argument that allowed named associations.
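
The ordering rules in 11.i can be sketched with pragma Import. This is
a sketch under stated assumptions: Put_Char and the external name
"putchar" come from the example in paragraph 29/2, while Get_Char,
Status, and the procedure name are introduced here only for
illustration, and a C library providing putchar is assumed to be
linked.

```ada
with Interfaces.C;

procedure Pragma_Argument_Order_Demo is
   use type Interfaces.C.int;  --  makes ">=" on int directly visible

   --  Positional associations first, then a named one, in syntax order.
   function Put_Char (Ch : Interfaces.C.int) return Interfaces.C.int;
   pragma Import (C, Put_Char, External_Name => "putchar");

   --  External_Name is left out while the later Link_Name is given.
   --  Declared only to show the syntax; it is never called here.
   function Get_Char return Interfaces.C.int;
   pragma Import (C, Get_Char, Link_Name => "getchar");

   --  Not permitted by the syntax rule for pragma Import, because the
   --  named associations are not in the order the rule gives:
   --     pragma Import (Entity => Put_Char, Convention => C);

   Status : Interfaces.C.int;
begin
   Status := Put_Char (Character'Pos ('A'));
   if Status >= 0 then
      Status := Put_Char (10);  --  line feed
   end if;
end Pragma_Argument_Order_Demo;
```

The commented-out form is shown only to mark what the syntax rule
excludes; an individual compiler might be more lenient about it.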

11.k To be honest: The interpretation of the arguments of
implementation-defined pragmas is implementation defined. However,
the syntax rules have to be obeyed.

Dynamic Semantics

12 {execution (pragma) [partial]} {elaboration (pragma) [partial]} Any
pragma that appears at the place of an executable construct is executed.
Unless otherwise specified for a particular pragma, this execution consists of
the evaluation of each evaluable pragma argument in an arbitrary order.

12.a Ramification: For a pragma that appears at the place of an
elaborable construct, execution is elaboration.

12.b An identifier specific to a pragma is neither a name nor an
expression - such identifiers are not evaluated (unless an
implementation defines them to be evaluated in the case of an
implementation-defined pragma).

12.c The "unless otherwise specified" part allows us (and
implementations) to make exceptions, so a pragma can contain an
expression that is not evaluated. Note that pragmas in
type_definitions may contain expressions that depend on
discriminants.

12.d When we wish to define a pragma with some run-time effect, we
usually make sure that it appears in an executable context;
otherwise, special rules are needed to define the run-time effect
and when it happens.

Implementation Requirements

13 The implementation shall give a warning message for an unrecognized pragma
name.

13.a Ramification: An implementation is also allowed to have modes in
which a warning message is suppressed, or in which the presence of
an unrecognized pragma is a compile-time error.

Implementation Permissions

14 An implementation may provide implementation-defined pragmas; the name of
an implementation-defined pragma shall differ from those of the
language-defined pragmas.

14.a Implementation defined: Implementation-defined pragmas.

14.b Ramification: The semantics of implementation-defined pragmas, and
any associated rules (such as restrictions on their placement or
arguments), are, of course, implementation defined.
Implementation-defined pragmas may have run-time effects.

15 An implementation may ignore an unrecognized pragma even if it violates
some of the Syntax Rules, if detecting the syntax error is too complex.

15.a Reason: Many compilers use extra post-parsing checks to enforce
the syntax rules, since the Ada syntax rules are not LR(k) (for
any k). (The grammar is ambiguous, in fact.) This paragraph allows
them to ignore an unrecognized pragma, without having to perform
such post-parsing checks.

Implementation Advice

16 Normally, implementation-defined pragmas should have no semantic effect
for error-free programs; that is, if the implementation-defined pragmas are
removed from a working program, the program should still be legal, and should
still have the same semantics.

16.a.1/2 Implementation Advice: Implementation-defined pragmas should have
no semantic effect for error-free programs.

16.a Ramification: Note that "semantics" is not the same as
"effect"; as explained in 1.1.3, the semantics defines a set of
possible effects.

16.b Note that adding a pragma to a program might cause an error
(either at compile time or at run time). On the other hand, if the
language-specified semantics for a feature are in part
implementation defined, it makes sense to support pragmas that
control the feature, and that have real semantics; thus, this
paragraph is merely a recommendation.

17 Normally, an implementation should not define pragmas that can make an
illegal program legal, except as follows:

18 * A pragma used to complete a declaration, such as a pragma Import;

19 * A pragma used to configure the environment by adding, removing, or
replacing library_items.

19.a.1/2 Implementation Advice: Implementation-defined pragmas should not
make an illegal program legal, unless they complete a declaration
or configure the library_items in an environment.

19.a Ramification: For example, it is OK to support Interface,
System_Name, Storage_Unit, and Memory_Size pragmas for upward
compatibility reasons, even though all of these pragmas can make
an illegal program legal. (The latter three can affect legality in
a rather subtle way: They affect the value of named numbers in
System, and can therefore affect the legality in cases where
static expressions are required.)

19.b On the other hand, adding implementation-defined pragmas to a
legal program can make it illegal. For example, a common kind of
implementation-defined pragma is one that promises some property
that allows more efficient code to be generated. If the promise is
a lie, it is best if the user gets an error message.

Incompatibilities With Ada 83

19.c {incompatibilities with Ada 83} In Ada 83, "bad" pragmas are
ignored. In Ada 95, they are illegal, except in the case where the
name of the pragma itself is not recognized by the implementation.

Extensions to Ada 83

19.d {extensions to Ada 83} Implementation-defined pragmas may affect
the legality of a program.

Wording Changes from Ada 83

19.e Implementation-defined pragmas may affect the run-time semantics
of the program. This was always true in Ada 83 (since it was not
explicitly forbidden by RM83), but it was not clear, because there
was no definition of "executing" or "elaborating" a pragma.

Syntax

20 The forms of List, Page, and Optimize pragmas are as follows:

21 pragma List(identifier);

22 pragma Page;

23 pragma Optimize(identifier);

24 [Other pragmas are defined throughout this International Standard, and
are summarized in Annex L.]

24.a Ramification: The language-defined pragmas are supported by every
implementation, although "supporting" some of them (for example,
Inline) requires nothing more than checking the arguments, since
they act only as advice to the implementation.

Static Semantics

25 A pragma List takes one of the identifiers On or Off as the single
argument. This pragma is allowed anywhere a pragma is allowed. It specifies
that listing of the compilation is to be continued or suspended until a List
pragma with the opposite argument is given within the same compilation. The
pragma itself is always listed if the compiler is producing a listing.

26 A pragma Page is allowed anywhere a pragma is allowed. It specifies that
the program text which follows the pragma should start on a new page (if the
compiler is currently producing a listing).

27 A pragma Optimize takes one of the identifiers Time, Space, or Off as the
single argument. This pragma is allowed anywhere a pragma is allowed, and it
applies until the end of the immediately enclosing declarative region, or for
a pragma at the place of a compilation_unit, to the end of the compilation. It
gives advice to the implementation as to whether time or space is the primary
optimization criterion, or that optional optimizations should be turned off.
[It is implementation defined how this advice is followed.]

27.a Implementation defined: Effect of pragma Optimize.

27.b Discussion: For example, a compiler might use Time vs. Space to
control whether generic instantiations are implemented with a
macro-expansion model, versus a shared-generic-body model.

27.c We don't define what constitutes an "optimization" - in fact, it
cannot be formally defined in the context of Ada. One compiler
might call something an optional optimization, whereas another
compiler might consider that same thing to be a normal part of
code generation. Thus, the programmer cannot rely on this pragma
having any particular portable effect on the generated code. Some
compilers might even ignore the pragma altogether.
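
The placement rules in paragraphs 25 through 27 can be sketched as
follows. All names (Optimize_Scope_Demo, Checksum, Result) are
illustrative assumptions, and, as 27.c points out, no particular effect
on the generated code can be relied upon.

```ada
procedure Optimize_Scope_Demo is

   function Checksum (N : Natural) return Natural is
      pragma Optimize (Space);  --  advice applies to the end of Checksum
      Sum : Natural := 0;
   begin
      for I in 1 .. N loop
         Sum := (Sum + I) mod 65_521;
      end loop;
      return Sum;
   end Checksum;

   pragma Optimize (Time);  --  advice applies from here to the end of
                            --  Optimize_Scope_Demo
   Result : Natural;

begin
   pragma List (Off);       --  suspend the listing, if one is produced
   Result := Checksum (1_000);
   pragma List (On);        --  resume the listing
   if Result = 0 then
      null;
   end if;
end Optimize_Scope_Demo;
```

Each Optimize pragma applies only up to the end of its immediately
enclosing declarative region, so the advice given inside Checksum does
not carry over to the rest of the enclosing procedure.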

Examples

28 Examples of pragmas:

29/2 {AI95-00433-01}
    pragma List(Off);        -- turn off listing generation
    pragma Optimize(Off);    -- turn off optional optimizations
    pragma Inline(Set_Mask); -- generate code for Set_Mask inline
    pragma Import(C, Put_Char, External_Name => "putchar"); -- import C putchar function

Extensions to Ada 83

29.a {extensions to Ada 83} The Optimize pragma now allows the
identifier Off to request that normal optimization be turned off.

29.b An Optimize pragma may appear anywhere pragmas are allowed.

Wording Changes from Ada 83

29.c We now describe the pragmas Page, List, and Optimize here, to act
as examples, and to remove the normative material from Annex L,
"Language-Defined Pragmas", so it can be entirely an informative
annex.

Wording Changes from Ada 95

29.d/2 {AI95-00433-01} Updated the example of named pragma parameters,
because the second parameter of pragma Suppress is obsolescent.

2.9 Reserved Words

Syntax

1/1 This paragraph was deleted.

2/2 {AI95-00284-02} {AI95-00395-01} {reserved word} The following are the
reserved words. Within a program, some or all of the letters of a
reserved word may be in upper case, and one or more characters in
category other_format may be inserted within or at the end of the
reserved word.

2.a Discussion: Reserved words have special meaning in the syntax. In
addition, certain reserved words are used as attribute names.

2.b The syntactic category identifier no longer allows reserved words.
We have added the few reserved words that are legal explicitly to
the syntax for attribute_reference. Allowing identifier to include
reserved words has been a source of confusion for some users, and
differs from the way they are treated in the C and Pascal language
definitions.

abort
abs
abstract
accept
access
aliased
all
and
array
at

begin
body

case
constant

declare
delay
delta
digits
do

else
elsif
end
entry
exception
exit

for
function

generic
goto

if
in
interface
is

limited
loop

mod

new
not
null

of
or
others
out
overriding

package
pragma
private
procedure
protected

raise
range
record
rem
renames
requeue
return
reverse

select
separate
subtype
synchronized

tagged
task
terminate
then
type

until
use

when
while
with

xor
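
The case rule in paragraph 2/2 can be illustrated with a small sketch;
the procedure and variable names are assumptions made for this
illustration only. Every reserved word below is written with a
different mixture of upper and lower case letters, and each is still
treated as the same reserved word.

```ada
PROCEDURE Reserved_Word_Case_Demo IS
   Total : Natural := 0;
Begin
   fOr I iN 1 .. 3 LOOP
      Total := Total + I;
   End Loop;
   --  A reserved word cannot be used as an identifier, so a declaration
   --  such as the following would be illegal in Ada 2005:
   --     Overriding : Integer;
end Reserved_Word_Case_Demo;
```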

NOTES

3 8 The reserved words appear in lower case boldface in this
International Standard, except when used in the designator of an
attribute (see 4.1.4). Lower case boldface is also used for a reserved
word in a string_literal used as an operator_symbol. This is merely a
convention - programs may be written in whatever typeface is desired
and available.

Incompatibilities With Ada 83

3.a {incompatibilities with Ada 83} The following words are not
reserved in Ada 83, but are reserved in Ada 95: abstract, aliased,
protected, requeue, tagged, until.

Wording Changes from Ada 83

3.b The clause entitled "Allowed Replacements of Characters" has been
moved to Annex J, "Obsolescent Features".

Incompatibilities With Ada 95

3.c/2 {AI95-00284-02} {incompatibilities with Ada 95} The following
words are not reserved in Ada 95, but are reserved in Ada 2005:
interface, overriding, synchronized. A special allowance is made
for pragma Interface (see J.12). Uses of these words as
identifiers will need to be changed, but we do not expect them to
be common.

Wording Changes from Ada 95

3.d/2 {AI95-00395-01} The definition of upper case equivalence has been
modified to allow identifiers using all of the characters of ISO
10646. This change has no effect on the character sequences that
are reserved words, but does make some unusual sequences of
characters illegal.
