5 Notational Conventions
5.1 Syntactic and Lexical Grammars
5.1.1 Context-Free Grammars
A context-free grammar consists of a number of productions. Each production has an abstract symbol called a nonterminal as its left-hand side, and a sequence of zero or more nonterminal and terminal symbols as its right-hand side. For each grammar, the terminal symbols are drawn from a specified alphabet.
A chain production is a production that has exactly one nonterminal symbol on its right-hand side along with zero or more terminal symbols.
Starting from a sentence consisting of a single distinguished nonterminal, called the goal symbol, a given context-free grammar specifies a language, namely, the (perhaps infinite) set of possible sequences of terminal symbols that can result from repeatedly replacing any nonterminal in the sequence with a right-hand side of a production for which the nonterminal is the left-hand side.
5.1.2 The Lexical and RegExp Grammars
A lexical grammar for ECMAScript is given in clause
Input elements other than white space and comments form the terminal symbols for the syntactic grammar for ECMAScript and are called ECMAScript tokens. These tokens are the /*…*/ regardless of whether it spans more than one line) is likewise simply discarded if it contains no line terminator; but if a
A RegExp grammar for ECMAScript is given in
Productions of the lexical and RegExp grammars are distinguished by having two colons “::” as separating punctuation. The lexical and RegExp grammars share some productions.
5.1.3 The Numeric String Grammar
A numeric string grammar appears in
Productions of the numeric string grammar are distinguished by having three colons “:::” as punctuation, and are never used for parsing source text.
5.1.4 The Syntactic Grammar
The syntactic grammar for ECMAScript is given in clauses
When a stream of code points is to be parsed as an ECMAScript
When a parse is successful, it constructs a parse tree, a rooted tree structure in which each node is a Parse Node. Each Parse Node is an instance of a symbol in the grammar; it represents a span of the source text that can be derived from that symbol. The root node of the parse tree, representing the whole of the source text, is an instance of the parse's
New Parse Nodes are instantiated for each invocation of the parser and never reused between parses even of identical source text. Parse Nodes are considered the same Parse Node if and only if they represent the same span of source text, are instances of the same grammar symbol, and resulted from the same parser invocation.
Parsing the same String multiple times will lead to different Parse Nodes. For example, consider:
let str = "1 + 1;";
eval(str);
eval(str);
Each call to eval converts the value of str into
Productions of the syntactic grammar are distinguished by having just one colon “:” as punctuation.
The syntactic grammar as presented in clauses
In certain cases, in order to avoid ambiguities, the syntactic grammar uses generalized productions that permit token sequences that do not form a valid ECMAScript
- The sequence of tokens originally matched by P is parsed again using N as the
goal symbol . If N takes grammatical parameters, then they are set to the same values used when P was originally parsed. - If the sequence of tokens can be parsed as a single instance of N, with no tokens left over, then:
- We refer to that instance of N (a Parse Node, unique for a given P) as "the N that is covered by P".
- All Early Error rules for N and its derived productions also apply to the N that is covered by P.
- Otherwise (if the parse fails), it is an early Syntax Error.
5.1.5 Grammar Notation
5.1.5.1 Terminal Symbols
In the ECMAScript grammars, some terminal symbols are shown in fixed-width font. These are to appear in a source text exactly as written. All terminal symbol code points specified in this way are to be understood as the appropriate Unicode code points from the Basic Latin block, as opposed to any similar-looking code points from other Unicode ranges. A code point in a terminal symbol cannot be expressed by a \
In grammars whose terminal symbols are individual Unicode code points (i.e., the lexical, RegExp, and numeric string grammars), a contiguous run of multiple fixed-width code points appearing in a production is a simple shorthand for the same sequence of code points, written as standalone terminal symbols.
For example, the production:
is a shorthand for:
In contrast, in the syntactic grammar, a contiguous run of fixed-width code points is a single terminal symbol.
Terminal symbols come in two other forms:
- In the lexical and RegExp grammars, Unicode code points without a conventional printed representation are instead shown in the form "<ABBREV>" where "ABBREV" is a mnemonic for the code point or set of code points. These forms are defined in
Unicode Format-Control Characters ,White Space , andLine Terminators . - In the syntactic grammar, certain terminal symbols (e.g.
IdentifierName andRegularExpressionLiteral ) are shown in italics, as they refer to the nonterminals of the same name in the lexical grammar.
5.1.5.2 Nonterminal Symbols and Productions
Nonterminal symbols are shown in italic type. The definition of a nonterminal (also called a “production”) is introduced by the name of the nonterminal being defined followed by one or more colons. (The number of colons indicates to which grammar the production belongs.) One or more alternative right-hand sides for the nonterminal then follow on succeeding lines. For example, the syntactic definition:
states that the nonterminal while, followed by a left parenthesis token, followed by an
states that an
5.1.5.3 Optional Symbols
The subscripted suffix “opt”, which may appear after a terminal or nonterminal, indicates an optional symbol. The alternative containing the optional symbol actually specifies two right-hand sides, one that omits the optional element and one that includes it. This means that:
is a convenient abbreviation for:
and that:
is a convenient abbreviation for:
which in turn is an abbreviation for:
so, in this example, the nonterminal
5.1.5.4 Grammatical Parameters
A production may be parameterized by a subscripted annotation of the form “[parameters]”, which may appear as a suffix to the nonterminal symbol defined by the production. “parameters” may be either a single name or a comma separated list of names. A parameterized production is shorthand for a set of productions defining all combinations of the parameter names, preceded by an underscore, appended to the parameterized nonterminal symbol. This means that:
is a convenient abbreviation for:
and that:
is an abbreviation for:
Multiple parameters produce a combinatoric number of productions, not all of which are necessarily referenced in a complete grammar.
References to nonterminals on the right-hand side of a production can also be parameterized. For example:
is equivalent to saying:
and:
is equivalent to:
A nonterminal reference may have both a parameter list and an “opt” suffix. For example:
is an abbreviation for:
Prefixing a parameter name with “?” on a right-hand side nonterminal reference makes that parameter value dependent upon the occurrence of the parameter name on the reference to the current production's left-hand side symbol. For example:
is an abbreviation for:
If a right-hand side alternative is prefixed with “[+parameter]” that alternative is only available if the named parameter was used in referencing the production's nonterminal symbol. If a right-hand side alternative is prefixed with “[~parameter]” that alternative is only available if the named parameter was not used in referencing the production's nonterminal symbol. This means that:
is an abbreviation for:
and that:
is an abbreviation for:
5.1.5.5 one of
When the words “one of” follow the colon(s) in a grammar definition, they signify that each of the terminal symbols on the following line or lines is an alternative definition. For example, the lexical grammar for ECMAScript contains the production:
which is merely a convenient abbreviation for:
5.1.5.6 [empty]
If the phrase “[empty]” appears as the right-hand side of a production, it indicates that the production's right-hand side contains no terminals or nonterminals.
5.1.5.7 Lookahead Restrictions
If the phrase “[lookahead = seq]” appears in the right-hand side of a production, it indicates that the production may only be used if the token sequence seq is a prefix of the immediately following input token sequence. Similarly, “[lookahead ∈ set]”, where set is a
These conditions may be negated. “[lookahead ≠ seq]” indicates that the containing production may only be used if seq is not a prefix of the immediately following input token sequence, and “[lookahead ∉ set]” indicates that the production may only be used if no element of set is a prefix of the immediately following token sequence.
As an example, given the definitions:
the definition:
matches either the letter n followed by one or more decimal digits the first of which is even, or a decimal digit not followed by another decimal digit.
Note that when these phrases are used in the syntactic grammar, it may not be possible to unambiguously identify the immediately following token sequence because determining later tokens requires knowing which lexical
5.1.5.8 [no LineTerminator here]
If the phrase “[no
indicates that the production may not be used if a throw token and the
Unless the presence of a
5.1.5.9 but not
The right-hand side of a production may specify that certain expansions are not permitted by using the phrase “but not” and then indicating the expansions to be excluded. For example, the production:
means that the nonterminal
5.1.5.10 Descriptive Phrases
Finally, a few nonterminal symbols are described by a descriptive phrase in sans-serif type in cases where it would be impractical to list all the alternatives:
5.2 Algorithm Conventions
The specification often uses a numbered list to specify steps in an algorithm. These algorithms are used to precisely specify the required semantics of ECMAScript language constructs. The algorithms are not intended to imply the use of any specific implementation technique. In practice, there may be more efficient algorithms available to implement a given feature.
Algorithms may be explicitly parameterized with an ordered, comma-separated sequence of alias names which may be used within the algorithm steps to reference the argument passed in that position. Optional parameters are denoted with surrounding brackets ([ , name ]) and are no different from required parameters within algorithm steps. A rest parameter may appear at the end of a parameter list, denoted with leading ellipsis (, ...name). The rest parameter captures all of the arguments provided following the required and optional parameters into a
Algorithm steps may be subdivided into sequential substeps. Substeps are indented and may themselves be further divided into indented substeps. Outline numbering conventions are used to identify substeps with the first level of substeps labelled with lowercase alphabetic characters and the second level of substeps labelled with lowercase roman numerals. If more than three levels are required these rules repeat with the fourth level using numeric labels. For example:
- Top-level step
- Substep.
- Substep.
- Subsubstep.
- Subsubsubstep
- Subsubsubsubstep
- Subsubsubsubsubstep
- Subsubsubsubstep
- Subsubsubstep
- Subsubstep.
A step or substep may be written as an “if” predicate that conditions its substeps. In this case, the substeps are only applied if the predicate is true. If a step or substep begins with the word “else”, it is a predicate that is the negation of the preceding “if” predicate step at the same level.
A step may specify the iterative application of its substeps.
A step that begins with “Assert:” asserts an invariant condition of its algorithm. Such assertions are used to make explicit algorithmic invariants that would otherwise be implicit. Such assertions add no additional semantic requirements and hence need not be checked by an implementation. They are used simply to clarify algorithms.
Algorithm steps may declare named aliases for any value using the form “Let x be someValue”. These aliases are reference-like in that both x and someValue refer to the same underlying data and modifications to either are visible to both. Algorithm steps that want to avoid this reference-like behaviour should explicitly make a copy of the right-hand side: “Let x be a copy of someValue” creates a shallow copy of someValue.
Once declared, an alias may be referenced in any subsequent steps and must not be referenced from steps prior to the alias's declaration. Aliases may be modified using the form “Set x to someOtherValue”.
5.2.1 Abstract Operations
In order to facilitate their use in multiple parts of this specification, some algorithms, called abstract operations, are named and written in parameterized functional form so that they may be referenced by name from within other algorithms. Abstract operations are typically referenced using a functional application style such as OperationName(arg1, arg2). Some abstract operations are treated as polymorphically dispatched methods of class-like specification abstractions. Such method-like abstract operations are typically referenced using a method application style such as someValue.OperationName(arg1, arg2).
5.2.2 Syntax-Directed Operations
A syntax-directed operation is a named operation whose definition consists of algorithms, each of which is associated with one or more productions from one of the ECMAScript grammars. A production that has multiple alternative definitions will typically have a distinct algorithm for each alternative. When an algorithm is associated with a grammar production, it may reference the terminal and nonterminal symbols of the production alternative as if they were parameters of the algorithm. When used in this manner, nonterminal symbols refer to the actual alternative definition that is matched when parsing the source text. The source text matched by a grammar production or
When an algorithm is associated with a production alternative, the alternative is typically shown without any “[ ]” grammar annotations. Such annotations should only affect the syntactic recognition of the alternative and have no effect on the associated semantics for the alternative.
Syntax-directed operations are invoked with a parse node and, optionally, other parameters by using the conventions on steps
- Let status be SyntaxDirectedOperation of
SomeNonTerminal . - Let someParseNode be the parse of some source text.
- Perform SyntaxDirectedOperation of someParseNode.
- Perform SyntaxDirectedOperation of someParseNode with argument
"value" .
Unless explicitly specified otherwise, all
but the
Runtime Semantics:
- Return
Evaluation ofStatementList .
5.2.3 Runtime Semantics
Algorithms which specify semantics that must be called at runtime are called runtime semantics. Runtime semantics are defined by
5.2.3.1 Completion ( completionRecord )
The abstract operation Completion takes argument completionRecord (a
Assert : completionRecord is aCompletion Record .- Return completionRecord.
5.2.3.2 Throw an Exception
Algorithms steps that say to throw an exception, such as
- Throw a
TypeError exception.
mean the same things as:
- Return
ThrowCompletion (a newly createdTypeError object).
5.2.3.3 ReturnIfAbrupt
Algorithms steps that say or are otherwise equivalent to:
- ReturnIfAbrupt(argument).
mean the same thing as:
Assert : argument is aCompletion Record .- If argument is an
abrupt completion , returnCompletion (argument). - Else, set argument to argument.[[Value]].
Algorithms steps that say or are otherwise equivalent to:
- ReturnIfAbrupt(AbstractOperation()).
mean the same thing as:
- Let hygienicTemp be AbstractOperation().
Assert : hygienicTemp is aCompletion Record .- If hygienicTemp is an
abrupt completion , returnCompletion (hygienicTemp). - Else, set hygienicTemp to hygienicTemp.[[Value]].
Where hygienicTemp is ephemeral and visible only in the steps pertaining to ReturnIfAbrupt.
Algorithms steps that say or are otherwise equivalent to:
- Let result be AbstractOperation(ReturnIfAbrupt(argument)).
mean the same thing as:
Assert : argument is aCompletion Record .- If argument is an
abrupt completion , returnCompletion (argument). - Else, set argument to argument.[[Value]].
- Let result be AbstractOperation(argument).
5.2.3.4 ReturnIfAbrupt Shorthands
Invocations of ? indicate that
- ? OperationName().
is equivalent to the following step:
ReturnIfAbrupt (OperationName()).
Similarly, for method application style, the step:
- ? someValue.OperationName().
is equivalent to:
ReturnIfAbrupt (someValue.OperationName()).
Similarly, prefix ! is used to indicate that the following invocation of an abstract or
- Let val be ! OperationName().
is equivalent to the following steps:
- Let val be OperationName().
Assert : val is anormal completion .- Set val to val.[[Value]].
! or ? before the invocation of the operation:
- Perform ! SyntaxDirectedOperation of
NonTerminal .
5.2.3.5 Implicit Normal Completion
In algorithms within
- when the result of applying
Completion ,NormalCompletion ,ThrowCompletion , orReturnCompletion is directly returned - when the result of constructing a
Completion Record is directly returned
It is an editorial error if a
- Return
true .
means the same things as any of
- Return
NormalCompletion (true ).
or
- Let completion be
NormalCompletion (true ). - Return
Completion (completion).
or
- Return
Completion Record { [[Type]]:normal , [[Value]]:true , [[Target]]:empty }.
Note that, through the
- Return ? completion.
The following example would be an editorial error because a
- Let completion be
NormalCompletion (true ). - Return completion.
5.2.4 Static Semantics
Context-free grammars are not sufficiently powerful to express all the rules that define whether a stream of input elements form a valid ECMAScript
Static Semantic Rules have names and typically are defined using an algorithm. Named Static Semantic Rules are associated with grammar productions and a production that has multiple alternative definitions will typically have for each alternative a distinct algorithm for each applicable named static semantic rule.
A special kind of static semantic rule is an Early Error Rule.
5.2.5 Mathematical Operations
This specification makes reference to these kinds of numeric values:
- Mathematical values: Arbitrary real numbers, used as the default numeric type.
- Extended mathematical values:
Mathematical values together with +∞ and -∞. - Numbers:
IEEE 754-2019 binary64 (double-precision floating point) values. - BigInts:
ECMAScript language values representing arbitraryintegers in a one-to-one correspondence.
In the language of this specification, numerical values are distinguished among different numeric kinds using subscript suffixes. The subscript 𝔽 refers to Numbers, and the subscript ℤ refers to BigInts. Numeric values without a subscript suffix refer to
In general, when this specification refers to a numerical value, such as in the phrase, "the length of y" or "the
When the term integer is used in this specification, it refers to a
Numeric operators such as +, ×, =, and ≥ refer to those operations as determined by the type of the operands. When applied to
Conversions between
The mathematical function
The mathematical function
The notation “
The phrase "the result of clamping x between lower and upper" (where x is an
The mathematical function
The mathematical function
Mathematical functions
An interval from lower bound a to upper bound b is a possibly-infinite, possibly-empty set of numeric values of the same numeric type. Each bound will be described as either inclusive or exclusive, but not both. There are four kinds of intervals, as follows:
- An
interval from a (inclusive) to b (inclusive), also called an inclusive interval from a to b, includes all values x of the same numeric type such that a ≤ x ≤ b, and no others. - An
interval from a (inclusive) to b (exclusive) includes all values x of the same numeric type such that a ≤ x < b, and no others. - An
interval from a (exclusive) to b (inclusive) includes all values x of the same numeric type such that a < x ≤ b, and no others. - An
interval from a (exclusive) to b (exclusive) includes all values x of the same numeric type such that a < x < b, and no others.
For example, the
5.2.6 Value Notation
In this specification, Function.prototype.apply or let n = 42;.
5.2.7 Identity
In this specification, both specification values and
From the perspective of this specification, the word “is” is used to compare two values for equality, as in “If bool is
From the perspective of the ECMAScript language, language values are compared for equality using the
For specification values, examples of values without specification identity include, but are not limited to:
Specification identity agrees with language identity for all