11 ECMAScript Language: Lexical Grammar

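Syntax

InputElementDiv ::
    WhiteSpace
    LineTerminator
    Comment
    CommonToken
    DivPunctuator
    RightBracePunctuator

InputElementRegExp ::
    WhiteSpace
    LineTerminator
    Comment
    CommonToken
    RightBracePunctuator
    RegularExpressionLiteral

InputElementRegExpOrTemplateTail ::
    WhiteSpace
    LineTerminator
    Comment
    CommonToken
    RegularExpressionLiteral
    TemplateSubstitutionTail

InputElementTemplateTail ::
    WhiteSpace
    LineTerminator
    Comment
    CommonToken
    DivPunctuator
    TemplateSubstitutionTail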

11.1 Unicode Format-Control Characters

The Unicode format-control characters (i.e., the characters in category “Cf” in the Unicode Character Database such as LEFT-TO-RIGHT MARK or RIGHT-TO-LEFT MARK) are control codes used to control the formatting of a range of text in the absence of higher-level protocols for this (such as mark-up languages).

It is useful to allow format-control characters in source text to facilitate editing and display. All format control characters may be used within comments, and within string literals, template literals, and regular expression literals.

U+200C (ZERO WIDTH NON-JOINER) and U+200D (ZERO WIDTH JOINER) are format-control characters that are used to make necessary distinctions when forming words or phrases in certain languages. In ECMAScript source text these code points may also be used in an IdentifierName after the first character.

U+FEFF (ZERO WIDTH NO-BREAK SPACE) is a format-control character used primarily at the start of a text to mark it as Unicode and to allow detection of the text's encoding and byte order. <ZWNBSP> characters intended for this purpose can sometimes also appear after the start of a text, for example as a result of concatenating files. In ECMAScript source text <ZWNBSP> code points are treated as white space characters (see 11.2).

The special treatment of certain format-control characters outside of comments, string literals, and regular expression literals is summarized in Table 31.

Code Point	Name	Abbreviation	Usage
`U+200C`	ZERO WIDTH NON-JOINER	<ZWNJ>	IdentifierPart
`U+200D`	ZERO WIDTH JOINER	<ZWJ>	IdentifierPart
`U+FEFF`	ZERO WIDTH NO-BREAK SPACE	<ZWNBSP>	WhiteSpace
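
The treatment summarized in the table can be observed from script code. The following is a minimal sketch, assuming a host that provides the Function constructor; the variable names are illustrative only. It compiles small source texts that contain the format-control characters and records how each one is handled.

```javascript
// Build source text with the format-control characters embedded, then compile
// it with the Function constructor so each case can be checked in isolation.
const ZWNJ = "\u200C";
const ZWNBSP = "\uFEFF";

// <ZWNBSP> between tokens is skipped like any other white space.
const sum = new Function(`return 1${ZWNBSP}+${ZWNBSP}2;`)();

// <ZWNJ> is accepted after the first code point of an identifier.
const joined = new Function(`var a${ZWNJ}b = 7; return a${ZWNJ}b;`)();

// <ZWNJ> as the first code point of an identifier is a SyntaxError.
let rejectedAtStart = false;
try {
  new Function(`var ${ZWNJ}ab = 1;`);
} catch (err) {
  rejectedAtStart = err instanceof SyntaxError;
}

console.log(sum, joined, rejectedAtStart); // 3 7 true
```

Note that an identifier containing <ZWNJ> is a different name from the same identifier without it, because identifiers are compared by their code points.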

11.2 White Space

White space code points are used to improve source text readability and to separate tokens (indivisible lexical units) from each other, but are otherwise insignificant. White space code points may occur between any two tokens and at the start or end of input. White space code points may occur within a StringLiteral, a RegularExpressionLiteral, a Template, or a TemplateSubstitutionTail where they are considered significant code points forming part of a literal value. They may also occur within a Comment, but cannot appear within any other kind of token.

The ECMAScript white space code points are listed in Table 32.

Code Point	Name	Abbreviation
`U+0009`	CHARACTER TABULATION	<TAB>
`U+000B`	LINE TABULATION	<VT>
`U+000C`	FORM FEED (FF)	<FF>
`U+0020`	SPACE	<SP>
`U+00A0`	NO-BREAK SPACE	<NBSP>
`U+FEFF`	ZERO WIDTH NO-BREAK SPACE	<ZWNBSP>
Other category “Zs”	Any other Unicode “Space_Separator” code point	<USP>

ECMAScript implementations must recognize as WhiteSpace code points listed in the “Space_Separator” (“Zs”) category.

Note

Other than for the code points listed in Table 32, ECMAScript WhiteSpace intentionally excludes all code points that have the Unicode “White_Space” property but which are not classified in category “Space_Separator” (“Zs”).
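
As an illustration of this note, the sketch below contrasts a “Zs” code point with a code point that has the “White_Space” property but is not in “Zs”. The helper name parses is hypothetical, and the example assumes a host that provides the Function constructor.

```javascript
// Returns true when `source` parses as a function body, false on SyntaxError.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// U+3000 IDEOGRAPHIC SPACE is in category Zs, so it separates tokens.
const ideographicSpaceOk = parses("var\u3000x = 1;");

// U+0085 NEXT LINE has the White_Space property but is in category Cc,
// so it is not WhiteSpace and cannot appear between tokens.
const nextLineOk = parses("var\u0085x = 1;");

console.log(ideographicSpaceOk, nextLineOk); // true false
```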

Syntax

WhiteSpace

<TAB>

<VT>

<FF>

<SP>

<NBSP>

<ZWNBSP>

<USP>

11.3 Line Terminators

Like white space code points, line terminator code points are used to improve source text readability and to separate tokens (indivisible lexical units) from each other. However, unlike white space code points, line terminators have some influence over the behaviour of the syntactic grammar. In general, line terminators may occur between any two tokens, but there are a few places where they are forbidden by the syntactic grammar. Line terminators also affect the process of automatic semicolon insertion (11.9). A line terminator cannot occur within any token except a StringLiteral, Template, or TemplateSubstitutionTail. <LF> and <CR> line terminators cannot occur within a StringLiteral token except as part of a LineContinuation.

A line terminator can occur within a MultiLineComment but cannot occur within a SingleLineComment.

Line terminators are included in the set of white space code points that are matched by the \s class in regular expressions.

The ECMAScript line terminator code points are listed in Table 33.

Code Point	Unicode Name	Abbreviation
`U+000A`	LINE FEED (LF)	<LF>
`U+000D`	CARRIAGE RETURN (CR)	<CR>
`U+2028`	LINE SEPARATOR	<LS>
`U+2029`	PARAGRAPH SEPARATOR	<PS>

Only the Unicode code points in Table 33 are treated as line terminators. Other new line or line breaking Unicode code points are not treated as line terminators but are treated as white space if they meet the requirements listed in Table 32. The sequence <CR><LF> is commonly used as a line terminator. It should be considered a single SourceCharacter for the purpose of reporting line numbers.
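
The influence of line terminators on the syntactic grammar can be seen with a return statement. This is a small sketch, assuming a host that provides the Function constructor; the variable names are illustrative.

```javascript
// A line terminator after `return` triggers automatic semicolon insertion,
// so the expression that follows is not used as the return value.
const afterLF = new Function("return\n42;")();
const afterLS = new Function("return\u202842;")();

// Ordinary white space between the same tokens has no such effect.
const afterSpace = new Function("return 42;")();

// The \s character class matches all four line terminator code points.
const allMatch = ["\n", "\r", "\u2028", "\u2029"].every((ch) => /\s/.test(ch));

console.log(afterLF, afterLS, afterSpace, allMatch); // undefined undefined 42 true
```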

Syntax

LineTerminator

<LF>

<CR>

<LS>

<PS>

LineTerminatorSequence

<LF>

<CR> [lookahead ≠ <LF>]

<LS>

<PS>

<CR> <LF>

11.4 Comments

Comments can be either single or multi-line. Multi-line comments cannot nest.

Because a single-line comment can contain any Unicode code point except a LineTerminator code point, and because of the general rule that a token is always as long as possible, a single-line comment always consists of all code points from the // marker to the end of the line. However, the LineTerminator at the end of the line is not considered to be part of the single-line comment; it is recognized separately by the lexical grammar and becomes part of the stream of input elements for the syntactic grammar. This point is very important, because it implies that the presence or absence of single-line comments does not affect the process of automatic semicolon insertion (see 11.9).

Comments behave like white space and are discarded except that, if a MultiLineComment contains a line terminator code point, then the entire comment is considered to be a LineTerminator for purposes of parsing by the syntactic grammar.
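
The following sketch shows these rules in action, again assuming the Function constructor is available. Only the multi-line comment that contains a line terminator changes how the return statement is parsed; the names used are illustrative.

```javascript
// A MultiLineComment on one line behaves like white space.
const sameLine = new Function("return /* one line */ 42;")();

// A MultiLineComment containing a line terminator counts as a LineTerminator,
// so a semicolon is inserted after `return`.
const spansLines = new Function("return /* first\nsecond */ 42;")();

// The LineTerminator after a single-line comment is still seen by the parser,
// so automatic semicolon insertion works as if the comment were absent.
const singleLine = new Function("var x = 1 // no semicolon\nreturn x;")();

// Multi-line comments do not nest: the first */ closes the comment.
let nestedRejected = false;
try {
  new Function("/* outer /* inner */ tail */");
} catch (err) {
  nestedRejected = err instanceof SyntaxError;
}

console.log(sameLine, spansLines, singleLine, nestedRejected); // 42 undefined 1 true
```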

Syntax

Comment ::
    MultiLineComment
    SingleLineComment

MultiLineComment ::
    /* MultiLineCommentChars_opt */

MultiLineCommentChars ::
    MultiLineNotAsteriskChar MultiLineCommentChars_opt
    * PostAsteriskCommentChars_opt

PostAsteriskCommentChars ::
    MultiLineNotForwardSlashOrAsteriskChar MultiLineCommentChars_opt
    * PostAsteriskCommentChars_opt

MultiLineNotAsteriskChar ::
    SourceCharacter but not *

MultiLineNotForwardSlashOrAsteriskChar ::
    SourceCharacter but not one of / or *

SingleLineComment ::
    // SingleLineCommentChars_opt

SingleLineCommentChars ::
    SingleLineCommentChar SingleLineCommentChars_opt

SingleLineCommentChar ::
    SourceCharacter but not LineTerminator

11.5 Tokens

Syntax

CommonToken ::
    IdentifierName
    Punctuator
    NumericLiteral
    StringLiteral
    Template

Note

The DivPunctuator, RegularExpressionLiteral, RightBracePunctuator, and TemplateSubstitutionTail productions derive additional tokens that are not included in the CommonToken production.

11.6 Names and Keywords

IdentifierName and ReservedWord are tokens that are interpreted according to the Default Identifier Syntax given in Unicode Standard Annex #31, Identifier and Pattern Syntax, with some small modifications. ReservedWord is an enumerated subset of IdentifierName. The syntactic grammar defines Identifier as an IdentifierName that is not a ReservedWord. The Unicode identifier grammar is based on character properties specified by the Unicode Standard. The Unicode code points in the specified categories in the latest version of the Unicode standard must be treated as in those categories by all conforming ECMAScript implementations. ECMAScript implementations may recognize identifier code points defined in later editions of the Unicode Standard.

Note 1

This standard specifies specific code point additions: U+0024 (DOLLAR SIGN) and U+005F (LOW LINE) are permitted anywhere in an IdentifierName, and the code points U+200C (ZERO WIDTH NON-JOINER) and U+200D (ZERO WIDTH JOINER) are permitted anywhere after the first code point of an IdentifierName.

Unicode escape sequences are permitted in an IdentifierName, where they contribute a single Unicode code point to the IdentifierName. The code point is expressed by the CodePoint of the UnicodeEscapeSequence (see 11.8.4). The \ preceding the UnicodeEscapeSequence and the u and { } code units, if they appear, do not contribute code points to the IdentifierName. A UnicodeEscapeSequence cannot be used to put a code point into an IdentifierName that would otherwise be illegal. In other words, if a \ UnicodeEscapeSequence sequence were replaced by the SourceCharacter it contributes, the result must still be a valid IdentifierName that has the exact same sequence of SourceCharacter elements as the original IdentifierName. All interpretations of IdentifierName within this specification are based upon their actual code points regardless of whether or not an escape sequence was used to contribute any particular code point.

Two IdentifierNames that are canonically equivalent according to the Unicode standard are not equal unless, after replacement of each UnicodeEscapeSequence, they are represented by the exact same sequence of code points.
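
The rules for escapes and canonical equivalence can be illustrated as follows. This is a sketch under the assumption that the Function constructor is available; the identifiers chosen are arbitrary examples.

```javascript
// \u0061 contributes the code point "a", so both spellings name one binding.
const viaEscape = new Function("var \\u0061bc = 5; return abc;")();

// U+00E9 and "e" + U+0301 are canonically equivalent, but they are different
// code point sequences, so they declare two separate variables.
const precomposed = "\u00E9";
const decomposed = "e\u0301";
const pair = new Function(
  `var ${precomposed} = 1; var ${decomposed} = 2; return [${precomposed}, ${decomposed}];`
)();

// An escape cannot smuggle in a code point that is illegal in an identifier.
let escapedSpaceRejected = false;
try {
  new Function("var a\\u0020b = 1;");
} catch (err) {
  escapedSpaceRejected = err instanceof SyntaxError;
}

console.log(viaEscape, pair, escapedSpaceRejected); // 5 [ 1, 2 ] true
```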

Syntax

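IdentifierName ::
    IdentifierStart
    IdentifierName IdentifierPart

IdentifierStart ::
    UnicodeIDStart
    $
    _
    \ UnicodeEscapeSequence

IdentifierPart ::
    UnicodeIDContinue
    $
    \ UnicodeEscapeSequence
    <ZWNJ>
    <ZWJ>

UnicodeIDStart ::
    any Unicode code point with the Unicode property “ID_Start”

UnicodeIDContinue ::
    any Unicode code point with the Unicode property “ID_Continue”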

The definition of the nonterminal UnicodeEscapeSequence is given in 11.8.4.

Note 2

The nonterminal IdentifierPart derives _ via UnicodeIDContinue.

Note 3

The sets of code points with Unicode properties “ID_Start” and “ID_Continue” include, respectively, the code points with Unicode properties “Other_ID_Start” and “Other_ID_Continue”.

11.6.1 Identifier Names

11.6.1.1 Static Semantics: Early Errors

IdentifierStart :: \ UnicodeEscapeSequence

It is a Syntax Error if the SV of UnicodeEscapeSequence is none of "$", or "_", or the UTF16Encoding of a code point matched by the UnicodeIDStart lexical grammar production.

IdentifierPart :: \ UnicodeEscapeSequence

It is a Syntax Error if the SV of UnicodeEscapeSequence is none of "$", or "_", or the UTF16Encoding of either <ZWNJ> or <ZWJ>, or the UTF16Encoding of a Unicode code point that would be matched by the UnicodeIDContinue lexical grammar production.

11.6.1.2 Static Semantics: StringValue

Let idText be the source text matched by IdentifierName.
Let idTextUnescaped be the result of replacing any occurrences of \ UnicodeEscapeSequence in idText with the code point represented by the UnicodeEscapeSequence.
Return ! UTF16Encode(idTextUnescaped).

11.6.2 Keywords and Reserved Words

A keyword is a token that matches IdentifierName, but also has a syntactic use; that is, it appears literally, in a fixed width font, in some syntactic production. The keywords of ECMAScript include if, while, async, await, and many others.

A reserved word is an IdentifierName that cannot be used as an identifier. Many keywords are reserved words, but some are not, and some are reserved only in certain contexts. if and while are reserved words. await is reserved only inside async functions and modules. async is not reserved; it can be used as a variable name or statement label without restriction.

This specification uses a combination of grammatical productions and early error rules to specify which names are valid identifiers and which are reserved words. All tokens in the ReservedWord list below, except for await and yield, are unconditionally reserved. Exceptions for await and yield are specified in 12.1, using parameterized syntactic productions. Lastly, several early error rules restrict the set of valid identifiers. See 12.1.1, 13.3.1.1, 13.7.5.1, and 14.6.1. In summary, there are five categories of identifier names:

Those that are always allowed as identifiers, and are not keywords, such as Math, window, toString, and _;
Those that are never allowed as identifiers, namely the ReservedWords listed below except await and yield;
Those that are contextually allowed as identifiers, namely await and yield;
Those that are contextually disallowed as identifiers, in strict mode code: let, static, implements, interface, package, private, protected, and public;
Those that are always allowed as identifiers, but also appear as keywords within certain syntactic productions, at places where Identifier is not allowed: as, async, from, get, of, set, and target.

The term conditional keyword, or contextual keyword, is sometimes used to refer to the keywords that fall in the last three categories, and thus can be used as identifiers in some contexts and as keywords in others.
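
The five categories can be probed with a small sketch. The helper compiles is hypothetical, and the example assumes a host that provides the Function constructor; the AsyncFunction constructor is obtained from an async function instance because it is not a global.

```javascript
// Returns true when `body` parses as the body of a function made by `Ctor`.
function compiles(body, Ctor = Function) {
  try {
    new Ctor(body);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

const AsyncFunction = Object.getPrototypeOf(async function () {}).constructor;

const results = {
  // Always allowed, not keywords.
  plainNames: compiles("var Math = 1, toString = 2, _ = 3;"),
  // Never allowed.
  reservedIf: compiles("var if = 1;"),
  reservedEnum: compiles("var enum = 1;"),
  // Contextually allowed: await and yield.
  awaitInPlainFunction: compiles("var await = 1;"),
  awaitInAsyncFunction: compiles("var await = 1;", AsyncFunction),
  yieldSloppy: compiles("var yield = 1;"),
  yieldStrict: compiles("'use strict'; var yield = 1;"),
  // Contextually disallowed in strict mode code.
  letSloppy: compiles("var let = 1;"),
  letStrict: compiles("'use strict'; var let = 1;"),
  // Always allowed, but keywords in certain productions.
  contextual: compiles("'use strict'; var as, async, from, get, of, set, target;"),
  // An escape cannot be used to spell a reserved word as an identifier.
  escapedElse: compiles("var els\\u{65} = 1;"),
};

console.log(results);
```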

Syntax

ReservedWord :: one of
    await break case catch class const continue debugger default delete do else enum export extends false finally for function if import in instanceof new null return super switch this throw true try typeof var void while with yield

Note 1

Per 5.1.5, keywords in the grammar match literal sequences of specific SourceCharacter elements. A code point in a keyword cannot be expressed by a \ UnicodeEscapeSequence.

An IdentifierName can contain \ UnicodeEscapeSequences, but it is not possible to declare a variable named "else" by spelling it els\u{65}. The early error rules in 12.1.1 rule out identifiers with the same StringValue as a reserved word.

Note 2

enum is not currently used as a keyword in this specification. It is a future reserved word, set aside for use as a keyword in future language extensions.

Similarly, implements, interface, package, private, protected, and public are future reserved words in strict mode code.

Note 3

The names arguments and eval are not keywords, but they are subject to some restrictions in strict mode code. See 12.1.1, 12.1.3, 14.1.2, 14.4.1, 14.5.1, and 14.7.1.

11.7 Punctuators

Syntax

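Punctuator ::
    OptionalChainingPunctuator
    OtherPunctuator

OptionalChainingPunctuator ::
    ?. [lookahead ∉ DecimalDigit]

OtherPunctuator :: one of
    { ( ) [ ] . ... ; , < > <= >= == != === !== + - * % ** ++ -- << >> >>> & | ^ ! ~ && || ?? ? : = += -= *= %= **= <<= >>= >>>= &= |= ^= =>

DivPunctuator ::
    /
    /=

RightBracePunctuator ::
    }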

11.8 Literals

11.8.1 Null Literals

Syntax

NullLiteral

null

11.8.2 Boolean Literals

Syntax

BooleanLiteral

true

false

11.8.3 Numeric Literals

Syntax

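NumericLiteral ::
    DecimalLiteral
    DecimalBigIntegerLiteral
    NonDecimalIntegerLiteral
    NonDecimalIntegerLiteral BigIntLiteralSuffix

DecimalBigIntegerLiteral ::
    0 BigIntLiteralSuffix
    NonZeroDigit DecimalDigits_opt BigIntLiteralSuffix

NonDecimalIntegerLiteral ::
    BinaryIntegerLiteral
    OctalIntegerLiteral
    HexIntegerLiteral

BigIntLiteralSuffix ::
    n

DecimalLiteral ::
    DecimalIntegerLiteral . DecimalDigits_opt ExponentPart_opt
    . DecimalDigits ExponentPart_opt
    DecimalIntegerLiteral ExponentPart_opt

DecimalIntegerLiteral ::
    0
    NonZeroDigit DecimalDigits_opt

DecimalDigits ::
    DecimalDigit
    DecimalDigits DecimalDigit

DecimalDigit :: one of
    0 1 2 3 4 5 6 7 8 9

NonZeroDigit :: one of
    1 2 3 4 5 6 7 8 9

ExponentPart ::
    ExponentIndicator SignedInteger

ExponentIndicator :: one of
    e E

SignedInteger ::
    DecimalDigits
    + DecimalDigits
    - DecimalDigits

BinaryIntegerLiteral ::
    0b BinaryDigits
    0B BinaryDigits

BinaryDigits ::
    BinaryDigit
    BinaryDigits BinaryDigit

BinaryDigit :: one of
    0 1

OctalIntegerLiteral ::
    0o OctalDigits
    0O OctalDigits

OctalDigits ::
    OctalDigit
    OctalDigits OctalDigit

OctalDigit :: one of
    0 1 2 3 4 5 6 7

HexIntegerLiteral ::
    0x HexDigits
    0X HexDigits

HexDigits ::
    HexDigit
    HexDigits HexDigit

HexDigit :: one of
    0 1 2 3 4 5 6 7 8 9 a b c d e f A B C D E F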

The SourceCharacter immediately following a NumericLiteral must not be an IdentifierStart or DecimalDigit.

Note

For example: 3in is an error and not the two input elements 3 and in.
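
The restriction can be checked with a short sketch. The helper parses is hypothetical and relies on the Function constructor being available.

```javascript
// Returns true when `source` parses as a function body, false on SyntaxError.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// An IdentifierStart directly after a NumericLiteral is an error.
const adjacent = parses("3in [3];");

// With white space in between, the same text is the two tokens 3 and in.
const separated = parses("3 in [3];");

console.log(adjacent, separated); // false true
```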

A conforming implementation, when processing strict mode code, must not extend, as described in B.1.1, the syntax of NumericLiteral to include LegacyOctalIntegerLiteral, nor extend the syntax of DecimalIntegerLiteral to include NonOctalDecimalIntegerLiteral.

11.8.3.1 Static Semantics: MV

A numeric literal stands for a value of the Number type or the BigInt type.

The MV of NumericLiteral :: DecimalLiteral is the MV of DecimalLiteral.
The MV of NonDecimalIntegerLiteral :: BinaryIntegerLiteral is the MV of BinaryIntegerLiteral.
The MV of NonDecimalIntegerLiteral :: OctalIntegerLiteral is the MV of OctalIntegerLiteral.
The MV of NonDecimalIntegerLiteral :: HexIntegerLiteral is the MV of HexIntegerLiteral.
The MV of DecimalLiteral :: DecimalIntegerLiteral . is the MV of DecimalIntegerLiteral.
The MV of DecimalLiteral :: DecimalIntegerLiteral . DecimalDigits is the MV of DecimalIntegerLiteral plus (the MV of DecimalDigits × 10_ℝ^{-_ℝ n}), where n is the mathematical value of the number of code points in DecimalDigits.
The MV of DecimalLiteral :: DecimalIntegerLiteral . ExponentPart is the MV of DecimalIntegerLiteral × 10_ℝ^{e}, where e is the MV of ExponentPart.
The MV of DecimalLiteral :: DecimalIntegerLiteral . DecimalDigits ExponentPart is (the MV of DecimalIntegerLiteral plus (the MV of DecimalDigits × 10_ℝ^{-_ℝ n})) × 10_ℝ^{e}, where n is the mathematical integer number of code points in DecimalDigits and e is the MV of ExponentPart.
The MV of DecimalLiteral :: . DecimalDigits is the MV of DecimalDigits × 10_ℝ^{-_ℝ n}, where n is the mathematical integer number of code points in DecimalDigits.
The MV of DecimalLiteral :: . DecimalDigits ExponentPart is the MV of DecimalDigits × 10_ℝ^{e -_ℝ n}, where n is the mathematical integer number of code points in DecimalDigits and e is the MV of ExponentPart.
The MV of DecimalLiteral :: DecimalIntegerLiteral is the MV of DecimalIntegerLiteral.
The MV of DecimalLiteral :: DecimalIntegerLiteral ExponentPart is the MV of DecimalIntegerLiteral × 10_ℝ^e, where e is the MV of ExponentPart.
The MV of DecimalIntegerLiteral :: 0 is 0_ℝ.
The MV of DecimalIntegerLiteral :: NonZeroDigit is the MV of NonZeroDigit.
The MV of DecimalIntegerLiteral :: NonZeroDigit DecimalDigits is (the MV of NonZeroDigit × 10_ℝⁿ) plus the MV of DecimalDigits, where n is the mathematical integer number of code points in DecimalDigits.
The MV of DecimalDigits :: DecimalDigit is the MV of DecimalDigit.
The MV of DecimalDigits :: DecimalDigits DecimalDigit is (the MV of DecimalDigits × 10_ℝ) plus the MV of DecimalDigit.
The MV of ExponentPart :: ExponentIndicator SignedInteger is the MV of SignedInteger.
The MV of SignedInteger :: DecimalDigits is the MV of DecimalDigits.
The MV of SignedInteger :: + DecimalDigits is the MV of DecimalDigits.
The MV of SignedInteger :: - DecimalDigits is the negative of the MV of DecimalDigits.
The MV of DecimalDigit :: 0 or of HexDigit :: 0 or of OctalDigit :: 0 or of BinaryDigit :: 0 is 0_ℝ.
The MV of DecimalDigit :: 1 or of NonZeroDigit :: 1 or of HexDigit :: 1 or of OctalDigit :: 1 or of BinaryDigit :: 1 is 1_ℝ.
The MV of DecimalDigit :: 2 or of NonZeroDigit :: 2 or of HexDigit :: 2 or of OctalDigit :: 2 is 2_ℝ.
The MV of DecimalDigit :: 3 or of NonZeroDigit :: 3 or of HexDigit :: 3 or of OctalDigit :: 3 is 3_ℝ.
The MV of DecimalDigit :: 4 or of NonZeroDigit :: 4 or of HexDigit :: 4 or of OctalDigit :: 4 is 4_ℝ.
The MV of DecimalDigit :: 5 or of NonZeroDigit :: 5 or of HexDigit :: 5 or of OctalDigit :: 5 is 5_ℝ.
The MV of DecimalDigit :: 6 or of NonZeroDigit :: 6 or of HexDigit :: 6 or of OctalDigit :: 6 is 6_ℝ.
The MV of DecimalDigit :: 7 or of NonZeroDigit :: 7 or of HexDigit :: 7 or of OctalDigit :: 7 is 7_ℝ.
The MV of DecimalDigit :: 8 or of NonZeroDigit :: 8 or of HexDigit :: 8 is 8_ℝ.
The MV of DecimalDigit :: 9 or of NonZeroDigit :: 9 or of HexDigit :: 9 is 9_ℝ.
The MV of HexDigit :: a or of HexDigit :: A is 10_ℝ.
The MV of HexDigit :: b or of HexDigit :: B is 11_ℝ.
The MV of HexDigit :: c or of HexDigit :: C is 12_ℝ.
The MV of HexDigit :: d or of HexDigit :: D is 13_ℝ.
The MV of HexDigit :: e or of HexDigit :: E is 14_ℝ.
The MV of HexDigit :: f or of HexDigit :: F is 15_ℝ.
The MV of BinaryIntegerLiteral :: 0b BinaryDigits is the MV of BinaryDigits.
The MV of BinaryIntegerLiteral :: 0B BinaryDigits is the MV of BinaryDigits.
The MV of BinaryDigits :: BinaryDigit is the MV of BinaryDigit.
The MV of BinaryDigits :: BinaryDigits BinaryDigit is (the MV of BinaryDigits × 2_ℝ) plus the MV of BinaryDigit.
The MV of OctalIntegerLiteral :: 0o OctalDigits is the MV of OctalDigits.
The MV of OctalIntegerLiteral :: 0O OctalDigits is the MV of OctalDigits.
The MV of OctalDigits :: OctalDigit is the MV of OctalDigit.
The MV of OctalDigits :: OctalDigits OctalDigit is (the MV of OctalDigits × 8_ℝ) plus the MV of OctalDigit.
The MV of HexIntegerLiteral :: 0x HexDigits is the MV of HexDigits.
The MV of HexIntegerLiteral :: 0X HexDigits is the MV of HexDigits.
The MV of HexDigits :: HexDigit is the MV of HexDigit.
The MV of HexDigits :: HexDigits HexDigit is (the MV of HexDigits × 16_ℝ) plus the MV of HexDigit.
The MV of Hex4Digits :: HexDigit HexDigit HexDigit HexDigit is (0x1000_ℝ times the MV of the first HexDigit) plus (0x100_ℝ times the MV of the second HexDigit) plus (0x10_ℝ times the MV of the third HexDigit) plus the MV of the fourth HexDigit.
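
The recursive digit rules above amount to a left-to-right accumulation. The following is an illustrative sketch only, not an algorithm given by this specification; the function names digitMV and digitsMV are hypothetical, and BigInt is used so that the mathematical value is computed exactly.

```javascript
// MV of a single digit, following the per-digit rules (a/A is 10 ... f/F is 15).
function digitMV(ch) {
  const value = "0123456789abcdef".indexOf(ch.toLowerCase());
  if (value < 0) throw new RangeError(`not a digit: ${ch}`);
  return BigInt(value);
}

// MV(Digits Digit) = (MV(Digits) × radix) + MV(Digit), applied left to right.
function digitsMV(digits, radix) {
  const base = BigInt(radix);
  let mv = 0n;
  for (const ch of digits) {
    const d = digitMV(ch);
    if (d >= base) throw new RangeError(`digit ${ch} is out of range for radix ${radix}`);
    mv = mv * base + d;
  }
  return mv;
}

console.log(digitsMV("ff", 16), digitsMV("777", 8), digitsMV("1010", 2)); // 255n 511n 10n
```

The results agree with the corresponding literals 0xff, 0o777, and 0b1010.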

11.8.3.2 Static Semantics: NumericValue

NumericLiteral :: DecimalLiteral

Return the Number value that results from rounding the MV of DecimalLiteral as described below.

NumericLiteral :: NonDecimalIntegerLiteral

Return the Number value that results from rounding the MV of NonDecimalIntegerLiteral as described below.

Once the exact MV for a numeric literal has been determined, it is then rounded to a value of the Number type. If the MV is 0_ℝ, then the rounded value is +0; otherwise, the rounded value must be the Number value for the MV (as specified in 6.1.6.1), unless the literal is a DecimalLiteral and the literal has more than 20 significant digits, in which case the Number value may be either the Number value for the MV of a literal produced by replacing each significant digit after the 20th with a 0 digit or the Number value for the MV of a literal produced by replacing each significant digit after the 20th with a 0 digit and then incrementing the literal at the 20th significant digit position. A digit is significant if it is not part of an ExponentPart and

it is not 0; or
there is a nonzero digit to its left and there is a nonzero digit, not in the ExponentPart, to its right.

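NumericLiteral :: NonDecimalIntegerLiteral BigIntLiteralSuffix

Return the BigInt value that represents the MV of NonDecimalIntegerLiteral.

DecimalBigIntegerLiteral :: 0 BigIntLiteralSuffix

Return the BigInt value that represents 0_ℝ.

DecimalBigIntegerLiteral :: NonZeroDigit BigIntLiteralSuffix

Return the BigInt value that represents the MV of NonZeroDigit.

DecimalBigIntegerLiteral :: NonZeroDigit DecimalDigits BigIntLiteralSuffix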

Let n be the mathematical integer number of code points in DecimalDigits.
Let mv be (the MV of NonZeroDigit × 10_ℝⁿ) plus the MV of DecimalDigits.
Return the BigInt value that represents mv.
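
The difference between the Number and BigInt results can be seen with a literal whose MV is not exactly representable as a Number. This is a small sketch; the variable names are illustrative.

```javascript
// 2^53 + 1 has no exact Number representation, so the literal is rounded.
const rounded = 9007199254740993;
// With the BigInt suffix, the same digits denote the MV exactly.
const exact = 9007199254740993n;

const numberWasRounded = rounded === 9007199254740992;
const bigintIsExact = exact === 2n ** 53n + 1n;

// A literal whose MV is 0 evaluates to +0, whatever its spelling.
const zeroIsPositive = Object.is(0.0e5, +0);

console.log(numberWasRounded, bigintIsExact, zeroIsPositive); // true true true
```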

11.8.4 String Literals

Note 1

A string literal is zero or more Unicode code points enclosed in single or double quotes. Unicode code points may also be represented by an escape sequence. All code points may appear literally in a string literal except for the closing quote code points, U+005C (REVERSE SOLIDUS), U+000D (CARRIAGE RETURN), and U+000A (LINE FEED). Any code points may appear in the form of an escape sequence. String literals evaluate to ECMAScript String values. When generating these String values Unicode code points are UTF-16 encoded as defined in 10.1.1. Code points belonging to the Basic Multilingual Plane are encoded as a single code unit element of the string. All other code points are encoded as two code unit elements of the string.

Syntax

StringLiteral ::
    " DoubleStringCharacters_opt "
    ' SingleStringCharacters_opt '

DoubleStringCharacters ::
    DoubleStringCharacter DoubleStringCharacters_opt

SingleStringCharacters ::
    SingleStringCharacter SingleStringCharacters_opt

DoubleStringCharacter ::
    SourceCharacter but not one of " or \ or LineTerminator
    <LS>
    <PS>
    \ EscapeSequence
    LineContinuation

SingleStringCharacter ::
    SourceCharacter but not one of ' or \ or LineTerminator
    <LS>
    <PS>
    \ EscapeSequence
    LineContinuation

LineContinuation ::
    \ LineTerminatorSequence

EscapeSequence ::
    CharacterEscapeSequence
    0 [lookahead ∉ DecimalDigit]
    HexEscapeSequence
    UnicodeEscapeSequence

A conforming implementation, when processing strict mode code, must not extend the syntax of EscapeSequence to include LegacyOctalEscapeSequence as described in B.1.2.

CharacterEscapeSequence ::
    SingleEscapeCharacter
    NonEscapeCharacter

SingleEscapeCharacter :: one of
    ' " \ b f n r t v

NonEscapeCharacter ::
    SourceCharacter but not one of EscapeCharacter or LineTerminator

EscapeCharacter ::
    SingleEscapeCharacter
    DecimalDigit
    x
    u

HexEscapeSequence ::
    x HexDigit HexDigit

UnicodeEscapeSequence ::
    u Hex4Digits
    u{ CodePoint }

Hex4Digits ::
    HexDigit HexDigit HexDigit HexDigit

The definition of the nonterminal HexDigit is given in 11.8.3. SourceCharacter is defined in 10.1.

Note 2

<LF> and <CR> cannot appear in a string literal, except as part of a LineContinuation to produce the empty code points sequence. The proper way to include either in the String value of a string literal is to use an escape sequence such as \n or \u000A.
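
The notes above can be illustrated with a few literals. This sketch assumes the Function constructor is available for the one case that must fail to parse; the variable names are illustrative.

```javascript
// A BMP code point becomes one code unit; any other becomes a surrogate pair.
const bmp = "\u00E9";
const astral = "\u{1F600}";
const astralUnits = [astral.charCodeAt(0), astral.charCodeAt(1)];

// A LineContinuation contributes nothing to the String value.
const continued = "ab\
cd";

// A raw <LF> inside a string literal is a SyntaxError ...
let rawLineFeedRejected = false;
try {
  new Function('return "ab\ncd";');
} catch (err) {
  rawLineFeedRejected = err instanceof SyntaxError;
}

// ... whereas the escape sequence \n puts a line feed into the value.
const escaped = "ab\ncd";

console.log(bmp.length, astral.length, continued, rawLineFeedRejected, escaped.length);
```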

11.8.4.1 Static Semantics: StringValue

StringLiteral ::
    " DoubleStringCharacters_opt "
    ' SingleStringCharacters_opt '

Return the String value whose code units are the SV of this StringLiteral.

11.8.4.2 Static Semantics: SV

A string literal stands for a value of the String type. The String value (SV) of the literal is described in terms of code unit values contributed by the various parts of the string literal. As part of this process, some Unicode code points within the string literal are interpreted as having a mathematical value (MV), as described below or in 11.8.3.

The SV of StringLiteral :: " " is the empty code unit sequence.
The SV of StringLiteral :: ' ' is the empty code unit sequence.
The SV of StringLiteral :: " DoubleStringCharacters " is the SV of DoubleStringCharacters.
The SV of StringLiteral :: ' SingleStringCharacters ' is the SV of SingleStringCharacters.
The SV of DoubleStringCharacters :: DoubleStringCharacter is a sequence of up to two code units that is the SV of DoubleStringCharacter.
The SV of DoubleStringCharacters :: DoubleStringCharacter DoubleStringCharacters is a sequence of up to two code units that is the SV of DoubleStringCharacter followed by the code units of the SV of DoubleStringCharacters in order.
The SV of SingleStringCharacters :: SingleStringCharacter is a sequence of up to two code units that is the SV of SingleStringCharacter.
The SV of SingleStringCharacters :: SingleStringCharacter SingleStringCharacters is a sequence of up to two code units that is the SV of SingleStringCharacter followed by the code units of the SV of SingleStringCharacters in order.
The SV of DoubleStringCharacter :: SourceCharacter but not one of " or \ or LineTerminator is the UTF16Encoding of the code point value of SourceCharacter.
The SV of DoubleStringCharacter :: <LS> is the code unit 0x2028 (LINE SEPARATOR).
The SV of DoubleStringCharacter :: <PS> is the code unit 0x2029 (PARAGRAPH SEPARATOR).
The SV of DoubleStringCharacter :: \ EscapeSequence is the SV of EscapeSequence.
The SV of DoubleStringCharacter :: LineContinuation is the empty code unit sequence.
The SV of SingleStringCharacter :: SourceCharacter but not one of ' or \ or LineTerminator is the UTF16Encoding of the code point value of SourceCharacter.
The SV of SingleStringCharacter :: <LS> is the code unit 0x2028 (LINE SEPARATOR).
The SV of SingleStringCharacter :: <PS> is the code unit 0x2029 (PARAGRAPH SEPARATOR).
The SV of SingleStringCharacter :: \ EscapeSequence is the SV of EscapeSequence.
The SV of SingleStringCharacter :: LineContinuation is the empty code unit sequence.
The SV of EscapeSequence :: CharacterEscapeSequence is the SV of CharacterEscapeSequence.
The SV of EscapeSequence :: 0 is the code unit 0x0000 (NULL).
The SV of EscapeSequence :: HexEscapeSequence is the SV of HexEscapeSequence.
The SV of EscapeSequence :: UnicodeEscapeSequence is the SV of UnicodeEscapeSequence.
The SV of CharacterEscapeSequence :: SingleEscapeCharacter is the code unit whose value is determined by the SingleEscapeCharacter according to Table 34.

Escape Sequence	Code Unit Value	Unicode Character Name	Symbol
`\b`	`0x0008`	BACKSPACE	<BS>
`\t`	`0x0009`	CHARACTER TABULATION	<HT>
`\n`	`0x000A`	LINE FEED (LF)	<LF>
`\v`	`0x000B`	LINE TABULATION	<VT>
`\f`	`0x000C`	FORM FEED (FF)	<FF>
`\r`	`0x000D`	CARRIAGE RETURN (CR)	<CR>
`\"`	`0x0022`	QUOTATION MARK	`"`
`\'`	`0x0027`	APOSTROPHE	`'`
`\\`	`0x005C`	REVERSE SOLIDUS	`\`

The SV of CharacterEscapeSequence :: NonEscapeCharacter is the SV of NonEscapeCharacter.
The SV of NonEscapeCharacter :: SourceCharacter but not one of EscapeCharacter or LineTerminator is the UTF16Encoding of the code point value of SourceCharacter.
The SV of HexEscapeSequence :: x HexDigit HexDigit is the code unit whose value is (16_ℝ times the MV of the first HexDigit) plus the MV of the second HexDigit.
The SV of UnicodeEscapeSequence :: u Hex4Digits is the SV of Hex4Digits.
The SV of Hex4Digits :: HexDigit HexDigit HexDigit HexDigit is the code unit whose value is the MV of Hex4Digits.
The SV of UnicodeEscapeSequence :: u{ CodePoint } is the UTF16Encoding of the MV of CodePoint.
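
A few of the SV rules can be checked directly. This is an illustrative sketch; the property names in samples are arbitrary.

```javascript
// Each property is a string literal exercising one kind of EscapeSequence.
const samples = {
  hex: "\x41",         // HexEscapeSequence
  unicode4: "\u0041",  // u Hex4Digits
  codePoint: "\u{41}", // u{ CodePoint }
  nonEscape: "\q",     // NonEscapeCharacter: the SV is the character itself
  nul: "\0",           // EscapeSequence :: 0
  verticalTab: "\v",   // SingleEscapeCharacter, see Table 34
};

const allSpellA =
  samples.hex === "A" && samples.unicode4 === "A" && samples.codePoint === "A";

console.log(
  allSpellA,
  samples.nonEscape,
  samples.nul.charCodeAt(0),
  samples.verticalTab.charCodeAt(0)
); // true q 0 11
```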

11.8.5 Regular Expression Literals

Note 1

A regular expression literal is an input element that is converted to a RegExp object (see 21.2) each time the literal is evaluated. Two regular expression literals in a program evaluate to regular expression objects that never compare as === to each other even if the two literals' contents are identical. A RegExp object may also be created at runtime by new RegExp or calling the RegExp constructor as a function (see 21.2.3).

The productions below describe the syntax for a regular expression literal and are used by the input element scanner to find the end of the regular expression literal. The source text comprising the RegularExpressionBody and the RegularExpressionFlags are subsequently parsed again using the more stringent ECMAScript Regular Expression grammar (21.2.1).

An implementation may extend the ECMAScript Regular Expression grammar defined in 21.2.1, but it must not extend the RegularExpressionBody and RegularExpressionFlags productions defined below or the productions used by these productions.
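
The point about object identity in this note can be sketched as follows; the function name makePattern is hypothetical.

```javascript
// Each evaluation of the literal creates a new RegExp object.
function makePattern() {
  return /ab+c/;
}

const first = makePattern();
const second = makePattern();

const distinctObjects = first !== second;
const sameSource = first.source === second.source;

// The constructor, with or without `new`, also creates RegExp objects.
const viaNew = new RegExp("ab+c");
const viaCall = RegExp("ab+c");

console.log(distinctObjects, sameSource, viaNew.test("abbc"), viaCall.test("ac"));
// true true true false
```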

Syntax

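RegularExpressionLiteral ::
    / RegularExpressionBody / RegularExpressionFlags

RegularExpressionBody ::
    RegularExpressionFirstChar RegularExpressionChars

RegularExpressionChars ::
    [empty]
    RegularExpressionChars RegularExpressionChar

RegularExpressionFirstChar ::
    RegularExpressionNonTerminator but not one of * or \ or / or [
    RegularExpressionBackslashSequence
    RegularExpressionClass

RegularExpressionChar ::
    RegularExpressionNonTerminator but not one of \ or / or [
    RegularExpressionBackslashSequence
    RegularExpressionClass

RegularExpressionBackslashSequence ::
    \ RegularExpressionNonTerminator

RegularExpressionNonTerminator ::
    SourceCharacter but not LineTerminator

RegularExpressionClass ::
    [ RegularExpressionClassChars ]

RegularExpressionClassChars ::
    [empty]
    RegularExpressionClassChars RegularExpressionClassChar

RegularExpressionClassChar ::
    RegularExpressionNonTerminator but not one of ] or \
    RegularExpressionBackslashSequence

RegularExpressionFlags ::
    [empty]
    RegularExpressionFlags IdentifierPart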

Note 2

Regular expression literals may not be empty; instead of representing an empty regular expression literal, the code unit sequence // starts a single-line comment. To specify an empty regular expression, use: /(?:)/.
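
A brief sketch of this note, assuming the Function constructor is available for the comment case; the variable names are illustrative.

```javascript
// /(?:)/ is the literal form of the empty regular expression; it matches
// the empty string at any position.
const empty = /(?:)/;
const matchesEmpty = empty.test("");
const matchesAnything = empty.test("anything");

// A RegExp built from the empty pattern reports the same source text.
const emptySource = new RegExp("").source;

// The sequence // begins a single-line comment, not an empty literal,
// so the text after it on the line is ignored.
const eight = new Function("return 8 // 2\n;")();

console.log(matchesEmpty, matchesAnything, emptySource, eight); // true true (?:) 8
```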

11.8.5.1 Static Semantics: Early Errors

It is a Syntax Error if IdentifierPart contains a Unicode escape sequence.

11.8.5.2 Static Semantics: BodyText

Return the source text that was recognized as RegularExpressionBody.

11.8.5.3 Static Semantics: FlagText

Return the source text that was recognized as RegularExpressionFlags.

11.8.6 Template Literal Lexical Components

Syntax

Template

NoSubstitutionTemplate

TemplateHead

NoSubstitutionTemplate

TemplateCharacters

opt

TemplateHead

TemplateCharacters

opt

TemplateSubstitutionTail

}

opt

}

opt

opt

[lookahead ≠ {]

LineTerminatorSequence

SourceCharacter but not one of ` or \ or $ or LineTerminator

NotEscapeSequence

DecimalDigit

DecimalDigit but not 0

[lookahead ∉ HexDigit]

[lookahead ∉ HexDigit]

[lookahead ≠ {]

[lookahead ∉ HexDigit]

[lookahead ∉ HexDigit]