Lasse Reichstein Nielsen wrote:
> Asen Bozhilov writes:
>> Documentation permit to be used `\UnicodeEscapeSequence` in
>> IdentifierName. But there:
>>
>> | Unicode escape sequences are also permitted in identifiers,
>> | where they contribute a single character to the
>> | identifier, as computed by the CV of the
>> | UnicodeEscapeSequence.
>
> This is the important part. It allows unicode escapes in identifiers.
But none that would not be allowed if the character was included verbatim.
> There is no similar statement for any of the reserved words, so
> unicode escapes cannot be used in a keyword.
You have got it backwards.
>> | The \ preceding the UnicodeEscapeSequence does not contribute a
>> | character to the identifier. A UnicodeEscapeSequence cannot be
>> | used to put a character into an identifier that would otherwise be
>> | illegal. In other words, if a \UnicodeEscapeSequence sequence were
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> | replaced by its UnicodeEscapeSequence's CV, the result must still
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^
>> | be a valid Identifier that has the exact same sequence of characters
^^^^^^^^^^^^^^^^^^^^^
>> | as the original Identifier.
I do not think it can be worded more clearly.
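For illustration, the rule can be observed in an implementation. The
following is only a sketch of mine, not something from the Specification:
the helper name `parses' is hypothetical, and it assumes that the Function
constructor throws a SyntaxError for source text that cannot be parsed.

```javascript
// Hypothetical helper: returns true if `source` parses as a function body,
// false if the implementation rejects it with a SyntaxError.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e; // anything else is unexpected; do not swallow it
  }
}

// \u0061 is "a", which is legal in an identifier when written verbatim,
// so this is the same as `var ab;`
var legalEscape = parses("var \\u0061b;");

// \u002d is "-", which is illegal in an identifier when written verbatim,
// so the escape must not make it legal either.
var illegalEscape = parses("var a\\u002db;");
```

The backslashes are doubled here only because the source text is passed in
a string literal.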
>> As i understand it. If i type:
>>
>> var \\u0069\\u0066; //var if;
>
> (I assume it should be single backslashes when not in a string
Why, the double backslashes are legal, too. However, the resulting value
would still not be an /Identifier/, barring language extensions.
>> `if` is ReservedWord and example above, should throw SyntaxError.
>
> No.
True, but the program ought to be syntactically in error nonetheless.
> While 'if' is a keyword, it is only the sequence U+0069 U+0066
> that is recognized as the 'if' keyword. Unicode escapes are not allowed
> as parts of keywords. The above, correctly, declares a variable called
> 'if' - because "\u0069\u0066" matches the production of an identifier
> and it doesn't match the production of any reserved word.
> The inputs, "if" and "i\u0066" are different sequences of characters.
> They are parsed differently. The latter is parsed as an identifier.
Your logic is flawed, because escape sequences are converted into the
corresponding Unicode characters (the character is the CV, the character
value) *before* tokenization takes place, and so before the syntactic
grammar is applied:
| 5.1.4
|
| [...]
| When a stream of characters is to be parsed as an ECMAScript program, it
| is first converted to a stream of input elements by repeated application
| of the lexical grammar; this stream of input elements is then parsed by
| a single application of the syntactic grammar. The program is
| syntactically in error if the tokens in the stream of input elements
| cannot be parsed as a single instance of the goal nonterminal /Program/,
| with no tokens left over.
/UnicodeEscapeSequence/ is a nonterminal of the lexical grammar, as is
/Keyword/; /IfStatement/ is a nonterminal of the syntactic grammar.
As a result, the first application of the lexical grammar ought to cause
  var \u0069\u0066
to become
  var if
and the second application of the lexical grammar ought to cause `if' to be
parsed as a /Keyword/:
| Keyword :: one of
| [...] if [...]
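The reading argued here can be sketched as a toy tokenizer. This is a model
of my interpretation only, not of how any implementation works; the names
`KEYWORDS', `decodeUnicodeEscapes' and `tokenize' are hypothetical, and the
keyword list is abbreviated.

```javascript
// Toy model of the reading argued above, NOT of any real engine:
// escapes are replaced by their character values first, and only then
// is each word classified as Keyword or Identifier.
var KEYWORDS = ["var", "if", "else", "for", "while", "function", "return"];

// Step 1: replace every \uXXXX by the character it denotes.
function decodeUnicodeEscapes(source) {
  return source.replace(/\\u([0-9a-fA-F]{4})/g, function (match, hex) {
    return String.fromCharCode(parseInt(hex, 16));
  });
}

// Step 2: split the decoded text into words and classify each one.
function tokenize(source) {
  var decoded = decodeUnicodeEscapes(source);
  var words = decoded.match(/[A-Za-z_$][A-Za-z0-9_$]*/g) || [];
  return words.map(function (word) {
    return {
      type: KEYWORDS.indexOf(word) !== -1 ? "Keyword" : "Identifier",
      value: word
    };
  });
}

// The string literal below holds the source text  var \u0069\u0066
var tokens = tokenize("var \\u0069\\u0066");
```

Under this model both tokens end up classified as Keyword. A model that
classified words before decoding the escapes would classify the second
token as Identifier instead, which is the other position in this thread.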
Then, application of the syntactic grammar ought to cause
  var if
to be recognized as theoretically producible by
  VariableStatement :
    var VariableDeclarationList ;
  VariableDeclarationList :
    VariableDeclaration
  VariableDeclaration :
    Identifier Initialiser_opt
which ought to fail because the token `if' has been determined to be a
/Keyword/ before, not an /Identifier/, and no other productions of the
syntactic grammar would be applicable.
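That last step can be sketched as follows. Again this is only an
illustration under my own assumptions: the function name
`parseVariableStatement' and the {type, value} token shape are
hypothetical, and only a single declaration without initialiser is handled.

```javascript
// Toy check of one VariableStatement with a single VariableDeclaration.
// The tokens are assumed to have been classified by the lexical step
// already; the {type, value} shape is hypothetical.
function parseVariableStatement(tokens) {
  if (tokens.length !== 2) {
    throw new SyntaxError("expected exactly two tokens");
  }
  if (tokens[0].type !== "Keyword" || tokens[0].value !== "var") {
    throw new SyntaxError("expected `var'");
  }
  // VariableDeclaration requires an Identifier here; a Keyword will not do.
  if (tokens[1].type !== "Identifier") {
    throw new SyntaxError(
      "expected Identifier, got " + tokens[1].type + " `" + tokens[1].value + "'"
    );
  }
  return { declare: tokens[1].value };
}

// Usage: `if' arrives already classified as a Keyword, so parsing fails.
var outcome;
try {
  parseVariableStatement([
    { type: "Keyword", value: "var" },
    { type: "Keyword", value: "if" }
  ]);
  outcome = "accepted";
} catch (e) {
  outcome = e instanceof SyntaxError ? "syntax error" : "unexpected error";
}
```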
Therefore, the program ought to be considered syntactically in error. That
it might not be could only be attributed to a proprietary extension. Hence
the clarification as quoted above:
| A UnicodeEscapeSequence cannot be used to put a character into an
| identifier that would otherwise be illegal. [...]
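Whether a given implementation considers the program in error can be
checked empirically. The following is a sketch only; the helper name
`isSyntaxError' is hypothetical, and no particular outcome is assumed for
the escaped variant, since that is exactly what may differ between
implementations.

```javascript
// Hypothetical helper: true if the implementation rejects `source`
// with a SyntaxError when it is parsed as a function body.
function isSyntaxError(source) {
  try {
    new Function(source);
    return false;
  } catch (e) {
    if (e instanceof SyntaxError) return true;
    throw e; // not a parse failure; report it as is
  }
}

// `if' written verbatim is a Keyword, so this is expected to be rejected.
var verbatimKeyword = isSyntaxError("var if;");

// The escaped variant; the result depends on the implementation under test.
var escapedKeyword = isSyntaxError("var \\u0069\\u0066;");
```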
PointedEars