[whatwg] Ambiguous ampersand
ian at hixie.ch
Mon Sep 14 18:52:58 PDT 2009
On Tue, 8 Sep 2009, Øistein E. Andersen wrote:
> According to § 9.1.4 Character references, "An ambiguous ampersand is a
> U+0026 AMPERSAND (&) character that is followed by some text other than
> a space character, a U+003C LESS-THAN SIGN character ('<'), or another
> U+0026 AMPERSAND (&) character", text being "allowed inside elements,
> attributes, and comments" (§ 9.1.3 Text). (Should that be "attribute
> values"? Either is probably acceptable.)
>
> This text does not seem to define the ampersand in <element attr=&> as
> ambiguous, but it still causes a parse error. <element attr=& attr2>,
> <element attr="&"> and <element attr='&'> are all conforming, so the
> most consistent solution would probably be to remove the parse error by
> setting the "additional allowed character" to '>' when encountering an
> ampersand in the "Attribute value (unquoted)" state.
>
> Also, making the sequence "&<" conforming in (quoted) attribute values,
> where the '<' occurs as text, seems inconsistent.

If we made &< non-conforming everywhere, detecting this case would be
ridiculously complicated in <title> elements:

   <title> test &< test &<!-- test &< &</title> --> &</foo> &</title>

Which are compliant and which are not?

Making &< conforming everywhere that the < is conforming text is more
consistent than making &< only conforming in RCDATA text.
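
As a rough illustration of the <title> case, here is a minimal Python
sketch. It is not taken from the spec: the function name classify_amp_lt,
the attribute-less start tag match, and the simplified end-tag rule are
all assumptions made for this example. It finds where the RCDATA content
of the first <title> ends and reports, for each &< inside it, whether the
< is plain text or the start of the closing tag.

```python
import re


def classify_amp_lt(source: str, element: str = "title"):
    """Classify each '&<' inside the RCDATA content of the first <element>.

    Returns a list of (offset, kind) pairs. kind is 'text' when the '<' is
    ordinary text and 'end-tag' when it begins the element's closing tag.
    """
    name = re.escape(element)
    # Assumption: the start tag carries no attributes.
    open_tag = re.search(rf"<{name}>", source, re.IGNORECASE)
    if open_tag is None:
        return []
    start = open_tag.end()

    # Simplified end-tag rule: '</name' followed by whitespace, '/' or '>'.
    close = re.compile(rf"</{name}[\t\n\f />]", re.IGNORECASE)
    end_match = close.search(source, start)
    end = end_match.start() if end_match else len(source)

    results = []
    pos = source.find("&<", start)
    while pos != -1 and pos < end:
        # The '<' is at pos + 1; it opens the end tag only if that is `end`.
        kind = "end-tag" if pos + 1 == end else "text"
        results.append((pos, kind))
        pos = source.find("&<", pos + 1)
    return results


sample = "<title> test &< test &<!-- test &< &</title> --> &</foo> &</title>"
for offset, kind in classify_amp_lt(sample):
    print(offset, kind)
```

On the example above, the sketch reports the first three &< as text
(including the one before "!--", since RCDATA has no comments) and the
fourth as the end tag. Everything after the first </title> lies outside
the element and is not examined.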

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'