[whatwg] Comment Syntax and Parsing

Ian Hickson ian at hixie.ch
Mon Jan 23 13:39:50 PST 2006

On Mon, 23 Jan 2006, Lachlan Hunt wrote:
> Well that depends on the implementation and how SGML defines that such 
> erroneous comments be handled.

Indeed, there is that too. Whatever behaviour we require will be, to some 
extent, new behaviour.

> (Without a copy of ISO-8879 handy, it's difficult to check, so the 
> following is based purely on observing the implementations.)

ISO 8879:1986 (including its 1996 and 1998 annexes) doesn't cover, as far 
as I can tell, error handling requirements for parsers.

> Do you know if browsers will be using this for both standards and quirks 
> mode or will they retain their existing quirks mode parsing and use this 
> as the new standards mode parsing only?

I imagine that any changes to quirks mode handling will be done very 
carefully over an extended period of time.

> Well, many authors believe they're using XHTML, and many even believe 
> they're using the correct XHTML MIME type (using <meta>), even though they're 
> not.  So, regardless of whether they actually are or not, they're going 
> to believe they are and it's best not to confuse them more by saying:
>    "<!--------> isn't well-formed XML"

Fair enough. I've made it a parse error (which is what determines what 
conformance checkers must say regarding valid vs invalid syntax).

> ...have them come back and say:
>    "the validator says it's fine"
> and then tell them:
>   "that's because the document isn't XHTML".
> only to hear:
>   "Yes it is, look at the meta element and all these slashes (<br/>)"

<br/> will also be flagged as a parse error, for what it's worth.

On Mon, 23 Jan 2006, Henri Sivonen wrote:
> [...]

By the way, Henri, thanks for your comments a few months back about 
parsing. I've been using them, and have agreed with and implemented most 
of them in the spec so far. I'll reply to them in more detail in due course.

> I think allowing paired double hyphens with whitespace in between [would 
> make sense]

That seems like excessive complexity for conformance checkers, with very 
little benefit (beyond the theoretical).

> and allowing whitespace between the ending "--" and ">" would make 
> sense.

This also seems a little gratuitous.

> This would improve the source-level upgradeability of valid HTML 4 to 
> conforming HTML 5. However, it would have the old confusion issues.

I think those issues outweigh the benefits you mention.

> I guess the XML style is the simplest thing that could work. :-/

I agree. :-)
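
To illustrate what "the XML style" means for the cases discussed in this 
thread, here is a minimal sketch in Python. It assumes the XML 1.0 rule 
that comment text may not contain "--" and may not end in "-". The 
function name is hypothetical, the check ignores XML's character-range 
restrictions, and it is not the tokenizer algorithm from the spec.

```python
import re

# XML 1.0-style comment: "<!--", then text in which every "-" is followed
# by a non-hyphen character, then "-->". This rules out "--" inside the
# comment and a "-" immediately before the closing "-->".
XML_STYLE_COMMENT = re.compile(r"<!--(?:[^-]|-[^-])*-->")


def is_xml_style_comment(text: str) -> bool:
    """Return True if the whole string is one XML-style comment (sketch)."""
    return XML_STYLE_COMMENT.fullmatch(text) is not None


if __name__ == "__main__":
    samples = [
        "<!-- plain comment -->",
        "<!-- a - b -->",
        "<!-------->",
        "<!-- a -- b -->",
        "<!-- a -- >",
    ]
    for sample in samples:
        print(repr(sample), is_xml_style_comment(sample))
```

Under this rule the first two samples are accepted. "<!-------->", paired 
double hyphens with whitespace in between, and whitespace between the 
ending "--" and ">" are all rejected.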

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
