[whatwg] Unsafe SGML minimizations

Henri Sivonen hsivonen at iki.fi
Thu Sep 8 08:19:11 PDT 2005

On Sep 8, 2005, at 17:26, Ian Hickson wrote:

> On Thu, 8 Sep 2005, Henri Sivonen wrote:
>> I think the text/html flavor of HTML5 should not allow the following
>> minimization features (which are theoretically allowed in HTML 4),
>> because each of them causes problems in at least one of Opera,
>> Firefox and Safari.
>>  * <>
>>  * </>
> Agreed. Those should generate comment nodes, I think.

Opera, Firefox and Safari already interoperably handle <> as character 
data (equivalent to &lt;&gt;) and ignore </>.
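
The behaviour described above can be sketched roughly as follows. This is 
only an illustration, not code from any browser: the function name 
normalize_empty_tags is hypothetical, and it handles nothing but these two 
constructs.

```python
import re

# Matches the empty end tag "</>" or the empty start tag "<>".
_EMPTY_TAG = re.compile(r"</>|<>")


def normalize_empty_tags(markup: str) -> str:
    """Drop each "</>" and turn each "<>" into escaped character data.

    Hypothetical helper: a single regex pass, so removing a "</>" can
    never create a new "<>" that would then be rewritten.
    """
    def replace(match: re.Match) -> str:
        if match.group(0) == "</>":
            return ""            # the empty end tag is ignored
        return "&lt;&gt;"        # the empty start tag becomes text

    return _EMPTY_TAG.sub(replace, markup)
```

Ordinary tags such as <p> pass through unchanged, since the pattern only 
matches the two empty forms.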

>>  * tagc omission, i.e. <foo<bar>...</bar</foo>
> Well we have to define what that does, and the most obvious error
> handling behaviour here is to start the new tag. So effectively, I
> would say we should have TAGC omission.

But it would still be an error as far as a conformance checker is 
concerned, right?

>>  * <foo/bar/
> Agreed, sadly. That would be equivalent to something like
> <foo /bar/=""> (or something similar).

I think the HTML5 spec should allow TagSoup to be updated for HTML5 or 
an equivalent of TagSoup for HTML5 to be written. TagSoup guarantees to 
the application that it acts as if it was an XML parser parsing XHTML. 
Therefore, XML and, by extension, the SAX2 API contract restrict the 
attribute names to legal XML attribute names. If HTML5 required "/bar/" 
to be reported as an attribute name, TagSoup would have to violate that 
constraint and could not claim conformance.
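
The XML constraint is easy to check with any XML parser. The sketch below 
uses Python's standard xml.etree.ElementTree rather than SAX2 or TagSoup, 
and the helper name is_well_formed is made up for the example. It only 
shows that "/bar/" cannot appear as an attribute name in a well-formed 
document, which is why an XML-style API cannot report it.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the string parses as a well-formed XML document."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


# A legal attribute name parses; "/bar/" as an attribute name does not.
legal = is_well_formed('<foo bar=""/>')
illegal = is_well_formed('<foo /bar/=""/>')
```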

>>  * attribute name omission (except for the well-known "boolean 
>> attributes")
> Again, we have to define error handling. <foo bar baz> will probably
> just be equivalent to <foo bar="" baz="">.

I have previously argued for <foo bar="bar" baz="baz"> in the 
TagSoup-like scenario, because that would be the same as the treatment 
required for the "boolean attributes".
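
The two candidate treatments differ only in the value that gets reported. 
A minimal sketch, with a hypothetical function name and flag that do not 
come from TagSoup or any spec text:

```python
def expand_minimized(tokens, repeat_name):
    """Return (name, value) pairs for attributes written without "=".

    repeat_name=False models <foo bar="" baz="">.
    repeat_name=True models <foo bar="bar" baz="baz">, the same shape
    as the expansion used for the "boolean attributes".
    """
    return [(token, token if repeat_name else "") for token in tokens]


empty_values = expand_minimized(["bar", "baz"], repeat_name=False)
repeated_values = expand_minimized(["bar", "baz"], repeat_name=True)
```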

Henri Sivonen
hsivonen at iki.fi
