[whatwg] text/html flavor conformance checkers and <foo />

R.J.Koppes rikkert at rikkertkoppes.com
Thu Apr 28 00:27:50 PDT 2005


I'd say the "/" in <foo /> should be treated as an invalid character by
conformance checkers. I assume something like <foo ?> is treated that way
too? If not, it should be. The checker would then raise an error reporting
an illegal character, and it might raise another error at a later stage if
the </foo> closing tag is mandatory (as with <script>, for instance).

Alternatively, if one allows "/" (and "?") characters in attribute names (is
there a passage on that anyway? Since HTML only allows predefined
attributes, it should not be necessary), then <foo /> and <foo ?>
should raise an error reporting an invalid attribute, another error
reporting a missing attribute value, and possibly a third error
reporting a missing closing tag.
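
To make the two checker behaviours concrete, here is a rough Python sketch.
It is only an illustration under my own assumptions: the function name, the
error strings and the END_TAG_REQUIRED set are hypothetical, and the tag is
assumed to have whitespace around the stray character (so "<foo />", not
"<foo/>").

```python
# Hypothetical list of elements whose end tag is mandatory (illustration only).
END_TAG_REQUIRED = {"script", "title", "textarea"}


def check_start_tag(tag, slash_is_attribute=False, has_end_tag=False):
    """Return the errors a checker might report for a tag such as '<foo />'."""
    # Strip the angle brackets and split on whitespace (assumed separator).
    tokens = tag.strip()[1:-1].split()
    name, rest = tokens[0].lower(), tokens[1:]
    errors = []
    for token in rest:
        if token not in ("/", "?"):
            continue  # ordinary attributes are out of scope for this sketch
        if slash_is_attribute:
            # Second reading: "/" is an (unknown) attribute without a value.
            errors.append(f"invalid attribute '{token}'")
            errors.append(f"missing value for attribute '{token}'")
        else:
            # First reading: "/" is simply an illegal character.
            errors.append(f"illegal character '{token}'")
    # Later stage: the end tag never showed up although it is mandatory.
    if name in END_TAG_REQUIRED and not has_end_tag:
        errors.append(f"missing end tag </{name}>")
    return errors
```

With these assumptions, check_start_tag("<foo />") gives a single "illegal
character" error, while check_start_tag("<script />", slash_is_attribute=True)
gives the three errors described above.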

UAs, however, should either treat the "/" as an invalid character and discard
it (preferred error checking, say) or apply SGML rules and treat the trailing
">" as a redundant character (SGML-based error checking). Either way, <foo />
is treated as <foo> and the DOM tree should be built as if that were the
case. I am not sure which rule UAs are applying at the moment.
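
As a rough sketch of the two recovery strategies (the function names are
hypothetical, and this is string handling on a single tag, not a real
parser), both end up producing <foo>:

```python
def recover_discard_slash(tag):
    """Treat a stray '/' or '?' before the closing '>' as invalid; drop it."""
    inner = tag.strip()[1:-1].rstrip()
    if inner.endswith(("/", "?")):
        inner = inner[:-1].rstrip()
    return f"<{inner}>"


def recover_sgml_style(tag):
    """Let '/' end the start tag and drop the now redundant trailing '>'."""
    tag = tag.strip()
    head, slash, tail = tag[1:].partition("/")
    if slash and tail.strip() == ">":
        return f"<{head.rstrip()}>"
    return tag  # no stray '/' directly before '>': leave the tag alone
```

Both functions return "<foo>" for the input "<foo />", so the resulting DOM
would be the same whichever rule a UA applies.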

Rikkert Koppes
www.rikkertkoppes.com


----- Original Message -----
From: "Henri Sivonen" <hsivonen at iki.fi>
To: "WHAT WG List" <whatwg at whatwg.org>
Sent: Wednesday, April 27, 2005 11:42 AM
Subject: Re: [whatwg] text/html flavor conformance checkers and <foo />


> On Apr 27, 2005, at 04:13, fantasai wrote:
>
> > Henri Sivonen wrote:
> >> On Apr 26, 2005, at 19:08, fantasai wrote:
> >>> Henri Sivonen wrote:
> >>>
> >>>> What do you suggest the parser layer of a text/html conformance
> >>>> checker say about <input checkbox ...>?
> >>>> 1. Silently treat as <input type="checkbox" ...>?
> >>>> 2. Treat as <input type="checkbox" ...> but warn?
> >>>> 3. Treat as <input checkbox="checkbox" ...> causing an error to be
> >>>> reported on a higher layer?
> >>>> 4. Treat as fatal error in the parser?
> >>>> I'm inclined to choose 3.
> >>>
> >>>
> >>> *Why?* Why of all things would you choose to interpret it like
> >>> /that/?
> >>> It's neither reporting a useful error, nor handling it per SGML
> >>> rules.
> >> To make the separation of concerns similar to what it would be on the
> >> XML side while being real about SGMLness being fiction. That is, the
> >> parser does not need to know if an attribute is allowed. That's a job
> >> for a higher layer.
> >
> > I still don't understand how this interpretation is useful or required.
>
> It is useful, because it doesn't require knowledge of allowable
> minimizable attributes on the lowest parser level.
>
> > If you want to make <input checkbox> invalid, handle it the same way
> > you'd handle <input foo>.
>
> That's what I am suggesting. The parser would treat <input foo> as
> <input foo="foo">, which would be caught on the RELAX NG validation
> layer in my diagram.
>
> > Expanding the attribute from checked to checked="checked" is neither
> > conforming to SGML parsing rules
>
> ITYM checkbox to checkbox="checkbox".
>
> > nor helping the author understand what was wrong.
>
> Would "Attribute 'checkbox' not allowed here." or something along those
> lines be any more incomprehensible than validation errors in general?
>
> > I mean, I understand you're disillusioned with the state of HTML
> > parsing in the world, but it doesn't mean you need to be /reactionary/
> > about it.
>
> Authors get constantly confused when validator.w3.org feeds them SGML
> fiction. Why shouldn't the QA tools be better aligned with reality?
>
> --
> Henri Sivonen
> hsivonen at iki.fi
> http://hsivonen.iki.fi/
>
>



