[whatwg] [html5] tags, elements and generated DOM
Ian Hickson
ian at hixie.ch
Fri Feb 24 14:57:33 PST 2006
On Wed, 6 Apr 2005, Olav Junker Kjær wrote:
>
> An innocent question (no flamewar intended): What is the benefit of
> having HTML defined as an application of SGML ?
You could use SGML tools with it, including well-established validator
tools; the parsing model (for compliant documents) is very clear; SGML has
a lot of abbreviation syntaxes that make it quick to write markup; and it
means we're not reinventing the wheel.
Unfortunately, in practice, nobody uses SGML tools; validators are unable
to catch a number of important (computer-checkable) conformance problems;
the parsing model doesn't handle non-compliant documents, and the majority
of documents are non-compliant; the abbreviation syntaxes are extremely
complicated, largely unimplemented, and incompatible with existing
content; and the wheel was already reinvented.
On Wed, 6 Apr 2005, Olav Junker Kjær wrote:
>
> The problem is that validators use the term "valid" in a very limited
> sense, but web authors without a thorough understanding of DTD-validation
> would naturally assume that "valid" would mean "valid according to the
> spec".
Indeed; the term "valid" in an XML/SGML context is used to mean a specific
subset of "conformant", but most users don't know this and assume it means
"fully conformant".
I've tried to work around this in the spec.
On Wed, 6 Apr 2005, Olav Junker Kjær wrote:
>
> There are three types of conformance criteria:
> (1) Criteria that can be expressed in a DTD
> (2) Criteria that cannot be expressed by a DTD, but can still be checked by a
> machine.
> (3) Criteria that can only be checked by a human.
>
> A conformance checker must check (1) and (2). A simple validator which only
> checks (1) is therefore not conformant.
I've put this in the spec, I hope that's ok.
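
To make the split between (1) and (2) concrete, here is a minimal
sketch in Python. Everything in it is an assumption made for
illustration: the ALLOWED_ATTRS table, the SketchChecker class and the
min/max rule are hypothetical and are not taken from the spec. It only
shows that an attribute-name check is the kind of thing a DTD can
express, while a constraint relating two attribute values is not.

```python
from html.parser import HTMLParser

# Hypothetical, deliberately tiny rule set for illustration only.
ALLOWED_ATTRS = {"input": {"type", "name", "value", "min", "max"}}


class SketchChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.errors = []  # (category, message) pairs

    def handle_starttag(self, tag, attrs):
        allowed = ALLOWED_ATTRS.get(tag)
        if allowed is None:
            return  # element not covered by this sketch
        values = dict(attrs)
        # Category (1): attribute names, which a DTD can express.
        for name in values:
            if name not in allowed:
                self.errors.append((1, f"unknown attribute {name!r} on <{tag}>"))
        # Category (2): a cross-attribute constraint, which a DTD cannot express.
        try:
            lo, hi = int(values["min"]), int(values["max"])
        except (KeyError, TypeError, ValueError):
            return  # a bound is missing or not an integer; nothing to compare
        if lo > hi:
            self.errors.append((2, f"min ({lo}) is greater than max ({hi})"))


def check(markup):
    """Return the (category, message) pairs found in the markup."""
    checker = SketchChecker()
    checker.feed(markup)
    checker.close()
    return checker.errors
```

A validator that stops after the first loop would accept
<input min="10" max="5">; only the second check catches it.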
On Thu, 7 Apr 2005, Olav Junker Kjær wrote:
>
> A DTD or schema in the spec would be redundant anyway, since it would
> only echo what is described in prose.
Indeed.
> DTD validation would be almost useless in the case of WF2, except
> perhaps for catching spelling errors in attribute names. A schema in a
> sufficiently expressive language would go a long way, though.
For WF2 it may go far enough, I'm not sure. For HTML5 I'm pretty sure no
schema language (short of a Turing-complete one) is expressive enough.
> I notice that <input type="text" src="some url" checked="true"> is valid
> according to the schema for XHTML.
Indeed.
It'll probably be conformant in HTML5 as well, to be honest, because you
might want to set things up for a dynamic change of |type|. I don't know
where to draw the line there. (Similarly, should empty paragraphs be
conformant? I often use empty paragraphs as somewhere to later fill in
some text.)
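
As a rough sketch of what a checker could do with that example
without rejecting it outright: list the attributes that have no effect
for the current |type|, and leave it to the caller to decide whether
that is an error, a warning, or nothing at all. The APPLIES_TO table
and the function name below are hypothetical and cover only the two
attributes from the example.

```python
# Hypothetical table: which input types each attribute has an effect on.
APPLIES_TO = {
    "src": {"image"},
    "checked": {"checkbox", "radio"},
}


def inapplicable_attributes(attrs):
    """Return, sorted, the attributes that do nothing for the given type."""
    input_type = attrs.get("type", "text")  # a missing type means "text"
    return sorted(
        name
        for name in attrs
        if name in APPLIES_TO and input_type not in APPLIES_TO[name]
    )
```

For the element quoted above this reports both |src| and |checked|;
after a dynamic change of |type| to "checkbox" it would report only
|src|.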
> Actually I think it would be beneficial for interoperability and perhaps
> discovery of weaknesses in the spec, if several schemas were developed
> by independent parties during the call for implementation.
Absolutely.
On Thu, 7 Apr 2005, Olav Junker Kjær wrote:
>
> Actually, the HTML element has a (deprecated!) version attribute, which
> could be used for this purpose. I agree it feels cleaner than using the
> doctype syntax.
It's not clear to me what the purpose would be.
> OTOH authors are going to use doctypes for the foreseeable future anyway,
> since they want to trigger standards compliant mode in browsers, so we
> might as well put the doctype to some use.
What use?
On Thu, 7 Apr 2005, Olav Junker Kjær wrote:
>
> A conformance checker is a rubber stamp. Therefore it's quite important
> that a conformance checker actually checks conformance to the spec,
> otherwise it is snake oil.
Hear hear!
> As HTML applications become more complex, it becomes more important that
> the markup and code are correct, but DTD-validation becomes even less
> sufficient to catch errors. A basic validity error like forgetting to
> close a <b> tag will not cause the page to stop working. However, a
> syntax error in the initial value of a date control *will* cause the
> page to stop working as intended.
Indeed.
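
The date case is a good example of a machine-checkable rule that a DTD
cannot state. A minimal sketch, assuming the control's initial value is
expected in YYYY-MM-DD form (the function name is hypothetical, and the
exact format a given spec requires may differ):

```python
import re
from datetime import date

# Exactly four, two and two ASCII digits separated by hyphens.
DATE_PATTERN = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def is_valid_date_value(value):
    """Return True if value is a real calendar date written as YYYY-MM-DD."""
    match = DATE_PATTERN.fullmatch(value)
    if match is None:
        return False  # wrong shape, e.g. "07/04/2005"
    year, month, day = (int(part) for part in match.groups())
    try:
        date(year, month, day)  # rejects dates such as 2005-02-30
    except ValueError:
        return False
    return True
```

To a DTD the value attribute is just character data, so "2005-02-30"
and "2005-04-07" look the same.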
> > now I realise it's to the advantage of existing browser manufacturers
> > to rubber stamp complicated heuristic behaviour they've already solved
> > into a spec (it prevents new entrants from coming along) but how is
> > it to the advantage of the rest of us - understanding specifications
> > becomes harder and harder and relies on the fact that we knew what
> > happened before...
>
> If you are referring to the paragraph about parse errors in
> <http://whatwg.org/specs/web-forms/current-work/#handling> I tend to
> agree with you.
In HTML5 there is less and less that is left up to reverse engineering.
Hopefully that addresses your concern; I hope to continue in this
direction to the point where eventually maybe there will not be any need
for reverse engineering at all.
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'