[html5] Identifying HTML 5 documents? (vs. alternate flavors)
hsivonen at iki.fi
Mon Feb 4 08:24:17 PST 2008
On Feb 4, 2008, at 17:28, Jim Correia wrote:
> I know there has been some discussion about this on the forum. But
> after having read through the draft spec and the FAQ, I'm still a
> little unclear about how I can auto-detect that a document is using
> HTML 5.
The short answer is that HTML5 by design tries to discourage you from
trying to do that.
> (Or more precisely, that the author of the document intended
> it to be conformant to HTML 5.)
HTML5 is designed so that this doesn't need to be asserted to the
other party when sending HTML5 content to a consuming client. In the
case of an author who is conformance checking his own stuff (as
opposed to communicating with another party), the theory goes that the
author simply chooses to use a tool that only supports HTML5 or that
is configured to support HTML5.
This might be a bit inconvenient if during a transition period the
author also wants to target legacy flavors of HTML in some of his
> I have a conformance checker tool which needs to autodetect the flavor
> of HTML in use so it can determine which particular set of conformance
> tests to apply to the document.
Do I guess correctly that this will be part of a text editor for Mac?
> (We may be talking about a single
> document, or traversing a directory tree and processing all documents
> in the tree. In either case, the document type should be auto
Wouldn't that kind of approach fail to detect that a set of documents
isn't fully HTML5-compliant if a document in the set is autodetected
as non-HTML5 and passes checks as whatever it was detected as?
> For HTML syntax, the shortened form of the doctype "<!DOCTYPE HTML>" is
> required. This is sufficiently different from all previous doctypes
> that it can be mapped to HTML 5. But since there is no version
> information included in the doctype, what happens when the successor
> to HTML 5 comes out?
When the successor of HTML5 comes out, authors are supposed to create
content according to the requirements of the successor and no longer
according to HTML5.
This assumes, of course, that whoever defines the successor of HTML5
defines the successor reasonably, so that conforming HTML5 documents
remain conforming and mean the same thing according to the successor.
The obvious problem with that assumption is that so far definers of
HTML flavors have had a tendency to deprecate or obsolete features. We
can hope that the definers of the successors of HTML5 don't seek to
deprecate or obsolete anything unless the deprecated or obsoleted bit
is so harmful that telling every author that their documents no longer
conform is of paramount importance.
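For a tool that still wants to recognize the short doctype mentioned in the question, a minimal sketch in Python might look like the following. The function name is made up for illustration, and the sketch assumes the doctype is the first non-whitespace thing in the file (leading comments are not handled). Since the doctype carries no version, a match can only mean "check against the current HTML spec", not "HTML5 specifically".

```python
import re

# Matches only the short, versionless doctype, case-insensitively.
# Assumption: nothing but whitespace precedes the doctype; comments
# before it are not handled in this sketch.
_SHORT_DOCTYPE = re.compile(r"\s*<!DOCTYPE\s+html\s*>", re.IGNORECASE)


def has_short_doctype(text: str) -> bool:
    """Return True if text starts with the short doctype (hypothetical helper)."""
    # Drop a leading byte order mark, if any, before matching.
    return _SHORT_DOCTYPE.match(text.lstrip("\ufeff")) is not None
```

A legacy doctype with a public identifier does not match, so such a document would have to be handled by whatever policy the tool picks for non-matching input.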
> For XHTML syntax, the doctype is to be omitted. In this situation, how
> should I autodetect that we are using XHTML 5 as opposed to some other
By design, you shouldn't. Validator.nu defaults to XHTML5 + SVG 1.1 +
MathML 2.0 for application/xhtml+xml. I suggest doing the same
for .xhtml (assuming that the tool in question is a text editor
operating on local files): defaulting to the latest Web-relevant
compound document format combination supported by the checker.
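The defaulting suggested above could be sketched roughly like this in Python. The preset label, the table, and the function name are hypothetical and not taken from Validator.nu or any real checker; only the .xhtml default comes from the suggestion itself. An explicitly configured preset wins over the default, so that the author's choice of flavor is never second-guessed by detection.

```python
from pathlib import Path

# Hypothetical preset label; the name is invented for this sketch.
XHTML5_COMPOUND = "XHTML5 + SVG 1.1 + MathML 2.0"

# Only the .xhtml default is taken from the suggestion above.
DEFAULT_PRESETS = {".xhtml": XHTML5_COMPOUND}


def default_preset(path, configured=None):
    """Pick a checker preset for a local file (hypothetical helper).

    An explicit configuration always wins; otherwise fall back to the
    extension-based default, or None if the extension is unknown.
    """
    if configured is not None:
        return configured
    return DEFAULT_PRESETS.get(Path(path).suffix.lower())
```

For example, `default_preset("site/index.xhtml")` yields the compound preset, while a file with an unlisted extension yields `None` and leaves the decision to the user.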
hsivonen at iki.fi