[html5] Identifying HTML 5 documents? (vs. alternate flavors)
Jim Correia
jim.correia at pobox.com
Fri Feb 8 06:33:08 PST 2008
On Feb 8, 2008, at 4:44 AM, Henri Sivonen wrote:
> On Feb 4, 2008, at 18:39, Jim Correia wrote:
>> On Feb 4, 2008, at 11:24 AM, Henri Sivonen wrote:
>>> On Feb 4, 2008, at 17:28, Jim Correia wrote:
>>>
>>>> I know there has been some discussion about this on the forum. But
>>>> after having read through the draft spec and the FAQ, I'm still a
>>>> little unclear about how I can auto-detect that a document is using
>>>> HTML 5.
>>>
>>> The short answer is that HTML5 by design tries to discourage you
>>> from trying to do that.
>>
>> I can understand that discouraging user-agents from doing this
>> might be a good thing. At the same time, it appears to make life
>> more difficult for those of us who produce authoring tools which
>> must support legacy formats alongside HTML 5.
>
> If the spec had a centrally-prescribed way for authoring tools to do
> spec versioning, people would be tempted to suggest all sorts of
> version-based conditional behavior in browsers.
People may suggest it anyway. And some browser vendors may even oblige
them. Meanwhile, without a sanctioned way to clearly identify HTML 5,
life has been made difficult for those of us who want to do the right
thing, all to avoid some hypothetical wrongness on someone else's
part.
(If browser vendors want a version identifier, there's nothing
stopping them from inventing one. Or several. It is not as if
proprietary browser-specific
> I suppose we could add a modeline attribute on the root element if
> its content were a non-standard tool-specific configuration
> identifier to prevent general consuming apps from performing mode
> switching on it.
>
> http://lists.w3.org/Archives/Public/public-html/2007JanMar/0433.html
Thanks for the pointer. In that message, for point 4, you wrote:
If HTML6 is a superset of HTML5, writing HTML5 and checking with an
HTML6 conformance checker won't be a problem. If HTML6 deprecates or
obsoletes parts of HTML5, then we won't want to make it too easy for
people to keep using the bad stuff without mentioning it to them, will
we?
My experience of shipping software to users and having to
support those users tells me this is going to be a problem. Suppose
they are using version 12 of my tool which does HTML 5 conformance
checking; checking their documents reports no errors. But they have
used elements or attributes which are deprecated in HTML6. They
upgrade to version 13 of my tool which supports HTML6, and now
checking those very same documents reports hundreds of errors. They
won't have read the release notes, or the documentation, or...
Instead, they'll write to my technical support address and complain
that the conformance checker is broken because yesterday there were no
errors in their documents and today there are hundreds.
(We did go through a painful period of this once in the past. Before I
took over this part of the product, the checker was quite inadequate,
but people used it in ignorant bliss. When we shipped an updated
checker that found and reported many conformance issues that needed
fixing, the reaction was that we broke the checker, not that the
documents had always been broken.)
If someone wants to keep checking against the definitions of HTML5 in
the era of HTML6, I think it is reasonable to put the burden of choosing a
different version from a pop-up menu in the conformance checker UI on
the person who wants to do legacy checking.
I agree that it is reasonable that a tool which supports HTML6
conformance checking should default to HTML6.
The issue of deprecated features, and of surprised users getting
errors in previously error-free documents, still stands.
Another usability issue is the use case of users who work on
multiple trees, and need to have mixed conformance checking (without
constantly reconfiguring the tool) until such time as they can move
their legacy HTML5 documents with deprecated elements over to the
HTML6 standard.
>>> Wouldn't that kind of approach fail to detect that a set of
>>> documents isn't fully HTML5-compliant if a document in the set is
>>> autodetected as non-HTML5 and passes checks as whatever it was
>>> detected as?
>>
>> I'm not sure I understand the question.
>
> Suppose I want to see if the .html files in a directory hierarchy
> are HTML5-compliant. If the documents can declare themselves as non-
> HTML5 and avoid being checked as HTML5, I get the wrong answer.
Now I see what you are getting at. I currently don't support that
operation, and it is not something I typically do. But I see your
point - one could check a tree full of documents against HTML 5, and
if they were compliant, post-process them to change the doctype (or
remove it for the XML serialization).
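
To make the concern concrete, here is a rough Python sketch of a tree
check that trusts each document's self-declaration. The names
guess_flavor and partition_tree and the doctype heuristic are
assumptions made up for illustration, not anything taken from the
draft spec or from an existing tool. It only shows that files
classified as something other than HTML 5 drop out of the HTML 5
report.

```python
import os
import re

# Naive heuristic (an assumption, not a spec rule): a bare "<!DOCTYPE html>"
# suggests HTML 5; any other doctype suggests a legacy flavor.
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+([^>]*)>", re.IGNORECASE)


def guess_flavor(text):
    """Classify a document as 'html5', 'legacy', or 'no-doctype'."""
    # Only look near the top of the file, where a doctype would appear.
    match = DOCTYPE_RE.search(text[:1024])
    if match is None:
        return "no-doctype"
    if match.group(1).strip().lower() == "html":
        return "html5"
    return "legacy"


def partition_tree(root):
    """Split the .html files under root into (checked, skipped) path lists.

    'checked' holds files that would be validated as HTML 5; 'skipped'
    holds files that declared themselves as something else.
    """
    checked, skipped = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".html"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                flavor = guess_flavor(handle.read())
            (checked if flavor == "html5" else skipped).append(path)
    return sorted(checked), sorted(skipped)
```

Anything that lands in the skipped list never gets an HTML 5 verdict,
which is the wrong answer described above when the question was
whether the whole tree is HTML 5-compliant.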
> If there are issues we don't foresee now but we see when the
> successor of HTML5 is being defined, we can make the successor have
> a distinguishing feature at that time.
After reading through the message you pointed to, as well as
others found via searching, it sounds as though we've been around this
block a time or two by now and that the spec authors are rather
inflexible about this point (and no new arguments have swayed them)?
I also posted this to the help mailing list, and after having done so
wondered if the specs or implementors list (which hasn't seen traffic
in a long time) might have been a more appropriate forum.
>> Again, this is a similar problem to HTML5. Without a heuristic that
>> says "XHTML syntax, no doctype, probably XHTML 5" it seems like
>> there isn't a good way to infer an author's intent when the
>> document lives in a tree of documents targeting various
>> specifications.
>
> Other XML editors solve this using an editor-specific PI.
I currently offer something similar to this for document fragments.
(It takes the form of a comment, not a PI, since it has to work in
HTML syntax as well as XHTML syntax.)
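
As a rough illustration of the comment-based approach, the Python
sketch below reads a conformance profile from a marker comment and
falls back to a default when none is present. The marker syntax
"conformance-profile:" and the function name profile_for are
hypothetical, invented for this example; they are not the format any
actual tool uses.

```python
import re

# Hypothetical marker syntax, e.g. <!-- conformance-profile: html5 -->.
# A comment is used rather than a PI so that it works in both HTML and
# XHTML syntax.
MARKER_RE = re.compile(r"<!--\s*conformance-profile:\s*([A-Za-z0-9.-]+)\s*-->")


def profile_for(text, default="html5"):
    """Return the profile named by the first marker comment, else default."""
    match = MARKER_RE.search(text)
    return match.group(1).lower() if match else default
```

A fragment that begins with such a comment would be checked against
the named profile, and an unmarked fragment would fall through to
whatever default the tool is configured with.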
The problem with this approach is that, if people are required to add
an editor specific PI to their document for any reason, there is
resistance or outright refusal. The reasons can be any one of
- Only a small number of us, as part of a larger team, use your tool.
- I'm not allowed to commit editor specific hacks to the repository.
- etc.
Jim