[whatwg] HTML 5 vs. XHTML 2.0
Matthew Thomas
mpt at myrealbox.com
Sun Nov 14 01:13:46 PST 2004
On 13 Nov, 2004, at 12:52 PM, Laurens Holst wrote:
> ...
> Be that as it may, you're forgetting something - HTML is not just
> for the web, it is also a document markup language for many (which can
> be, and of course often is, used on and specifically aimed at the
> web). At my job we currently create the documentation files for our
> product with a transformation of our documents, which use a 'custom'
> ...
What Henri said. If long-term fidelity is important, HTML should be
something you convert to, not your native format.
<http://diveintomark.org/archives/2003/01/13/semantic_obsolescence>
> ...
> I don't think this is a spec just for 'the ignorant mass'. A spec
> aimed at them can hardly be taken seriously, because it will take a
> lot to make them learn.
There are more of them than there are of you and me, and we benefit
from their documents on the Web. (You might reminisce about the days
before GeoCities and Xanga and Time Cube, but those were also the days
before Google and eBay and Wikipedia.)
> ...
> Anyway, let me stress again that for 'HTML 5' I am highly in favour of
> adopting XHTML 2.0 with the unused HTML 4.01 tags marked 'deprecated'
> (this is an important difference from XHTML 2.0 which removes them
> altogether), and perhaps some additions. Because XHTML 2.0 is
> definitely a more serious markup language.
The Web is not, and since about 1995 has not been, a serious medium. It
is much more often used for selling books than for publishing them, for
simulating sex than for discussing it, and for posting opinions than
for posting facts. For the Web's pockets of seriousness you might use
XHTML 2.0, but XHTML 2.0 is rather primitive; why not use TEI P4
instead?
> And I'd say HTML 5 being compatible with XHTML 2.0 is a great merit
> for both.
Maybe, but backward compatibility is expressly a design goal of HTML 5
<http://www.whatwg.org/charter#back-compat>, while it is expressly not
a design goal of XHTML 2.0
<http://www.w3.org/TR/2004/WD-xhtml2-20040722/introduction.html#backCompat>.
Such divergent processes are unlikely to
produce the same result.
> I'm getting the impression that we are here discussing much that has
> already been gone through thoroughly in the XHTML 2.0 working group.
Probably, though for the compatibility reason given above, our
conclusions may often be different.
> For example the quotes thing - in XHTML there's no <q> anymore but
> there's <quote>, a choice very likely made because of the exact same
> concerns raised over here (being inconsistency between <q>
> functionality, which a new tag would solve). Or removing <acronym>,
> <big> and <small>. The accesskey functional choice they made sounds
> pretty decent, from what I hear here. <var> is used (just maybe not by
> you, but I have several times),
I use <var> whenever appropriate, which is about once a year, but I
recognize that it is unlikely ever to have any semantic usefulness
(because variable names aren't unique enough). I use <q> much more
often, and I will weep hot tears if/when it is abolished, but I
recognize it is a poorly-supported, backward-incompatibly-confusing
element, with hardly any semantic usefulness, and an uneasy
relationship with English punctuation (except in en-GB-hixie and
similar dialects).
> and the functionality of <cite> is greatly enhanced, making it a much
> more useful tool,
My list of deprecable items included cite=, not <cite>. cite= is mostly
useless for three reasons. First, it's invisible, so authors don't use
it, so it can't be relied on or aggregated usefully. Second, it expects
a URI, but the cited material isn't necessarily represented online
<http://lists.w3.org/Archives/Public/www-html/2003May/0214.html>.
Third, it doesn't relieve authors from having to provide text citations
before/after the quote as well (if they didn't, the text would be
nonsensical in hypertext-less media such as printouts or telephone
conversations).
> which can also be employed for data mining (I've seen a similar thing
> on dive into mark's blog once iirc).
> ...
You mean posts by citation
<http://diveintomark.org/archives/2002/12/27/pushing_the_envelope>. I
hope "Hixie said I was using [<cite>] correctly"
<http://diveintomark.org/archives/2003/01/19/influences> was an
over-broad interpretation of Ian's words, because (a) Ian has mentioned
"'clarifying' the definition of <cite>"
<http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2004-November/002329.html>,
and (b) while Mark's uses of <cite> matched the example
given in the HTML 4.01 spec
<http://www.w3.org/TR/REC-html40/struct/text.html#edef-CITE>, they did
not match the default presentation in all visual UAs, nor the resultant
use by most Web authors.
(Specifically, I think the most coherent and backward-compatible
"clarification" would be to restrict <cite> to titles of works, because
inviting authors to use it for names of people as suggested in the HTML
4.01 example would require authors to override <cite>'s italic-ness
frequently, making them more likely to abandon the element completely.)
--
Matthew Thomas
http://mpt.net.nz/