[whatwg] Don't change the semantics of elements
zcorpan at gmail.com
Fri May 11 05:54:55 PDT 2007
In response to
I contacted Tommy Olsson to discuss the issue further, and we agreed to
forward the discussion to the list. I've translated it from Swedish, so any
grammatical errors are my fault. :-)
It seems like the premises of the working group are something completely
different from what I personally would have wished for further development
of one of the most important standards on the web. That is yet another
reason I don't want to be involved in the game. :)
> In the article you write:
> Instead, it looks as if they are going to redefine the semantics of
> existing element types so that old-school documents from the Bad Old
> Days will be conforming to the new specification!
> Hmm. It's probably more on a case-by-case basis.
One single such case is bad enough, in my opinion. The W3C has already done
this, and we don't quite know what the result will be. They changed the
semantics of DL from definition list to some kind of generic value pair
list. For instance, they say that DL can be used for marking up dialogues.
How will that affect an application that has relied on a DL being a
definition list? One example is the DEFINE: feature in Google.
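To make the two readings concrete, here is a minimal sketch. The terms,
speaker names and sentences are made up for illustration. The first list
uses DL as a definition list; the second uses the same element types for a
dialogue, in the way the HTML 4 specification suggests.

```html
<!-- DL as a definition list: each DT is a term, each DD defines it -->
<dl>
  <dt>Paragraph</dt>
  <dd>A distinct section of text dealing with a single idea.</dd>
</dl>

<!-- DL as a dialogue: each DT names a speaker, each DD holds the words -->
<dl>
  <dt>Anna</dt>
  <dd>Have you read the specification?</dd>
  <dt>Bertil</dt>
  <dd>Not yet.</dd>
</dl>
```

An application that treats every DT/DD pair as a term and its definition
would presumably conclude from the second list that "Anna" is defined as
"Have you read the specification?".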
In my opinion it would have been considerably better if they had let DL
continue to be a definition list and instead added a new element type for
value pair lists. Although I suspect that Microsoft was holding things back
in that respect, by not wanting to put any work into development of IE.
(This was before Firefox's successes forced them to do so anyway.)
> When it comes to which semantics <p> should have, you first need to ask
> the question: who will benefit from the definition of <p>? The one who
> authors HTML by hand? The one who implements a WYSIWYG editor? The
> above-mentioned analysis applications? Browser manufacturers? Several/all
> of them?
This is where I see the line between my point of view and the working
group's. You look at the benefit of each specific definition, which I
think means that you miss the forest for all the trees.
My standpoint comes from having learned HTML at the time when the specs
were at cern.ch. That was before the W3C was founded and before HTML got
any version number. HTML was then a semantic markup language that was very
biased towards scientific documents, for natural reasons.
It became natural for me to think of HTML elements from a semantic point
of view. The element type has nothing to do with presentation, but shall
only mark up what things are. Unfortunately the range of semantic element
types is very limited, but at least we can mark up headings, paragraphs,
lists and tables.
The web's development during the second half of the 90's went in a totally
different direction, when designers and happy amateurs took over. I
thought it went off the track, since HTML for me wasn't a presentational
language. The W3C agreed, and eventually released CSS, but the damage was
already done.
To me the question of who "benefits" from the definition of <p> is
irrelevant. The definition already exists and is unambiguous. A <p> tag
shall mark up a textual paragraph, and nothing else. Then of course there
are gray areas: is a byline a special case of a paragraph, or something
else?
We look at the overall picture from completely different perspectives, and
I don't think we will reach a common vision. Your outlook is probably shared
by a vast majority; 99.999% of those who create web pages have never even
read the HTML4 specification, after all, but see HTML from a presentational
aspect. For them <p> is probably just a <div> with predefined margins,
just like the HTML5 specification seems to suggest.
> Browsers can't do much of interest with <p>, regardless of what the spec
> says that a <p> represents.
But the web isn't just about browsers. The web is (or should be, anyway)
about publishing information. One way to access this information is to
present the document in a browser, but it's far from the only conceivable
manner. If you think ahead a bit and have some imagination, you can
probably come up with many interesting areas of use for information on the
web, presuming that it is marked up in a sensible way.
For the semantic web, that the W3C is talking about, is something I find
> Analysing applications that operate on the entire web without prior
> agreement with the producer cannot rely on <p> == paragraph, because
> the web doesn't look like that and we can't change it. Regardless of what
> the spec says such apps will thus have to implement heuristics in order
> to decide what is a paragraph. (If there is a prior agreement with the
> producer it still doesn't matter what the spec says.)
It depends. An analysing application that tries to make some sort of sense
of today's tag soup has a Sisyphean task ahead of it. But an analysis
application that expects semantic correctness would, if it became popular,
be able to affect things in the right direction. Today's SEO trend has to
some degree led to a better understanding of semantics, e.g. by spiders
rewarding correctly marked up headings over tag soup with FONT and B
elements.
> A WYSIWYG editor probably has a hard time knowing whether what the user
> writes is a paragraph or not.
Yes, I have so far not seen one WYSIWYG editor that facilitates semantic
correctness. I also can't imagine what such a user interface would look
like. But surely there should be wiser minds in this world that can come
up with something?
> From that point of view it doesn't really matter how <p> is defined in
> the spec -- it doesn't change reality
No, I think it matters a lot. For those who don't read the spec (i.e.
99.999%) it obviously has no significance at all, but there has to be an
unambiguous semantic definition for each element type for the small
minority who actually want to do things right.
> Then the question is what harm is done if <p> is used for more things
> than just paragraphs. Who is harmed by markup such as
> <form><p><label>Search: <input name="q"></label></p></form>
The one who has read the earlier HTML specifications and thinks that <p>
marks up a textual paragraph. Obviously not the one who looks at the
result in a graphical browser, but maybe the one who uses a completely
Sure, you can drive in screws with a hammer. No SWAT team of murderous
carpenters will come and drag you off to prison for that. But those with a
bit of pride in their profession still use a chisel or a screwdriver.
> ...? Why is
> <form><div><label>Search: <input name="q"></label></div></form>
It's only marginally better, by using a semantically neutral container
instead of abusing <p>. The correct thing is naturally to use a <fieldset>.
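As a minimal sketch, the <fieldset> variant of the search form quoted above
might look like this. The <legend> and its text are an assumption added
here, not part of the quoted markup; as far as I know, HTML 4.01 expects a
<legend> inside each <fieldset>.

```html
<form>
  <!-- fieldset groups the related form controls -->
  <fieldset>
    <!-- legend text is hypothetical, chosen only for illustration -->
    <legend>Site search</legend>
    <label>Search: <input name="q"></label>
  </fieldset>
</form>
```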
This matter of semantic meaning and correct markup is hard to convey. I
notice that daily, both at my job and on forums such as SitePoint. The
visual outlook ("the most important thing is that it looks good")
completely dominates over the structural ("it shall be correct too").
I don't imagine that the world will get a collective aha experience and
that HTML in the future will get used the way it was intended. But it
doesn't stop me from at least preaching now and then for those who are
interested. I can't save the web from the tag soup march, but I might be
able to save a handful of people from getting stuck.
Let me just clarify that this isn't about me being conservative and
opposed to change. I don't grumble that "it was better before" or sniff at
"the youth of today". Possibly you can draw similarities to authors of
letters to the editor who sign their works with "friend of order". :)
It's simply that I happen to think that the original idea with HTML is a
good one. Let HTML mark up structure and semantics, and leave all
presentation to CSS. To further develop HTML, add more semantic elements
that experience shows we need, such as <nl> (navigation list) and an
element type for value pair lists.