[whatwg] Presentational safety valves

Matthew Paul Thomas mpt at myrealbox.com
Tue Jan 2 13:36:48 PST 2007

On Dec 28, 2006, at 1:58 PM, James Graham wrote:
> Mike Schinkel wrote:
>> Matthew Paul Thomas wrote:
> ...
>>> Non-heuristic machine consumption fails when semantic elements are 
>>> abused, and becomes impractical when elements have multiple popular 
>>> meanings (examples of the latter include <dl> in HTML 4, and <p> in 
>>> HTML 5). Heuristic machine consumption fails occasionally by the 
>>> very nature of heuristics (examples currently include 
>>> <http://www.google.com/search?q=define:author> and
>>> <http://www.google.com/search?q=define:editor>.)
>> The origin of this thread was my request for adding attributes to all
>> elements to support microformat-like semantic markup. Based on the 
>> context of your reply, it seems you are agreeing with Matthew Raymond 
>> in his assertion that using microformat-like semantic markup is A Bad 
>> Thing(tm). Am I understanding your position correctly? (If I'm not, 
>> please forgive me.)
> Actually, IMHO mpt's point is far broader and consequentially more 
> important than the confines of the original thread.

Broader, yes (and I should have changed the Subject). I don't know 
about more important, because I have no experience in "microformat-like 
markup", and I have no idea how important it will be. So I wasn't 
commenting on it at all (though Matthew Raymond's arguments seem 

> The point, as I understand it, is that machine analysis of "semantic" 
> markup fails if the markup construct is (ab)used in so many different 
> ways that the interpretation of any particular fragment is no longer 
> unambiguous. This is a sort of "heat[1] death" of the original 
> semantics; as the use of an element becomes increasingly disordered 
> (i.e. higher entropy), it becomes impossible to extract any useful 
> information from the use of that element. This is critical in the 
> proper design of semantic markup languages because one wishes to stave 
> off the heat death as long as possible so that, as far as possible, 
> UAs can perform useful functions based on the information in the 
> markup (e.g. render it to a medium for which the content was not 
> explicitly designed). Obviously I don't know how to achieve this but 
> there are a few things to consider:
> * Have enough elements.
> ...
> * Don't have too many elements:
> ...
> * Make the semantics of elements well defined:
> ...
> * Have some "high entropy" elements.
> ...
> * Allow easy extensions.
> ...

I think this is exactly right. Another point I would add is "implement 
the semantic benefit early and often". The earlier and more widely 
software is distributed that takes advantage of the semantics, the more 
easily people can see whether they are using semantic markup 
appropriately. I hinted at this earlier when I said that whether 
<section> becomes a semantic element "will depend on who is faster: UA 
vendors distributing software that prominently takes advantage of the 
structure <section> is supposed to provide, or eager tech Weblog 
authors misguidedly replacing all the occurrences of <div> with 
<section> in their templates in an attempt to be 'more semantic'."

The "Don't have too many elements" guideline bears on Joe Clark's 
complaint that "'HTML5' replicates HTML's obsession with 
computer-science and math elements" 
<http://blog.fawny.org/2006/10/28/tbl-html>. It is true that HTML's few 
semantic elements are biased toward computer science (but not math), 
but that's because computer-science people are those most likely to 
bother with semantic markup at all (Joe being a notable exception). And 
adding representative elements from other fields of endeavor would 
likely result in too many elements overall.

> This post was brought to you by the society for dodgy physical 
> analogies concocted in the middle of the night.
> ...

As another analogy, in a recent message to Ian I referred to such 
presentational elements as "safety valves".

Whenever someone uses <div>, don't say "alas, that's a hole in HTML"; 
say "hooray, that's someone who isn't misusing <blockquote>".
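
To put rough numbers on the entropy analogy, here is a minimal sketch in Python. The usage counts are invented purely for illustration, and usage_entropy is a hypothetical helper, not anything taken from a spec or a real survey. It treats the purposes authors put an element to as a probability distribution and computes its Shannon entropy in bits: 0 means every use has the same meaning, and higher values mean a consumer can infer less from seeing the element.

```python
import math
from collections import Counter


def usage_entropy(uses):
    """Shannon entropy, in bits, of the purposes an element is used for."""
    counts = Counter(uses)
    total = sum(counts.values())
    # Sum p * log2(1/p) over each distinct purpose.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Made-up data: what authors meant each time they used an element.
# Without a neutral element, indentation-seekers reach for <blockquote>.
blockquote_without_div = ["quotation"] * 50 + ["indentation"] * 50

# With <div> available as a safety valve, they use that instead.
blockquote_with_div = ["quotation"] * 50
div_uses = ["indentation"] * 50

print(usage_entropy(blockquote_without_div))  # 1.0 bit: meaning is a coin toss
print(usage_entropy(blockquote_with_div))     # 0.0 bits: always a quotation
```

In this toy model the safety valve absorbs the presentational uses, so <blockquote> keeps a single meaning. Real usage would of course be messier than two tidy categories.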

Matthew Paul Thomas
