[whatwg] Semantic styling languages in the guise of HTML attributes.

Matthew Paul Thomas mpt at myrealbox.com
Mon Dec 25 04:50:31 PST 2006

On Dec 22, 2006, at 3:23 AM, Benjamin Hawkes-Lewis wrote:
> Henri Sivonen wrote:
> ...
>> Also, it seems to me that the usefulness of non-heuristic machine 
>> consumption of semantic roles of things like dialogs, names of 
>> vessels, biological taxonomical names, quotations, etc. has been 
>> vastly exaggerated.
> I'm not entirely sure what "non-heuristic machine consumption" is,

An example of non-heuristic machine consumption is where Google 
Glossary thinks: "In an HTML 3.2 or earlier document containing the 
code '<dl><dt>foo</dt> <dd>bar</dd></dl>', 'bar' is a definition of 
'foo'". (It probably thinks the same about HTML 4 documents, too, which 
is applying a small "ignore that nonsense about dialogues" heuristic.)
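
To make the non-heuristic case concrete, here is a minimal sketch in Python. It is not how Google Glossary actually works (I have no knowledge of its internals); the class and function names are hypothetical, chosen only for illustration. The point is that the consumer trusts the markup completely: every <dd> is taken to define the most recent <dt>, with no guessing involved.

```python
from html.parser import HTMLParser


class DefinitionListParser(HTMLParser):
    """Collect (term, definition) pairs by trusting <dt>/<dd> markup as-is."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._current = None  # "dt", "dd", or None when outside both
        self._term = ""
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            # A new <dt>/<dd> implicitly closes the previous one,
            # since older HTML allows the end tags to be omitted.
            self._flush()
            self._current = tag

    def handle_endtag(self, tag):
        if tag in ("dt", "dd", "dl"):
            self._flush()

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def _flush(self):
        text = "".join(self._buffer).strip()
        if self._current == "dt":
            self._term = text
        elif self._current == "dd" and self._term:
            self.pairs.append((self._term, text))
        self._buffer = []
        self._current = None


def extract_definitions(html):
    """Hypothetical helper: return every (term, definition) pair in the markup."""
    parser = DefinitionListParser()
    parser.feed(html)
    parser.close()
    return parser.pairs


print(extract_definitions("<dl><dt>foo</dt> <dd>bar</dd></dl>"))
```

If an author uses <dl> for something other than definitions, this code reports wrong pairs with full confidence.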

An example of heuristic machine consumption is where Google Glossary 
thinks: "In an HTML document containing the code '<p><b>foo:</b> 
bar</p>', 'bar' is probably a definition of 'foo', especially if the 
page has several consecutive paragraphs with that structure and 
different bold text."
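
The heuristic case can be sketched the same way. Again, this is an illustration under my own assumptions, not Google's method: the regular expressions, the function name guess_definitions, and the min_run threshold are all hypothetical. The sketch accepts a '<p><b>term:</b> text</p>' paragraph as a definition only when it sits in a run of such paragraphs with enough distinct bold terms. It assumes bare <p> tags without attributes, and "consecutive" here means consecutive among the <p> elements found.

```python
import re

# Assumed patterns for this sketch: bare <p> tags, bold lead-in ending in a colon.
PARAGRAPH = re.compile(r"<p>(.*?)</p>", re.IGNORECASE | re.DOTALL)
BOLD_LEAD = re.compile(r"\s*<b>([^<:]+):</b>\s*(.+?)\s*", re.IGNORECASE | re.DOTALL)


def guess_definitions(html, min_run=2):
    """Guess (term, definition) pairs from runs of '<p><b>term:</b> text</p>'."""
    guesses, run = [], []
    for body in PARAGRAPH.findall(html) + [""]:  # trailing "" flushes the last run
        match = BOLD_LEAD.fullmatch(body)
        if match:
            run.append((match.group(1).strip(), match.group(2).strip()))
            continue
        # A non-matching paragraph ends the run. Keep the run only if it
        # contains at least min_run different bold terms.
        if len({term.lower() for term, _ in run}) >= min_run:
            guesses.extend(run)
        run = []
    return guesses


page = "<p><b>foo:</b> bar</p><p><b>baz:</b> qux</p><p>Some other text.</p>"
print(guess_definitions(page))
```

A lone bold-prefixed paragraph is rejected by default, which is exactly the kind of judgement call that makes this approach right most of the time and wrong some of the time.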

Non-heuristic machine consumption fails when semantic elements are 
abused, and becomes impractical when elements have multiple popular 
meanings (examples of the latter include <dl> in HTML 4, and <p> in 
HTML 5). Heuristic machine consumption fails occasionally by the very 
nature of heuristics (examples currently include
<http://www.google.com/search?q=define:author> and

Matthew Paul Thomas
