[whatwg] Semantic styling languages in the guise of HTML attributes.
Matthew Paul Thomas
mpt at myrealbox.com
Mon Dec 25 04:50:31 PST 2006
On Dec 22, 2006, at 3:23 AM, Benjamin Hawkes-Lewis wrote:
>
> Henri Sivonen wrote:
> ...
>> Also, it seems to me that the usefulness of non-heuristic machine
>> consumption of semantic roles of things like dialogs, names of
>> vessels, biological taxonomical names, quotations, etc. has been
>> vastly exaggerated.
>
> I'm not entirely sure what "non-heuristic machine consumption" is,

An example of non-heuristic machine consumption is where Google
Glossary thinks: "In an HTML 3.2 or earlier document containing the
code '<dl><dt>foo</dt> <dd>bar</dd></dl>', 'bar' is a definition of
'foo'". (It probably thinks the same about HTML 4 documents, too, which
is applying a small "ignore that nonsense about dialogues" heuristic.)

An example of heuristic machine consumption is where Google Glossary
thinks: "In an HTML document containing the code '<p><b>foo:</b>
bar</p>', 'bar' is probably a definition of 'foo', especially if the
page has several consecutive paragraphs with that structure and
different bold text."
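
To make the contrast concrete, here is a minimal Python sketch of both
approaches. It is not Google Glossary's actual code, which is not public.
The names DefinitionListParser and guess_definitions, and the threshold
of three paragraphs standing in for "several", are invented for
illustration.

```python
import re
from html.parser import HTMLParser


class DefinitionListParser(HTMLParser):
    """Non-heuristic: trust <dt>/<dd> markup to mean term/definition."""

    def __init__(self):
        super().__init__()
        self.pairs = []    # (term, definition) tuples found so far
        self._tag = None   # "dt" or "dd" while inside one, else None
        self._term = ""
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self._flush()  # end tags are optional, so a new tag closes the old
            self._tag = tag

    def handle_endtag(self, tag):
        if tag in ("dt", "dd", "dl"):
            self._flush()

    def handle_data(self, data):
        if self._tag:
            self._text.append(data)

    def _flush(self):
        text = "".join(self._text).strip()
        if self._tag == "dt":
            self._term = text
        elif self._tag == "dd" and self._term:
            self.pairs.append((self._term, text))
        self._tag, self._text = None, []


# Hypothetical patterns for '<p><b>term:</b> text</p>' paragraphs.
PARA = r"<p>\s*<b>[^<:]+:</b>[^<]*</p>"
RUN = re.compile(rf"(?:{PARA}\s*)+", re.IGNORECASE)
PAIR = re.compile(r"<b>([^<:]+):</b>\s*([^<]*?)\s*</p>", re.IGNORECASE)


def guess_definitions(html, min_run=3):
    """Heuristic: bold-label paragraphs are *probably* definitions.

    Returns (term, text, likely) tuples. 'likely' is True only when the
    paragraph sits in a consecutive run with at least min_run different
    bold labels (min_run=3 is an assumed reading of "several").
    """
    guesses = []
    for run in RUN.finditer(html):
        pairs = PAIR.findall(run.group(0))
        distinct = {term.strip().lower() for term, _ in pairs}
        likely = len(distinct) >= min_run
        guesses += [(term.strip(), text, likely) for term, text in pairs]
    return guesses
```

The first parser never guesses: it reports a pair only when the markup
says so. The second will also pick up something like
'<p><b>Note:</b> save your work</p>', which is the kind of miss that
heuristics make.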

Non-heuristic machine consumption fails when semantic elements are
abused, and becomes impractical when elements have multiple popular
meanings (examples of the latter include <dl> in HTML 4, and <p> in
HTML 5). Heuristic machine consumption fails occasionally by the very
nature of heuristics (examples currently include
<http://www.google.com/search?q=define:author> and
<http://www.google.com/search?q=define:editor>).
--
Matthew Paul Thomas
http://mpt.net.nz/