[whatwg] Semantic styling languages in the guise of HTML attributes.

Mike Schinkel mikeschinkel at gmail.com
Wed Dec 27 19:58:04 PST 2006

James Graham wrote:
> Actually, IMHO mpt's point is far broader and consequentially 
> more important than the confines of the original thread. The 
> point, as I understand it, is that machine analysis of 
> "semantic" markup fails if the markup construct is (ab)used 
> in so many different ways that the interpretation of any 
> particular fragment is no longer unambiguous. This is a sort 
> of "heat[1] death" of the original semantics...

It's ironic that you use the term "entropy" here.[1]  Anyway, although in
general I agree with you, you speak in generalities so it is hard to either
concur with or disprove your assertions.

> as the use of 
> an element becomes increasingly disordered (i.e. higher 
> entropy), it becomes impossible to extract any useful 
> information from the use of that element. 

So I'd like to see some specific examples of how you would see things evolve
toward the "inevitable" impossibility.
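
To make the question a bit more concrete, here is a minimal sketch of one way
"entropy" might be quantified for an element: Shannon entropy over the
distribution of meanings authors attach to it. This is only my assumption
about what such a measure could look like, not something from the original
post; the function name and the sample data are hypothetical.

```python
import math
from collections import Counter

def usage_entropy(observed_uses):
    """Shannon entropy, in bits, of the distribution of observed uses."""
    counts = Counter(observed_uses)
    total = sum(counts.values())
    # Sum p * log2(1/p) over each distinct use of the element.
    return sum((n / total) * math.log2(total / n) for n in counts.values())

# Hypothetical samples: what authors meant each time they used an element.
ordered = ["emphasis"] * 8
disordered = ["emphasis", "italic styling", "book title", "foreign phrase"] * 2

print(usage_entropy(ordered))     # a single meaning: 0 bits
print(usage_entropy(disordered))  # four equally likely meanings: 2 bits
```

Under this measure an element always used one way scores zero, and the score
grows as the uses spread out, which is the sense in which a parser can no
longer tell what any particular fragment means.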

That said, one of my biggest qualms about "microformats" per se is how they
have defined their community process. I believe their process is likely to
generate more "entropy death" rather than less.  I proposed alternatives, but
they claimed those alternatives were counter to their vision.  Thus I plan to
use "microformat-like semantic markup" even though it wouldn't be
microformats proper. But that's an entirely different discussion that I'm
almost but not quite prepared to have.

So I think the real question is this: is it possible or impossible to define
a process for "microformat-like semantic markup" that can minimize the
chance of "entropy death"? To answer the question one should understand that
1.) even prior to the emergence of "microformat-like semantic markup" we've
had lots and lots of disorder anyway, and 2.) seeing the train speeding to
the end of its tracks doesn't mean we can stop the train even if we want to.
On point #2, I still assert it's more pragmatic and hence better to work to
minimize the damage than to scold the train for "stupidly" speeding up when
approaching the end of its tracks.

> * Have enough elements. If there are obvious holes that 
> people can't fill with existing elements used properly, they 
> will reuse existing elements in new ways so increasing their entropy.

Agreed.  That's what we get for pursuing pie-in-the-sky semantic web
exclusively while ignoring the evolution of HTML, for how long?  Also it's
what we get now for trying to put everything into HTML5 instead of planning
to rapidly release 5, 6, 7, etc.

> * Don't have too many elements: If there are too many 
> elements people won't understand them all and will reuse 
> existing elements in the "wrong" way, so increasing their entropy.

<Elements> or @attributes?  Anyway, I doubt there will be misuse if the
<Elements>/@attributes have clear semantics other than possibly people not
using them when they could have. Of course elements with names like <div>
and <span> (what were they thinking when they named those?!?) are the type I
believe you are referring to.

> * Make the semantics of elements well defined: Start the 
> elements in a "low entropy" i.e. highly ordered state. Make 
> it obvious how the element is intended to be used (and 
> restrict the valid uses to ones that can be discriminated by 
> machine) so that fewer people accidentally abuse it.

Interestingly, Dion Hinchcliffe had a great article[2] that argued the best
way to get a good outcome is to minimize structure at the beginning until
the patterns emerge, then layer structure on top of those patterns.  Think
of the wiki. At the beginning, it was "the simplest thing that would work."
Had someone architected it in advance of use, they would have ended up with
Lotus Notes!  :-) And although Notes was sold to lots of corporations,
MediaWiki is far more usable for average people than Notes; the latter
takes a salesman to convince IT and then an IT staff to deliver edicts that
"thou shalt use."

While his article focused on enterprise intranets, one could argue that
microformats simply might be the way of letting the world do the design for
the needs of future HTML, assuming the next version of HTML empowers people
enough to do so, and that we don't have to wait another decade before HTML6.

> * Have some "high entropy" elements. This is the 
> counterintuitive one. 
> The goal, remember, is to extract as much information as 
> possible from the semantically well-defined elements. 
> However, in many situations there will not be a relevant 
> element to use, the publishing setup will not be optimized 
> for selecting the correct semantic element (think WYSIWYG 
> editors), or the author will not be sufficiently familiar 
> with the language semantics to make a well-informed choice 
> about the right element to use. In this case providing (and 
> encouraging the use of!) a set of high entropy "bit-bucket" 
> elements that are semantically meaningless is  very 
> beneficial because they prevent the entropy increase 
> associated with the abuse of the semantic elements. The 
> increasing misuse of <em> as a "more semantic" <i> is an 
> example of what happens when this policy is not followed.

Hmm. I think in this last paragraph you made the point I just typed prior to
reading the last paragraph!

> * Allow easy extensions. Having an extension mechanism for 
> those who need more functionality is one way to stop the 
> abuse of existing elements. This has to be sufficiently easy 
> to use that it can be widely adopted but powerful enough 
> that it can replicate all the semantic features of the host language.

YES!!! (can you tell I agree?  :-)  I would actually love to be involved in
designing those as I've done some preliminary work on them, but only if
there was a very good chance we'd get to see extension elements added. I
can't afford to spin my wheels that much just to pontificate.
-Mike Schinkel

[1] I came to believe I had realized some aspects of software and
development several years ago, and I registered "softwareentropy.com" with
plans to blog about it until I had enough content for a book. That project
is still on the backburner, unfortunately. :-(

[2] http://blogs.zdnet.com/Hinchcliffe/?p=57
