[whatwg] namespaces in html5

David Karger karger at mit.edu
Mon Jul 18 07:22:42 PDT 2011

OK, per Ian's suggestion I'm starting a new thread on a problem that I'd 
hoped html5 would solve for us.  As far as I know the problem still 
exists so I'm going to raise it here.  I'm coming late to the discussion 
so will surely retread old territory (for example, 
my apologies for that.

I am one of the PIs on the SIMILE project (http://simile.mit.edu/) that 
developed the Exhibit data visualization framework 
(http://simile-widgets.org/exhibit).  The goal of Exhibit is to make it 
easy for non-programmers to embed interactive data visualizations in 
their web pages.  Our approach is to leverage the willingness of many 
non-programmers to author html (a key contributor to the early growth of 
the web).  To do so, Exhibit extends the html vocabulary with attributes 
that describe data, visualizations of that data, and interactions with 
that data.  For example, a tag of the form <div ex:role="view" 
ex:viewclass="timeline"> embeds a simile timeline in the html document, 
while <div ex:role="facet"> embeds a facet that can be used to filter 
the data being viewed in the timeline.  Exhibit offers a javascript 
library that interprets these tags and implements the requested widgets 
on the client side.

You will note that our special attributes use an "ex:" prefix.  This 
decision was taken in 2006, when it appeared that prefix-based 
namespaces were in HTML's future.  It addressed our concern that the new 
attributes we defined should not collide with those defined by other 
projects.  Now that namespaces apparently will not be part of html5, we 
are wondering how we can properly offer our extended html vocabulary.  
In particular, seems highly desirable for us to be able to write Exhibit 
pages using html that will validate.  Below I'll outline some of the 
characteristics of our desired solution, while emphasizing that we'd be 
happy to adopt _any_ solution with these characteristics, and are not 
wedded to namespaces.

I first justify our approach of html vocabulary extension.  A programmer 
can argue that a better approach is to offer our javascript library with 
a good api, and allow programmers to invoke our widgets programmatically 
in script tags.  This works fine for programmers, but excludes the large 
population of users who are afraid of programming but are willing to 
fiddle with html.  These users were a potent force in the early days of 
the web and we believe they continue to play an important role.  They 
may not even "know" html; the simplicity and regularity of the syntax 
allows them to copy, paste, and even modify page elements they like 
without fully understanding them.  Specifying data interactions in the 
more restricted html syntax instead of programmatic javascript also 
opens up the possibility for more effective semantics; for example, it 
is easier for a browser to offer an accessible version of a 
data-filtering facet if it is explicitly named as a facet rather than 
being arbitrary embedded javascript code.

If we accept the need for html language extensibility, there are several 
potential approaches.  One is html polyglot.  Permitting a blended 
html/xml representation, polyglot would allow us to extend the 
vocabulary via xml namespaces.  But polyglot fails to meet our need in 
fatal ways.  Polyglot restricts the html that can be used, for example 
excluding the use of <noscript> tags.  Such tags are essential when 
using Exhibit, since we want to offer some information presentation for 
the case when our visualization javascript is unable to execute.  More 
generally, polyglot appears to demand much more rigid fidelity to 
precise html/xml syntax, for example demanding tbody and colgroup tags 
where they are optional in html.  This is something that the novice 
"programmers" we are targeting are particularly bad at.  One of the real 
accomplishments of html has been the great efforts of the browser 
developers to robustly handle invalid html.  We want to continue to 
benefit from that effort instead of having pages fail because xml 
parsing is performed much more rigidly than html parsing.

Another approach would be to use the catchall html5 data- prefix for 
attributes.  We could certainly prefix all of our specialized attributes 
with the data- prefix, which would turn those attributes valid for 
html.  This solution is unsatisfactory for two reasons.  The first is 
that our attributes are not data attributes----we are not using 
microformat-oriented data attributes; rather, we are using attributes 
that describe visualizations.  data- seems a poor choice of prefix.  The 
second problem that concerns me is attribute collisions.  If we use an 
attribute like data-role="view", how long will it be before an exhibit 
author runs into a situation where a different javascript library is 
using the same data-role attribute for a different purpose, which would 
make the two libraries incompatible with one another?

In 2006, the predicted namespace prefixes seemed an obvious solution to 
our problem: we would define a namespace for our Exhibit framework, and 
our javascript would only pay attention to attributes from that 
namespace.  I have no specific loyalty to namespaces, but I am really 
hopeful that html5 will offer us a solution that reflects the issues I 
outlined above, namely:
* allow extension of them html5 vocabulary with attributes Exhibit will 
use to anchor visualizations,
* such that the resulting html will validate,
* without requiring rigid obedience to the challenging html polyglot 
syntax, which is beyond the capabilities of our target novice web authors
* and protecting us from a future in which collisions on choice of 
attribute names make our library/vocabulary incompatible with others'

On 7/18/2011 8:46 AM, Ian Hickson wrote:
> On Mon, 18 Jul 2011, David Karger wrote:
>>    I wish to submit a comment regarding the (non) use of namespaces in
>> html5. But I hope you might help me track down the relevant issue off
>> which to hang that comment.  Some time ago I found a lengthy discussion
>> of whether html5 should use namespaces, with an over-simplified summary
>> being "we haven't seen any important use cases for them, so let's not
>> bother".  I would like to respond to that discussion by proposing a use
>> case, but I cannot find it. Searching the bugzilla database has failed.
>> Would you happen to recall participating in this discussion and know
>> where it is?
> You can just post a new thread here.
> I recommend describing the problem you wish to address separately from
> your preferred solution. Also I recommend using a word other than
> "namespaces" to describe your preferred solution, as that word is usually
> used in the Web context to refer to some specific designs with known
> problems, and it is likely that you actually want something different.

More information about the whatwg mailing list