[whatwg] namespaces in html5
David Karger
karger at mit.edu
Mon Jul 18 07:22:42 PDT 2011
OK, per Ian's suggestion I'm starting a new thread on a problem that I'd
hoped html5 would solve for us. As far as I know the problem still
exists so I'm going to raise it here. I'm coming late to the discussion
so will surely retread old territory (for example,
http://www.pacificspirit.com/blog/2008/03/13/namespaces_in_html_readings);
my apologies for that.
I am one of the PIs on the SIMILE project (http://simile.mit.edu/) that
developed the Exhibit data visualization framework
(http://simile-widgets.org/exhibit). The goal of Exhibit is to make it
easy for non-programmers to embed interactive data visualizations in
their web pages. Our approach is to leverage the willingness of many
non-programmers to author html (a key contributor to the early growth of
the web). To do so, Exhibit extends the html vocabulary with attributes
that describe data, visualizations of that data, and interactions with
that data. For example, a tag of the form <div ex:role="view"
ex:viewclass="timeline"> embeds a simile timeline in the html document,
while <div ex:role="facet"> embeds a facet that can be used to filter
the data being viewed in the timeline. Exhibit offers a javascript
library that interprets these tags and implements the requested widgets
on the client side.
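
As a rough illustration of what "interprets these tags" could mean, here is a minimal sketch. It is not Exhibit's actual code: the function names readSettings and findWidgets are hypothetical, and elements are modeled as plain objects so that the example runs without a browser DOM.

```javascript
// Hypothetical sketch, not Exhibit's implementation. Elements are plain
// objects standing in for DOM nodes so the example runs outside a browser.
const PREFIX = "ex:";

// Collect the prefixed attributes of one element into a settings object,
// e.g. {"ex:role": "view"} becomes {role: "view"}.
function readSettings(element) {
  const settings = {};
  for (const [name, value] of Object.entries(element.attributes)) {
    if (name.startsWith(PREFIX)) {
      settings[name.slice(PREFIX.length)] = value;
    }
  }
  return settings;
}

// Return one widget description per element that declares a role.
function findWidgets(elements) {
  return elements
    .map(readSettings)
    .filter((settings) => settings.role !== undefined);
}

const page = [
  { tag: "div", attributes: { "ex:role": "view", "ex:viewclass": "timeline" } },
  { tag: "div", attributes: { "ex:role": "facet" } },
  { tag: "p", attributes: { class: "intro" } },
];

const widgets = findWidgets(page);
console.log(widgets);
```

The paragraph without ex: attributes is skipped, which is the property that matters here: the library reacts only to attributes carrying its own prefix.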
You will note that our special attributes use an "ex:" prefix. This
decision was taken in 2006, when it appeared that prefix-based
namespaces were in HTML's future. It addressed our concern that the new
attributes we defined should not collide with those defined by other
projects. Now that namespaces apparently will not be part of html5, we
are wondering how we can properly offer our extended html vocabulary.
In particular, it seems highly desirable for us to be able to write Exhibit
pages using html that will validate. Below I'll outline some of the
characteristics of our desired solution, while emphasizing that we'd be
happy to adopt _any_ solution with these characteristics, and are not
wedded to namespaces.
I first justify our approach of html vocabulary extension. A programmer
can argue that a better approach is to offer our javascript library with
a good api, and allow programmers to invoke our widgets programmatically
in script tags. This works fine for programmers, but excludes the large
population of users who are afraid of programming but are willing to
fiddle with html. These users were a potent force in the early days of
the web and we believe they continue to play an important role. They
may not even "know" html; the simplicity and regularity of the syntax
allow them to copy, paste, and even modify page elements they like
without fully understanding them. Specifying data interactions in the
more restricted html syntax instead of programmatic javascript also
opens up the possibility for more effective semantics; for example, it
is easier for a browser to offer an accessible version of a
data-filtering facet if it is explicitly named as a facet rather than
being arbitrary embedded javascript code.
If we accept the need for html language extensibility, there are several
potential approaches. One is html polyglot. Permitting a blended
html/xml representation, polyglot would allow us to extend the
vocabulary via xml namespaces. But polyglot fails to meet our need in
fatal ways. Polyglot restricts the html that can be used, for example
excluding the use of <noscript> tags. Such tags are essential when
using Exhibit, since we want to offer some information presentation for
the case when our visualization javascript is unable to execute. More
generally, polyglot appears to demand much more rigid fidelity to
precise html/xml syntax, for example demanding tbody and colgroup tags
where they are optional in html. This is something that the novice
"programmers" we are targeting are particularly bad at. One of the real
accomplishments of html has been the great efforts of the browser
developers to robustly handle invalid html. We want to continue to
benefit from that effort instead of having pages fail because xml
parsing is performed much more rigidly than html parsing.
Another approach would be to use the catchall html5 data- prefix for
attributes. We could certainly prefix all of our specialized attributes
with the data- prefix, which would make those attributes valid in
html. This solution is unsatisfactory for two reasons. The first is
that our attributes are not data attributes: we are not using
microformat-oriented data attributes; rather, we are using attributes
that describe visualizations, so data- seems a poor choice of prefix. The
second problem that concerns me is attribute collisions. If we use an
attribute like data-role="view", how long will it be before an exhibit
author runs into a situation where a different javascript library is
using the same data-role attribute for a different purpose, which would
make the two libraries incompatible with one another?
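
The collision concern can be made concrete with a small sketch. Everything in it is an assumption for illustration: "otherlib" is an invented second library, and the data-ex-role and data-other-role names are a hypothetical per-library convention, not something proposed in this message or defined by html5.

```javascript
// Hypothetical sketch: which libraries would react to a given element?
// Library names and attribute names are invented for illustration.
function claimedBy(attributes, libraries) {
  return libraries
    .filter((lib) => lib.attribute in attributes)
    .map((lib) => lib.name);
}

// Both libraries independently chose the same unprefixed attribute.
const shared = [
  { name: "exhibit", attribute: "data-role" },
  { name: "otherlib", attribute: "data-role" },
];

// Assumed convention: each library embeds its own name in the attribute.
const scoped = [
  { name: "exhibit", attribute: "data-ex-role" },
  { name: "otherlib", attribute: "data-other-role" },
];

const ambiguous = claimedBy({ "data-role": "view" }, shared);
const unambiguous = claimedBy({ "data-ex-role": "view" }, scoped);
console.log(ambiguous);
console.log(unambiguous);
```

With the shared attribute both libraries claim the element; with the scoped names only one does. Note that the scoped variant works purely by convention: nothing prevents another library from picking the same infix.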
In 2006, the predicted namespace prefixes seemed an obvious solution to
our problem: we would define a namespace for our Exhibit framework, and
our javascript would only pay attention to attributes from that
namespace. I have no specific loyalty to namespaces, but I am really
hopeful that html5 will offer us a solution that reflects the issues I
outlined above, namely:
* allow extension of the html5 vocabulary with attributes Exhibit will
use to anchor visualizations,
* such that the resulting html will validate,
* without requiring rigid obedience to the challenging html polyglot
syntax, which is beyond the capabilities of our target novice web authors
* and protecting us from a future in which collisions on choice of
attribute names make our library/vocabulary incompatible with others'
On 7/18/2011 8:46 AM, Ian Hickson wrote:
> On Mon, 18 Jul 2011, David Karger wrote:
>> I wish to submit a comment regarding the (non) use of namespaces in
>> html5. But I hope you might help me track down the relevant issue off
>> which to hang that comment. Some time ago I found a lengthy discussion
>> of whether html5 should use namespaces, with an over-simplified summary
>> being "we haven't seen any important use cases for them, so let's not
>> bother". I would like to respond to that discussion by proposing a use
>> case, but I cannot find it. Searching the bugzilla database has failed.
>> Would you happen to recall participating in this discussion and know
>> where it is?
> You can just post a new thread here.
>
> I recommend describing the problem you wish to address separately from
> your preferred solution. Also I recommend using a word other than
> "namespaces" to describe your preferred solution, as that word is usually
> used in the Web context to refer to some specific designs with known
> problems, and it is likely that you actually want something different.
>