[whatwg] Annotating structured data that HTML has no semantics for
Tim Tepaße
tim.tepasse at uni-dortmund.de
Mon May 11 20:06:14 PDT 2009
A cursory glance at the new section 5 raises two questions about
indirection:
> (Note the <meta>s in the last example -- since sometimes the information
> isn't visible, rather than requiring that people put it in and hide it
> with display:none, which has a rather poor accessibility story, I figured
> we could just allow <meta> anywhere, if it has a property="" attribute.)
That seems to be a solution optimised for entirely invisible metadata
but not for metadata which differs from the human-visible data.
Imagine as an example the simple act of marking up a number (and
ignoring what the number denotes). For human consumption a thousands
separator is often used; the type of separator differs by language,
locale and context. Just in my little world I regularly see the
point, the comma, the space, the thin space and sometimes the
apostrophe. Parsing all these representations of numbers would be a
chore. The textContent of the element <span
itemprop="com.example.price">€ 1&thinsp;000&thinsp;000,—</span>
is clearly unusable, demanding an additional invisible <meta
property="com.example.price" content="1000000">.
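To illustrate why that parsing is a chore, here is a rough Python sketch,
not a proposal. The helper name parse_grouped_integer and its decimal_mark
parameter are made up for this example; the point is only that the consumer
must already know the locale's decimal mark before the visible text can be
turned into a number.

```python
import re


def parse_grouped_integer(text: str, decimal_mark: str) -> int:
    """Hypothetical helper: extract the integer part of a human-formatted number.

    The caller has to supply the decimal mark, because the text alone
    does not say whether "." or "," groups thousands or starts the fraction.
    """
    # Keep only the part before the decimal mark, if one is present.
    integer_part = text.split(decimal_mark, 1)[0]
    # Drop everything that is not a digit: currency signs, spaces,
    # thin spaces, points, commas, apostrophes.
    digits = re.sub(r"\D", "", integer_part)
    if not digits:
        raise ValueError(f"no digits found in {text!r}")
    return int(digits)


# The price from the example above, with thin spaces (U+2009) as separators.
price_text = "\u20ac 1\u2009000\u2009000,\u2014"
price = parse_grouped_integer(price_text, decimal_mark=",")

# The same visible string means different numbers in different locales.
german_reading = parse_grouped_integer("1.000", decimal_mark=",")
english_reading = parse_grouped_integer("1.000", decimal_mark=".")
```

Here price comes out as 1000000, while "1.000" yields 1000 under the German
reading and 1 under the English one, so the text content alone is ambiguous.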
My irritation lies in the element proliferation, requiring one
element/attribute combination for machines and one element/text content
combination for humans. Of course, any sane author would arrange both
elements in a close relation, as parent/child or siblings, but there
would still be two different elements to maintain, leading to a higher
cognitive load. Not just for authors but also for programmers: a
fluctuating price would have to be updated on two different elements, and
tree-walking DOM scripts would have to take meta elements into account.
Furthermore it clashes with the familiar pattern of other elements in
HTML. A hyperlink is one element with a machine-readable attribute and
human-readable text content. A citation is one element with a
machine-readable reference and human-readable text content. The same
model is used in <meter>, <progress>, <time>, <abbr> ... but not in
user-defined objects. I'd prefer an additional @content-like attribute
which supersedes the text content and maybe even the default values of
the other value-bearing elements, reducing the two different elements one
has to maintain or change to just one.
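A minimal Python sketch of how a consumer could resolve property values if
such an attribute existed. Everything here is an assumption for
illustration: the class name PropertyExtractor, the use of a content
attribute on arbitrary elements (that is my suggestion, not anything in the
draft), and the simplification that property elements are not nested.

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Collect itemprop values, letting a content attribute win over text."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._name = None   # property currently collecting text, if any
        self._tag = None    # tag name of the element being collected
        self._parts = []    # text fragments seen inside that element

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = attributes.get("itemprop")
        if name is None:
            return
        content = attributes.get("content")
        if content is not None:
            # Assumed rule: the machine-readable attribute supersedes the text.
            self.properties[name] = content
        else:
            # No attribute: fall back to the element's text content.
            self._name, self._tag, self._parts = name, tag, []

    def handle_data(self, data):
        if self._name is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._name is not None and tag == self._tag:
            self.properties[self._name] = "".join(self._parts).strip()
            self._name, self._tag, self._parts = None, None, []


extractor = PropertyExtractor()
extractor.feed(
    '<span itemprop="com.example.price" content="1000000">'
    "\u20ac 1\u2009000\u2009000,\u2014</span>"
    '<span itemprop="com.example.name">Hedral</span>'
)
extractor.close()
```

With this input, extractor.properties maps com.example.price to "1000000"
and com.example.name to "Hedral": one element per property, and the price
only has to be maintained in one place.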
> Instead, let us try using the regular "IDREF" functionality that HTML uses
> in a variety of other places, like <label for="">. For this we'll need a
> new attribute, but unfortunately we can't use about="" (which would be the
> obvious name to use), because that would conflict with RDFa, so instead
> we'll use subject="":
I'm slightly irritated by the implied change from an active, possessive
phrasing (“The cat has the name Hedral.”) to something more passive
(“Hedral is a name owned by that cat.”). My mental model for
property relationships follows the former wording; link
relationships are similar in that regard. @about/@subject are like
@rev; a @resource attribute, analogous to @rel, would feel more natural.
There are practical consequences of the missing @resource, I think.
Imagine a document describing a household and a household vocabulary which
allows triples of <human>s which are in an <owner> relationship to a
<cat>. Given a household of two humans and one cat, how does one
mark up the assumption that the cat has two owners?