[whatwg] Extensible microdata attributes
brettz9 at yahoo.com
Tue Apr 26 19:54:57 PDT 2011
On 4/26/2011 9:55 PM, Benjamin Hawkes-Lewis wrote:
> On Tue, Apr 26, 2011 at 2:32 PM, Brett Zamir<brettz9 at yahoo.com> wrote:
>> That's kind of my purpose though. Sometimes, one does not wish to embed the
>> text itself, but one still wishes the data encoded so it can be retrieved by
>> other means. Why should extensible semantics be restricted to visible
Thanks for the references. While this may be relevant for the likes of
blogs and other documents whose requirements for semantic density are
limited enough to allow such reshaping for practical effect and whose
content is reshapeable by the content creator (as opposed to
republishing of already completed books), for more semantically dense
content, such as the types of classical documents marked up by TEI, it
is simply not possible to expose text for each bit of semantic
information or to generate new text to meet that need. And of course,
even with microformats/microdata as it is now, the semantic content
itself is not necessarily exposed just because text is visible on the page.
The issue of discoverability is, I think, more related to how it will be
consumed or may be consumed. And even if some pieces of information are
less discoverable, it does not mean that they have no value. For such
rich documents, a lot of attention is being paid to these texts since
they are themselves considered important enough to be worth the time.
If the Declaration of Independence of the United States were marked up
with hidden information about prior emendations, their likely reasons,
etc., or about suspected authors of particular passages, or the United
Nations Declaration of Human Rights were marked up to indicate which
countries have expressed reservations (qualifications) about which
rights, while a browsing application or query tool ought to be able to
(optionally) expose this hidden information, there is no automatic need
for the markup to be polluted with extra (hidden) (and especially
URI-based or other non-textual) tags when an attribute would suffice.
For things that are truly important, there may be a great deal of care
put into building up many layers which are meant to be peeled away, and
it is worth allowing some of that information (particularly the
non-textual information, e.g., the conditions of authorship, publisher,
etc.), especially that which the original publication did not expose, to be
still selectively revealed to queries and deeper browsing.
If a site like Wikisource (the online library sister project of
Wikipedia's) were able to offer such officially sanctioned semantic
attributes, classic texts could become enhanced in this way over time,
with the wiki exposing the hidden semantic information, which indeed may
not be as important as the visible text, but with queries by interested
users, any problems in encoding could be discovered just as well.
While I know most hip web authors and developers are minimalists, can't
we all just get along? Can't those of us interested in such richness,
and with a view to progressively enhancing documents into the far
future, also be welcomed into the web?