[whatwg] Trying to work out the problems solved by RDFa

Sun Jan 11 08:52:40 PST 2009

Benjamin Hawkes-Lewis wrote:
> On 11/1/09 02:51, Calogero Alex Baldacchino wrote:
>> eRDF might be a working compromise, because it doesn't need any changes
>> to the spec
>
> It's not possible to author conforming HTML5 that functions as eRDF 
> since eRDF requires a 'profile' attribute, but HTML5 has removed the 
> attribute.
>

I hadn't noticed that before, thanks for the info :-)

However, the same is actually true of the RDFa attributes, because 
they're not in the spec either. From this point of view, introducing six 
new attributes or restoring an older one is not very different, so 
(again) why RDFa and not eRDF? Or why not both? Or why not also RDFa 
embedded in Atom embedded, in turn, in HTML (like SVG or MathML)? It 
seems to me, for instance, that at this stage SearchMonkey might be a 
reason to consider all of them.

>
> ; RDFa covers a wider range of RDF semantics, but requires
>> new attributes and also namespaces (a sort of hybrid between them might
>> avoid the need to bring namespaces - xmlns:* attributes - into html
>> serialization).
>
> To avoid xmlns:* attributes, one could drop CURIEs in the text/html 
> serialization and use markup like:
>
> <div>
>   <div about="http://dbpedia.org/resource/Albert_Einstein">
>     ...
>   </div>
> </div>
>
> instead of
>
> <div xmlns:db="http://dbpedia.org/">
>   <div about="[db:resource/Albert_Einstein]">
>     ...
>   </div>
> </div>
>
> There's no data loss.
>

Well, that's an option, of course, but it's *not* RDFa as specified by 
W3C; for instance, @property is specified as accepting _only_ CURIEs 
(whereas @about can also accept URIs - and eRDF allows CURIEs, though 
in a different format from the one specified for RDFa and the one used 
for XML in general). That is, to do that not one but _two_ 
specifications need to be changed: the current HTML5 (which is a draft, 
thus not a problem) and RDFa (which is now a Recommendation, so might it 
be more difficult? Should a different specification be derived?), unless 
we want that to be just an unofficial, yet widely accepted, convention - 
and I think one unofficial convention is worth as much as another (any 
processor conforming to standard RDFa would need deep changes to cope 
with that - it doesn't work in Fuzzbot where CURIEs are expected, for 
instance). I'm the first to say that my suggestion was an ugly hack, but 
at least it would have worked and been conformant without changing 
anything.
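
To make concrete why the two quoted snippets carry the same data, here 
is a minimal Python sketch of the expansion a processor would apply to 
@about. The function name expand_about and the plain dict of prefixes 
are my own assumptions for illustration, not anything taken from the 
RDFa spec or from an existing processor:

```python
def expand_about(value: str, prefixes: dict[str, str]) -> str:
    """Expand a bracketed CURIE such as "[db:resource/X]" to a full URI.

    Values without brackets are treated as plain URIs and returned as-is.
    `prefixes` stands in for the xmlns:* declarations in scope (assumption).
    """
    if value.startswith("[") and value.endswith("]"):
        prefix, sep, reference = value[1:-1].partition(":")
        if not sep or prefix not in prefixes:
            raise ValueError(f"cannot expand {value!r}")
        # A CURIE expands by concatenating the prefix URI and the reference.
        return prefixes[prefix] + reference
    return value


in_scope = {"db": "http://dbpedia.org/"}
print(expand_about("[db:resource/Albert_Einstein]", in_scope))
print(expand_about("http://dbpedia.org/resource/Albert_Einstein", in_scope))
```

Both calls yield the same URI, which is the "no data loss" point; the 
sketch says nothing about @property, where standard RDFa expects a CURIE 
and a full URI would not be accepted.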

>> My suggestion was meant as a mean to test RDFa in HTML
>> documents without changing the spec (perhaps in conjunction with
>> data-xmlns-*, data-xmlns-prefixes="rdfa foaf <whatever>" to "emulate"
>> namespaces - an ugly hack, I know, but at least would avoid changes to
>> html serialization, at least in a test phase) -- even if I think that
>> xml serialization should work better for such rdf metadata.
>
> I really can't see anybody violating the spec in that way rather than 
> violating the spec by just adding the RDFa attributes outright, --

Indeed, the current spec is violated; I was just considering a way to 
use RDFa without such violations before deciding whether it's worth 
adding to the spec, no more (and I don't want to push that hack any 
further, I'm just trying to explain my aim).

> --especially given that there are already people publishing these 
> attributes in text/html so the "namespace" has already been polluted 
> and we already have services like SearchMonkey not only using these 
> attributes but promoting them.

It seems to me that SearchMonkey doesn't promote RDFa any more than it 
promotes Microformats, eRDF and dataRSS (RDFa embedded in external Atom 
feeds). It's also a very recent feature, and I really can't guess which 
kind of RDF serialization is going to "win the battle" (that is, 
choosing one over the others *might* be premature right now, as might 
introducing all of them).

> It may therefore already be problematic for a future version of HTML 
> to use these attributes as extension points without breaking existing 
> sites. The "test" is already in progress, for better or worse. HTML5 
> conformance checkers don't have to bless this test, of course, any 
> more than CSS validators have to give the all clear to vendor-specific 
> properties.

It's the same for every custom (non-standard) attribute and element 
already out there, since there is no standard for them, and data-* has 
been created instead; it's actually also the same for accesskey, since 
it's not in the current spec (whereas it was in HTML4). After all, 
support for unknown attributes/elements has never been a "de jure" 
standard, but more of a quirk, and there is no guarantee it will work 
fine in the future (indeed, it doesn't work consistently across browsers 
for unknown elements -- there are strong differences between IE and 
other browsers in this respect).

Moreover, the use of such attributes /for the purposes of SearchMonkey/ 
is a very, very specific use case, since they're used just for 
server-side computations, so no cooperation is required from other UAs; 
if browsers just ignored and dropped such attributes (as they do with 
unknown, proprietary CSS extensions), no page would break, and 
SearchMonkey would work just as well. Problems might arise if they were 
used in different contexts (e.g. in CSS selectors - but dropping unknown 
CSS rules is allowed by the CSS spec), but whoever cares about them 
might just run a regex tool to map them to a new, standard-compliant 
version (given that, for instance, "data-rdfa-about", "rdfa:about" and 
"about" are in a 1-to-1 correspondence, so this might even be done very 
easily by UAs as a quirk).
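
As a rough illustration of that regex mapping, here is a minimal Python 
sketch that rewrites the two prefixed spellings to the bare attribute 
name. The function name to_plain_rdfa and the list of attribute names 
are assumptions of mine for the example, and a regex over raw markup is 
of course cruder than a real parser:

```python
import re

# Attribute names to normalize (illustrative list, an assumption).
RDFA_NAMES = ("about", "property", "resource", "datatype", "typeof", "content")

# Match "data-rdfa-NAME" or "rdfa:NAME" when preceded by whitespace
# and followed by "=", i.e. where an attribute name would appear.
_PREFIXED = re.compile(
    r"(?<=\s)(?:data-rdfa-|rdfa:)(" + "|".join(RDFA_NAMES) + r")(?=\s*=)"
)


def to_plain_rdfa(markup: str) -> str:
    """Rewrite data-rdfa-NAME= and rdfa:NAME= to the bare NAME=."""
    return _PREFIXED.sub(r"\1", markup)


print(to_plain_rdfa('<div data-rdfa-about="http://example.org/x">'))
print(to_plain_rdfa('<div rdfa:about="http://example.org/x">'))
```

Because the correspondence is 1-to-1, the same table could be reversed 
to go from the bare names back to a prefixed form. Note that the pattern 
would also rewrite matching text that happens to appear outside a tag.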

From this point of view, SearchMonkey might use its own custom dataset 
and model without any change to its functionality (AIUI, the basic 
format for RDF metadata in SearchMonkey is dataRSS). Since there are 
standards for embedding RDF into (X)HTML documents, it just makes sense 
for Yahoo to support them all.

>
> Moreover, the damage done by immediately breaking the principle that 
> data-* should be for private use only and turning it into a 
> distributed extension point may be worse than the alternatives.
>
> -- 
> Benjamin Hawkes-Lewis

I really don't see the problem if a *custom* convention became widely 
accepted and reused by other people (given that my idea started from a 
mail by Charles McCathieNevile presenting small-scale scenarios, such as 
organizations' internal use and external interchange with other selected 
organizations, as a main context for RDFa - and I've never said the HTML 
specification should even mention it; I was thinking of it just as an 
unofficial convention to experiment with in such scenarios).

I really can't see, right now, why it should be different from, for 
instance, the case of a freely reusable widget using a custom data model 
based on private data-* attributes, inserted by people into thousands of 
websites (the widget with its related metadata, I mean), then liked by 
other people and reused in different contexts (the same data model based 
on data-*, now). Unless we agree this should be avoided; but I can't see 
how to prevent people from reusing a "private-only" data model they 
happened to like (unless it resulted in a copyright infringement, but 
I'm not sure this can happen through the mere use of the same names for 
some "variables" processed by a similar script with different source 
code -- given that copyright is assessed at the source code level, not 
on the resulting functionality, as far as I know).

WBR, Alex
