[whatwg] Writing authoring tools and validators for custom microdata vocabularies

Adrian Walker adriandwalker at gmail.com
Wed May 20 04:10:35 PDT 2009

Ian --

There's an authoring and reasoning tool that you may like to evaluate for
your purposes.

A possible advantage of the tool for your tasks is that it explains its
answers, step by step, in hypertexted English.

The vocabulary for writing rules in executable English is open, but the tool
can reason about controlled vocabularies, inheritance, and so on.  It covers
some examples that are not do-able in OWL.

The tool is online at the site below.

Apologies if you have seen this before, and thanks for comments.

                                                    -- Adrian

Internet Business Logic
A Wiki and SOA Endpoint for Executable Open Vocabulary English over SQL and
Online at www.reengineeringllc.com    Shared use is free

Adrian Walker

On Tue, May 19, 2009 at 9:36 PM, Ian Hickson <ian at hixie.ch> wrote:

> One of the use cases I collected from the e-mails sent in over the past
> few months was the following:
>   USE CASE: It should be possible to write generalized validators and
>   authoring tools for the annotations described in the previous use case.
>     * Mary would like to write a generalized software tool to help page
>       authors express micro-data. One of the features that she would
>       like to include is one that displays authoring information, such
>       as vocabulary term description, type information, range
>       information, and other vocabulary term attributes in-line so that
>       authors have a better understanding of the vocabularies that
>       they're using.
>     * John would like to ensure that his indexing software only stores
>       type-valid data. Part of the mechanism that he uses to check the
>       incoming micro-data stream is type information that is embedded in
>       the vocabularies that he uses.
>     * Steve would like to provide warnings to the authors that use his
>       vocabulary that certain vocabulary terms are experimental and may
>       never become stable.
>     * There should be a definitive location for vocabularies.
>     * It should be possible for vocabularies to describe other
>       vocabularies.
>     * Originating vocabulary documents should be discoverable.
>     * Machine-readable vocabulary information shouldn't be on a separate
>       page than the human-readable explanation.
>     * There must not be restrictions on the possible ways vocabularies can
>       be expressed (e.g. the way DTDs restricted possible grammars in
>       SGML).
>     * Parsing rules should be unambiguous.
>     * Should not require changes to HTML5 parsing rules.
>
> I couldn't find a good solution to this problem.
>
> The obvious solution is to use a schema language, such as RDFS or OWL.
> Indeed, that's probably the only solution that I can recommend. However,
> as we discovered with HTML5, schema languages aren't expressive enough. I
> wouldn't be surprised to find that no existing schema could accurately
> describe the complete set of requirements that apply to the vCard, vEvent,
> and BibTeX vocabularies (though I haven't checked if this is the case).
>
> For any widely used vocabulary, I think the best solution will be
> hard-coded constraints and context-sensitive help systems, as we have for
> HTML5 validators and HTML editors.
>
> For other vocabularies, I recommend using RDFS and OWL, and having the
> tools support microdata as a serialisation of RDF. Microdata itself could
> probably be used to express the constraints, though possibly not directly
> in RDFS and OWL if these use features that microdata doesn't currently
> expose (like typed properties).
>
> Regarding some of the requirements, I actually disagree that they are
> desirable. For example, having a definitive location for vocabularies has
> been shown to be a bad idea for scalability, with the W3C experiencing
> huge download volume for certain schemas. Similarly, I don't think that
> the "turtles all the way down" approach of describing vocabularies using
> the same syntax that the definition is about (self-hosted schemas) is
> necessary or, frankly, particularly useful to the end-user (though it may
> have nice theoretical properties).
>
> In conclusion: I recommend using an existing RDF-based schema language in
> conjunction with the mapping of microdata to RDF. Implementation
> experience with how this actually works in practice in end-user scenarios
> would be very useful in determining if something more is needed here.
> --
> Ian Hickson               U+1047E                )\._.,--....,'``.    fL
> http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
> Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
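
The quoted message suggests that widely used vocabularies are best served by hard-coded constraints, and its use cases ask for type checking (John) and warnings on experimental terms (Steve). A minimal sketch of that idea follows. It is not taken from any existing validator: the vocabulary, its term names (`summary`, `dtstart`, `mood`), and the names `TermRule`, `EVENT_RULES`, and `validate` are all hypothetical, and the input is assumed to be a flat mapping of property names to string values already extracted from the page.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class TermRule:
    """Hard-coded constraint for one vocabulary term (hypothetical shape)."""
    description: str               # shown to authors as in-line help
    check: Callable[[str], bool]   # returns True when the value is type-valid
    experimental: bool = False     # triggers a warning rather than an error


_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")

# Hypothetical event vocabulary; the terms are illustrative only.
EVENT_RULES: Dict[str, TermRule] = {
    "summary": TermRule("Short title of the event",
                        lambda v: bool(v.strip())),
    "dtstart": TermRule("Start date in YYYY-MM-DD form",
                        lambda v: _DATE.fullmatch(v) is not None),
    "mood": TermRule("Free-text mood", lambda v: True, experimental=True),
}


def validate(item: Dict[str, str],
             rules: Dict[str, TermRule]) -> Tuple[List[str], List[str]]:
    """Return (errors, warnings) for one item's property/value pairs."""
    errors: List[str] = []
    warnings: List[str] = []
    for name, value in item.items():
        rule = rules.get(name)
        if rule is None:
            errors.append(f"unknown term: {name}")
        elif not rule.check(value):
            errors.append(
                f"invalid value for {name}: {value!r} ({rule.description})")
        elif rule.experimental:
            warnings.append(
                f"{name} is experimental and may never become stable")
    return errors, warnings
```

An indexer in John's position would store an item only when `errors` is empty, while an authoring tool in Mary's position could surface each rule's `description` next to the field being edited.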