[whatwg] HTML5 Parsing spec first draft ready
ian at hixie.ch
Wed Feb 15 16:12:58 PST 2006
On Wed, 15 Feb 2006, Dan Brickley wrote:
> * Ian Hickson <ian at hixie.ch> [2006-02-15 23:02+0000]
> > On Wed, 15 Feb 2006, Dan Brickley wrote:
> > >
> > > Have you considered defining the parser behaviour in terms of XML
> > > concepts?
> > What would that mean?
> > Could you give an example of what that would look like?
> Expressing things in terms of DOM would be one way, assuming
> there is a mapping to XML infoset from the DOM
Well in that case, it's done. The HTML5 Parser spec is a mapping from a
Unicode character stream to a DOM.
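To make "a mapping from a character stream to a DOM" concrete, here is a minimal sketch in Python. It is not the HTML5 parsing algorithm: it is a toy built on the standard library's html.parser and xml.dom.minidom. The names DomBuilder and parse_to_dom are hypothetical, and the sketch assumes a single root element and no void elements such as <br>.

```python
from html.parser import HTMLParser
from xml.dom import minidom


class DomBuilder(HTMLParser):
    """Toy tree builder: NOT the HTML5 algorithm, only an illustration."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.doc = minidom.Document()
        # Stack of open nodes; the document itself sits at the bottom.
        self.stack = [self.doc]

    def handle_starttag(self, tag, attrs):
        element = self.doc.createElement(tag)
        for name, value in attrs:
            element.setAttribute(name, value or "")
        self.stack[-1].appendChild(element)
        self.stack.append(element)

    def handle_endtag(self, tag):
        # Close everything up to the nearest matching open element.
        # An end tag with no matching open element is ignored.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tagName == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        # Text outside the root element is dropped in this sketch.
        if len(self.stack) > 1:
            self.stack[-1].appendChild(self.doc.createTextNode(data))


def parse_to_dom(text):
    """Map a string of markup to a minidom Document (hypothetical helper)."""
    builder = DomBuilder()
    builder.feed(text)
    builder.close()
    return builder.doc


if __name__ == "__main__":
    # The <b> element is never closed in the input; the DOM still has it
    # properly nested, so an XML serialisation of the DOM is well-formed.
    dom = parse_to_dom("<p>Hello <b>world</p>")
    print(dom.documentElement.toxml())
```

The point is only that once the input has been turned into a DOM, anything that consumes a DOM (or an XML serialisation of one) no longer needs to care how messy the original markup was.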
> > The output of the parser is a DOM, so the natural form to use as an
> > output concrete syntax is simply a serialised DOM (e.g. an XML file).
> If your DOM comes with a standard XMLization, we're golden. Sorry I'm
> not so up to date on DOM stuff (eg. which DOMs have an XMLization
> defined, etc.).
A DOM is a DOM is a DOM. (Well, except for SVG's crazy-ass uDOM nonsense,
but let's ignore that.) There are admittedly various ways of serialising a
DOM: some are naive and more predictable, but can, in edge cases, end up
with ill-formed markup; some are clever and less predictable, but always
generate well-formed markup. Any test suite system would have to define
its serialisation policy.
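The difference between the two kinds of serialiser can be sketched as follows. This is an illustration under stated assumptions, not any particular implementation: build_dom, serialise and is_well_formed are hypothetical names, attributes are ignored, and the edge case chosen is a comment node whose data contains "--", which XML does not allow inside a comment.

```python
import re
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def build_dom():
    """Build a tiny DOM whose comment data is legal in a DOM but not in XML."""
    doc = minidom.Document()
    root = doc.createElement("p")
    doc.appendChild(root)
    root.appendChild(doc.createComment("a -- b"))
    root.appendChild(doc.createTextNode("x"))
    return doc


def escape_text(text):
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def serialise(node, careful):
    """Serialise a DOM subtree; attributes are ignored in this sketch."""
    if node.nodeType == node.DOCUMENT_NODE:
        return "".join(serialise(child, careful) for child in node.childNodes)
    if node.nodeType == node.ELEMENT_NODE:
        inner = "".join(serialise(child, careful) for child in node.childNodes)
        return "<%s>%s</%s>" % (node.tagName, inner, node.tagName)
    if node.nodeType == node.TEXT_NODE:
        return escape_text(node.data)
    if node.nodeType == node.COMMENT_NODE:
        data = node.data
        if careful:
            # Less predictable: the comment data is altered so that the
            # output stays well-formed (no "--" inside, no trailing "-").
            data = re.sub(r"-(?=-)", "- ", data)
            if data.endswith("-"):
                data += " "
        return "<!--" + data + "-->"
    raise ValueError("node type not handled in this sketch")


def is_well_formed(text):
    """Check well-formedness by handing the text to an XML parser."""
    try:
        minidom.parseString(text)
        return True
    except ExpatError:
        return False


if __name__ == "__main__":
    doc = build_dom()
    for careful in (False, True):
        output = serialise(doc, careful)
        print(careful, output, is_well_formed(output))
```

The naive policy writes the DOM out exactly as it is and can produce something an XML parser rejects. The careful policy always produces well-formed output, but what comes out no longer round-trips to an identical DOM. A test suite would have to pick one policy and document it.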
> > > GRDDL could then say "for HTML-ish bytestreams, feed them to the
> > > WHATWG algorithm to get XML, and feed that XML to normal GRDDL
> > > algorithm to get RDF"...
> > I'm with you up to the step where the output is XML, but I fail to see
> > how the next step is something WHATWG would be interested in. Could
> > you expand on this?
> The next step is for people who find value in RDF's abstract graph
> structure but find the standard RDF/XML syntax unattractive. GRDDL lets
> folk deploy using XML or XHTML-based formats of their own devising, but
> map into RDF using XSLT so that RDF tools (eg. databases, SPARQL query
> engines) can consume and exploit the data.
Ah. Well, HTML5 is defined in terms of a DOM, so GRDDL is presumably,
therefore, already supported.
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'