[whatwg] HTML5 Parsing spec first draft ready

Ian Hickson ian at hixie.ch
Wed Feb 15 16:12:58 PST 2006


On Wed, 15 Feb 2006, Dan Brickley wrote:
>
> * Ian Hickson <ian at hixie.ch> [2006-02-15 23:02+0000]
> > On Wed, 15 Feb 2006, Dan Brickley wrote:
> > >
> > > Have you considered defining the parser behaviour in terms of XML 
> > > concepts?
> > 
> > What would that mean?
> > 
> > Could you give an example of what that would look like?
> 
> Expressing things in terms of DOM would be one way, assuming 
> there is a mapping to XML infoset from the DOM

Well, in that case, it's done. The HTML5 Parser spec is a mapping from a 
Unicode character stream to a DOM.
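
As a rough illustration of "character stream in, DOM out", here is a minimal sketch in Python. It is not the HTML5 parsing algorithm: it leans on the standard library's html.parser tokenizer and xml.dom.minidom, and the names DomBuilder, parse_to_dom and VOID_ELEMENTS, along with the simplistic recovery rules, are assumptions made up for this example.

```python
from html.parser import HTMLParser
from xml.dom.minidom import Document

# Hypothetical, incomplete list of elements that never take an end tag.
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


class DomBuilder(HTMLParser):
    """Builds a minidom Document from tag-soup text (illustrative only)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.document = Document()
        self.current = self.document  # node that receives new children

    def handle_starttag(self, tag, attrs):
        element = self.document.createElement(tag)
        for name, value in attrs:
            element.setAttribute(name, value if value is not None else "")
        parent = self.current
        # A DOM document allows one root element; nest later ones inside it.
        if parent is self.document and self.document.documentElement is not None:
            parent = self.document.documentElement
        parent.appendChild(element)
        if tag not in VOID_ELEMENTS:
            self.current = element

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; ignore stray end tags.
        node = self.current
        while node is not self.document and node.nodeName != tag:
            node = node.parentNode
        if node is not self.document:
            self.current = node.parentNode

    def handle_data(self, data):
        # Text outside any element is dropped in this sketch.
        if self.current is not self.document:
            self.current.appendChild(self.document.createTextNode(data))


def parse_to_dom(text):
    """Map a string of markup to a DOM Document."""
    builder = DomBuilder()
    builder.feed(text)
    builder.close()
    return builder.document


if __name__ == "__main__":
    # The <b> element is never closed in the input; the DOM is still a tree.
    dom = parse_to_dom("<p>Hello <b>world</p>")
    print(dom.documentElement.toxml())
```

The point is only that the input may be ill-formed while the output is always a tree; a real HTML5 parser specifies the error recovery in far more detail than this does.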


> > The output of the parser is a DOM, so the natural form to use as an 
> > output concrete syntax is simply a serialised DOM (e.g. an XML file).
> 
> If your DOM comes with a standard XMLization, we're golden. Sorry I'm 
> not so up to date on DOM stuff (eg. which DOMs have an XMLization 
> defined, etc.).

A DOM is a DOM is a DOM. (Well, except for SVG's crazy-ass uDOM nonsense, 
but let's ignore that.) There are admittedly various ways of serialising a 
DOM: some are naive and more predictable, but can, in edge cases, end up 
with ill-formed markup; some are clever and less predictable, but always 
generate well-formed markup. Any test suite system would have to define 
its serialisation policy.
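
To make the naive-versus-clever distinction concrete, here is a small sketch in Python using xml.dom.minidom. The serialise function and its fix_comments switch are hypothetical, invented for this example and not taken from any spec or test suite; the edge case shown is a comment node whose data contains "--", which a DOM can hold but XML comment syntax cannot.

```python
from xml.dom import Node
from xml.dom.minidom import Document, parseString
from xml.parsers.expat import ExpatError


def escape(text, quote=False):
    """Escape the characters that are unsafe in text or attribute values."""
    text = text.replace("&", "&amp;").replace("<", "&lt;")
    return text.replace('"', "&quot;") if quote else text


def serialise(node, fix_comments=False):
    """Serialise a DOM subtree; fix_comments selects the 'clever' policy."""
    if node.nodeType == Node.DOCUMENT_NODE:
        return "".join(serialise(c, fix_comments) for c in node.childNodes)
    if node.nodeType == Node.TEXT_NODE:
        return escape(node.data)
    if node.nodeType == Node.COMMENT_NODE:
        data = node.data
        if fix_comments:
            # Clever policy: alter the data so the output stays well-formed.
            while "--" in data:
                data = data.replace("--", "- -")
            if data.endswith("-"):
                data += " "
        # Naive policy: emit the data verbatim, whatever it contains.
        return "<!--" + data + "-->"
    if node.nodeType == Node.ELEMENT_NODE:
        attrs = "".join(
            ' %s="%s"' % (name, escape(value, quote=True))
            for name, value in node.attributes.items()
        )
        inner = "".join(serialise(c, fix_comments) for c in node.childNodes)
        return "<%s%s>%s</%s>" % (node.tagName, attrs, inner, node.tagName)
    raise ValueError("node type not handled in this sketch")


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup."""
    try:
        parseString(markup)
    except ExpatError:
        return False
    return True


# Build a DOM by hand that contains the awkward comment.
doc = Document()
root = doc.createElement("p")
doc.appendChild(root)
root.appendChild(doc.createComment("a -- b"))

naive = serialise(doc)
clever = serialise(doc, fix_comments=True)
print(naive, is_well_formed(naive))
print(clever, is_well_formed(clever))
```

The naive output mirrors the DOM exactly but is rejected by an XML parser; the clever output parses, at the cost of no longer round-tripping the comment data. Which trade-off to accept is the policy decision a test suite would have to pin down.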


> > > GRDDL could then say "for HTML-ish bytestreams, feed them to the 
> > > WHATWG algorithm to get XML, and feed that XML to normal GRDDL 
> > > algorithm to get RDF"...
> > 
> > I'm with you up to the step where the output is XML, but I fail to see 
> > how the next step is something WHATWG would be interested in. Could 
> > you expand on this?
> 
> The next step is for people who find value in RDF's abstract graph 
> structure but find the standard RDF/XML syntax unattractive. GRDDL lets 
> folk deploy using XML or XHTML-based formats of their own devising, but 
> map into RDF using XSLT so that RDF tools (eg. databases, SPARQL query 
> engines) can consume and exploit the data.

Ah. Well, HTML5 is defined in terms of a DOM, so presumably GRDDL is 
already supported.

HTH,
-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


