[whatwg] DOMParser / XMLSerializer

Anne van Kesteren annevk at opera.com
Thu May 21 00:55:02 PDT 2009


On Thu, 21 May 2009 01:25:25 +0200, João Eiras <joaoe at opera.com> wrote:
> XMLSerializer must generate well formed xml (all tags closed, no  
> attributes without values, case preserved, etc) and it accepts a full  
> document, so you get a serialized output with doctype, processing  
> instructions, comments which are not descendants of the root, and the  
> root itself.

This is no different from innerHTML in XML documents.


> DOMParser parses xml into a full document so if I have a doctype subset,  
> those will be recognized and replaced on the document. That does not  
> happen with innerHTML. If the input source has processing instructions,  
> these will be preserved also in the result document.

It does happen with *document*.innerHTML.
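
A minimal sketch of the DOMParser / XMLSerializer round trip described in the quoted text, assuming a browser environment where both globals exist. The function name roundTripDocument, the sample markup and the style.css reference are made up for illustration; outside a browser the function just returns null.

```typescript
// Hypothetical sample: a processing instruction and a comment that are
// siblings of the root element rather than descendants of it.
const sampleXml: string =
  '<?xml-stylesheet href="style.css"?>' +
  "<!-- comment outside the root -->" +
  "<root><child/></root>";

// Parse a string into a full XML document, then serialize the whole document.
function roundTripDocument(xml: string): string | null {
  // Assumption: browser-like environment. Plain Node.js lacks these globals.
  if (typeof DOMParser === "undefined" || typeof XMLSerializer === "undefined") {
    return null;
  }
  const parsedDocument = new DOMParser().parseFromString(xml, "application/xml");
  // Passing the Document node serializes everything it contains, including
  // nodes that sit outside the root element.
  return new XMLSerializer().serializeToString(parsedDocument);
}

const roundTripped = roundTripDocument(sampleXml);
```

In a browser, roundTripped should still contain the comment and the processing instruction next to the root element; the exact whitespace and quoting of the output can differ between implementations.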


On Thu, 21 May 2009 01:46:24 +0200, Jonas Sicking <jonas at sicking.cc> wrote:
> Mostly I think these APIs came about before innerHTML was supported in
> XML content. DOMParser is also somewhat more convenient if you want a
> full document back. And XMLSerializer is more convenient if you want
> an XML serialization of text/html content.

document.innerHTML also gives a full document back, but yes, I believe you cannot currently get an XML serialization of an HTML document, only an HTML serialization.


On Thu, 21 May 2009 05:23:40 +0200, Boris Zbarsky <bzbarsky at mit.edu> wrote:
> Ignoring the non-web-facing functionality (like parsing from and  
> serializing to streams), and speaking only of Gecko's implementations,  
> basically the following:
>
> 1)  DOMParser can parse as a given content type (in theory XML vs HTML;
>      I assume that if document.innerHTML doesn't do that yet it could be
>      changed to do so).

If the document is of the right type (HTML document vs. XML document), it works fine.
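
To make the content-type point concrete, here is a small sketch that parses the same string under two content types. It assumes an implementation whose DOMParser accepts "text/html" as well as "application/xml"; the helper name rootNameFor and the sample markup are hypothetical.

```typescript
// Parse the markup as the given content type and report the root element name.
function rootNameFor(
  markup: string,
  contentType: "application/xml" | "text/html"
): string | null {
  // Assumption: browser-like environment. Plain Node.js lacks DOMParser.
  if (typeof DOMParser === "undefined") {
    return null;
  }
  const parsedDocument = new DOMParser().parseFromString(markup, contentType);
  return parsedDocument.documentElement.nodeName;
}

// As XML, the <p> element itself is the root of the resulting document.
const rootAsXml = rootNameFor("<p>hi</p>", "application/xml");
// As HTML, the parser supplies the implied <html> root around the <p>.
const rootAsHtml = rootNameFor("<p>hi</p>", "text/html");
```

The parser choice is made per call here, whereas with innerHTML it follows from the type of the document the node belongs to.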


> 2)  DOMParser can parse from a byte array instead of a string; this
>      makes it a little easier to work with XML in encodings other than
>      UTF-8 or UTF-16.

ECMAScript doesn't have byte arrays though. (Though it would be nice if it did.)


> 2)  XMLSerializer can serialize a subtree rooted at a given node without
>      removing the node from its current location in the DOM.

Isn't this true for innerHTML too?
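
A rough comparison of the two in a sketch, assuming a browser environment where innerHTML is available on elements of XML documents. The helper name serializeInPlace and the sample markup are hypothetical.

```typescript
interface InPlaceResult {
  outer: string;          // XMLSerializer output for the node itself
  inner: string;          // innerHTML of the same node (its children only)
  stillAttached: boolean; // whether the node kept its parent afterwards
}

function serializeInPlace(xml: string): InPlaceResult | null {
  // Assumption: browser-like environment. Plain Node.js lacks these globals.
  if (typeof DOMParser === "undefined" || typeof XMLSerializer === "undefined") {
    return null;
  }
  const parsedDocument = new DOMParser().parseFromString(xml, "application/xml");
  const rootElement = parsedDocument.documentElement;
  const subtree = rootElement.firstElementChild;
  if (subtree === null) {
    return null;
  }
  // Both reads leave the tree untouched; neither detaches the node.
  const outer = new XMLSerializer().serializeToString(subtree);
  const inner = subtree.innerHTML;
  return { outer, inner, stillAttached: subtree.parentNode === rootElement };
}

const inPlace = serializeInPlace('<list><item id="1"><name>a</name></item></list>');
```

Neither call moves the node. The practical difference is that serializeToString includes the node's own start and end tags, while reading innerHTML covers only its children.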


-- 
Anne van Kesteren
http://annevankesteren.nl/

