[whatwg] Getting .innerHTML in XML well-formedness issues

Thu Jun 14 16:02:49 PDT 2007

On Fri, 27 Oct 2006, Simon Pieters wrote:
> 
> The spec says that getting .innerHTML in XML must return a 
> namespace-well-formed XML representation of the element or document. [1] 
> But what should happen when the DOM isn't namespace-well-formed and it 
> can't be fixed by namespace prefix rewriting?
> 
> E.g., when the DOM contains any of the following?:
> 
>   * A ProcessingInstruction node containing ?>
>   * A Comment node containing -- (or ending with -)
>   * A CDATASection node containing ]]>
> [ * A processing instruction with the target "xml"
>     (in any case combination)? ]
> [ * Or colons in local names or processing instruction targets? ]

...or a DOCTYPE whose publicId or systemId parts contain both " and ' 
characters.

I've made the spec say that you raise an exception in those six cases.

> DOM3 Core says that they "must generate a fatal error during 
> serialization" (or, for the CDATA case, "the cdata section must be 
> splitted before the serialization"). Does that mean raise a SYNTAX_ERR 
> exception?

I used INVALID_STATE_ERR rather than SYNTAX_ERR. Nothing is being parsed 
here; the DOM is in a state that can't be serialised, which is the 
reverse of a syntax error.
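
To make the six cases concrete, here is a rough sketch of the checks a 
serialiser might run before emitting each node. It is written in Python 
purely for illustration; the function names and the InvalidStateError 
class are hypothetical stand-ins (the latter for the DOM's 
INVALID_STATE_ERR), not anything defined by the spec.

```python
class InvalidStateError(Exception):
    """Hypothetical stand-in for the DOM INVALID_STATE_ERR exception."""


def check_processing_instruction(target: str, data: str) -> None:
    # Case 1: the data would terminate the PI early.
    if "?>" in data:
        raise InvalidStateError("PI data contains '?>'")
    # Case 4: the target "xml" in any case combination.
    if target.lower() == "xml":
        raise InvalidStateError("PI target is 'xml'")
    # Case 5: a colon in the PI target.
    if ":" in target:
        raise InvalidStateError("PI target contains ':'")


def check_comment(data: str) -> None:
    # Case 2: "--" inside the comment, or a trailing "-".
    if "--" in data or data.endswith("-"):
        raise InvalidStateError("comment contains '--' or ends with '-'")


def check_cdata(data: str) -> None:
    # Case 3: the data would terminate the CDATA section early.
    if "]]>" in data:
        raise InvalidStateError("CDATA section contains ']]>'")


def check_local_name(local_name: str) -> None:
    # Case 5: a colon in a local name.
    if ":" in local_name:
        raise InvalidStateError("local name contains ':'")


def check_doctype_id(identifier: str) -> None:
    # Case 6: a publicId or systemId containing both quote characters
    # cannot be written as a quoted literal.
    if '"' in identifier and "'" in identifier:
        raise InvalidStateError("identifier contains both \" and '")
```

Each function returns quietly when the value can be serialised and raises 
otherwise, so a serialiser could call the relevant check before writing 
each node.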

> What about when there are illegal characters?

The DOM doesn't let you create those cases.

I'm tempted to allow the serialisation of PIs with the target "xml", and 
to allow CDATA sections containing ]]> to be split rather than raising. 
Opinions?
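
For reference, splitting a CDATA section usually means ending the 
section between the "]]" and the ">" and immediately opening a new one. 
A minimal sketch in Python, with a hypothetical function name; this is 
the common technique, not wording from the spec:

```python
def split_cdata(data: str) -> str:
    """Serialise text as one or more CDATA sections.

    Every "]]>" in the data is broken across two adjacent sections:
    the first ends after "]]" and the next begins with ">".
    """
    return "<![CDATA[" + data.replace("]]>", "]]]]><![CDATA[>") + "]]>"
```

For example, split_cdata("a]]>b") gives 
<![CDATA[a]]]]><![CDATA[>b]]>, which a parser reads back as the text 
a]]>b, although as two CDATASection nodes rather than one.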

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'