[whatwg] Getting .innerHTML in XML well-formedness issues
zcorpan at gmail.com
Tue Jul 10 08:09:45 PDT 2007
On Fri, 15 Jun 2007 01:02:49 +0200, Ian Hickson <ian at hixie.ch> wrote:
> On Fri, 27 Oct 2006, Simon Pieters wrote:
>> The spec says that getting .innerHTML in XML must return a
>> namespace-well-formed XML representation of the element or document. 
>> But what should happen when the DOM isn't namespace-well-formed and it
>> can't be fixed by namespace prefix rewriting?
>> E.g., when the DOM contains any of the following:
>> * A ProcessingInstruction node containing ?>
>> * A Comment node containing -- (or ending with -)
>> * A CDATASection node containing ]]>
>> [ * A processing instruction with the target "xml"
>> (in any case combination)? ]
>> [ * Or colons in local names or processing instruction targets? ]
> ...or a DOCTYPE whose publicId or systemId parts contain both " and '
> I've made the spec say that you raise an exception in those six cases.
>> DOM3 Core says that they "must generate a fatal error during
>> serialization" (or, for the CDATA case, "the cdata section must be
>> splitted before the serialization"). Does that mean raise a SYNTAX_ERR
> I used INVALID_STATE_ERR, not SYNTAX_ERR (it's the reverse of a syntax
>> What about when there are illegal characters?
> The DOM doesn't let you create those cases.
Sure it does. For example, the DOM allows control characters in various
places where XML doesn't. I haven't checked every production in XML to
see where it differs from the DOM, but I guess you could spec a
catch-all, like "if the node contains a character that isn't allowed
according to the corresponding XML production" or some such... though
listing all the cases would be nicer.
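
To make the catch-all idea concrete, here is a rough sketch of a check against
the XML 1.0 "Char" production only. The function name is made up for
illustration, and it assumes the text of a node (comment data, PI data, text
content) is available as a plain string. Element and attribute names would need
a separate check against the Name production, which this does not cover.

```python
import re

# XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# The pattern matches any single character OUTSIDE that set.
_NOT_XML_CHAR = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def first_illegal_char(text):
    """Return the index of the first character that XML 1.0 does not
    allow, or None if every character matches the Char production.

    Hypothetical helper, not taken from any spec or implementation.
    """
    match = _NOT_XML_CHAR.search(text)
    return match.start() if match else None
```

A serializer could call something like this on each node's data and raise the
exception when the result is not None. U+000C (form feed), for instance, is
fine in a DOM string but is rejected here.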
> I'm tempted to allow the serialisation of PIs with the name "xml", and to
> allow the splitting of CDATA blocks with ]]>. Opinions?
The former wouldn't result in well-formed XML, but the latter is cool.
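
For the CDATA case, a minimal sketch of how the splitting could work, assuming
the section's data is a plain string. The function names are hypothetical. Each
"]]>" is cut between "]]" and ">", so no single section contains the
terminator, and the concatenated data is unchanged.

```python
def split_cdata(data):
    """Split CDATA section data so that no piece contains "]]>".

    Each "]]>" is broken between "]]" and ">". Joining the pieces
    gives back the original data. Hypothetical helper for illustration.
    """
    parts = data.split("]]>")
    last = len(parts) - 1
    pieces = []
    for i, part in enumerate(parts):
        prefix = ">" if i > 0 else ""      # tail of the previous "]]>"
        suffix = "]]" if i < last else ""  # head of the next "]]>"
        pieces.append(prefix + part + suffix)
    return pieces


def serialize_cdata(data):
    """Serialize the data as one or more adjacent CDATA sections."""
    return "".join("<![CDATA[" + piece + "]]>" for piece in split_cdata(data))
```

With this, data of `a]]>b` would serialize as two adjacent sections,
`<![CDATA[a]]]]><![CDATA[>b]]>`, which an XML parser reads back as the same
character data.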