[whatwg] several messages about serialising HTML and related subjects
ian at hixie.ch
Thu Feb 28 18:23:03 PST 2008
Executive summary: I did most of the changes suggested below.
On Wed, 15 Aug 2007, Simon Pieters wrote:
> The spec says:
> Other nodes types (e.g. Attr) cannot occur as children of elements. If
> they do, this algorithm must raise an INVALID_STATE_ERR exception.
> s/elements/elements or documents/ as the algorithm can be used for documents
> as well.
> What about PIs? They can occur as children of elements or documents.
On Wed, 15 Aug 2007, Simon Pieters wrote:
> The serializing HTML fragments algorithm talks about "child node" to
> refer to the current node being processed. This is a bit confusing, and
> I think "current node" would be clearer.
On Thu, 16 Aug 2007, Lachlan Hunt wrote:
> There is a possible issue with the serialising HTML fragments section. The
> algorithm seems fine for use with things like innerHTML, but there are
> other issues that should be considered when serialising to a file,
> database, network stream or something.
> Such serialisers should consider the character encoding. Although a
> Unicode encoding should ideally be used, some serialisers may need to
> serialise to a different encoding at the request of the user or
> limitations of the environment. In such cases, the serialisation should
> output appropriate character references for characters that can't be
> represented in the target encoding.
> It should also handle outputting the appropriate <meta charset="">
> and/or BOM, especially in environments that can't declare it at the
> transport level like HTTP can.
> Perhaps the spec should say something about this issue somewhere.
>  http://www.whatwg.org/specs/web-apps/current-work/#serialising
The section is specifically for serialising a subtree to a Unicode stream
without mutation, not to a byte stream. What's the use case that isn't
covered by "8.1 Writing HTML documents"?
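
To make the quoted suggestion concrete: a serialiser that has to target a
non-Unicode encoding can write unencodable characters as numeric character
references. The following is only a minimal Python sketch, assuming the input
is an already-serialised Unicode string; the function name
encode_with_char_refs is hypothetical, and the work is done by the standard
library's "xmlcharrefreplace" error handler.

```python
def encode_with_char_refs(serialised: str, encoding: str) -> bytes:
    """Encode an already-serialised string to bytes.

    Characters that the target encoding cannot represent are replaced
    with decimal numeric character references (e.g. U+263A -> &#9786;).
    """
    return serialised.encode(encoding, errors="xmlcharrefreplace")


# U+00E9 exists in ISO-8859-1 and is encoded directly;
# U+263A does not, so it becomes a character reference.
print(encode_with_char_refs("<p>caf\u00e9 \u263a</p>", "iso-8859-1"))
```

Note that this blanket replacement is not safe everywhere: character
references are not decoded inside script and style elements, so a real
serialiser would need to treat those separately.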
On Mon, 27 Aug 2007, Simon Pieters wrote:
> IE7 and Firefox serialize U+00A0 characters in data and attribute values
> as "&nbsp;" when getting innerHTML. Safari and Opera don't. Should the
> spec be aligned with IE7 and Firefox here?
I don't see any great benefit to doing so; do any pages require this?
On Tue, 28 Aug 2007, Alexey Proskuryakov wrote:
> This has caused a compatibility issue for WebKit at least once. In
> that case, we got away with evangelizing, but we still track this as a
> bug that needs to be fixed eventually.
Ah. Ok then. Done.
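
As a rough illustration of what escaping U+00A0 on serialisation amounts to,
here is a small Python sketch. The function names are hypothetical, and the
exact set of characters escaped in each context is an assumption made for the
example rather than quoted spec text.

```python
def escape_text(data: str) -> str:
    """Escape character data for output as element content."""
    # "&" is replaced first so that the references added afterwards
    # are not themselves escaped a second time.
    return (data.replace("&", "&amp;")
                .replace("\u00a0", "&nbsp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;"))


def escape_attribute_value(value: str) -> str:
    """Escape a string for output inside a double-quoted attribute value."""
    return (value.replace("&", "&amp;")
                 .replace("\u00a0", "&nbsp;")
                 .replace('"', "&quot;"))


print(escape_text("a\u00a0b & <c>"))
print(escape_attribute_value('x\u00a0"y"'))
```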
On Tue, 28 Aug 2007, Boris Zbarsky wrote:
> For what it's worth, the relevant Mozilla bugs are
> https://bugzilla.mozilla.org/show_bug.cgi?id=165686 and
On Tue, 11 Sep 2007, Simon Pieters wrote:
> Consider the following document:
> <h:p xmlns:h="http://www.w3.org/1999/xhtml"><x/></h:p>
> When getting innerHTML on the root element, should the serialization
> declare the no namespace explicitly as in <x xmlns=""/>? (I think it
> should because setting innerHTML will imply namespace declarations so it
> might change meaning if you insert it somewhere else with innerHTML.)
I've added this:
| If any of the elements in the serialisation are in the null namespace,
| the default namespace in scope for those elements must be explicitly
| declared as the empty string.
Is that ok?
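
The reason the explicit declaration matters can be shown with a short Python
sketch. It uses the standard library's xml.etree.ElementTree as a stand-in
for a browser's XML parser, and the helper name tag_when_inserted is
hypothetical. The same fragment is parsed inside an element whose default
namespace is XHTML, once without and once with xmlns="".

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"


def tag_when_inserted(fragment: str) -> str:
    """Parse `fragment` as the only child of an element whose default
    namespace is XHTML, and return the child's expanded name."""
    wrapper = ET.fromstring(f'<div xmlns="{XHTML}">{fragment}</div>')
    return wrapper[0].tag


# Without a declaration, <x/> inherits the default namespace in scope.
print(tag_when_inserted("<x/>"))

# With xmlns="", the element stays in the null namespace.
print(tag_when_inserted('<x xmlns=""/>'))
```

ElementTree writes expanded names as "{namespace}local", so the first call
yields a name in the XHTML namespace and the second a bare "x".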
> Also, the spec says:
> In an XML context, the innerHTML DOM attribute on HTMLElements and
> HTMLDocuments, on getting, must return a string in the form of an
> internal general parsed entity [...]
> ...and then goes on to say that some DocumentType nodes must raise an
> exception, however internal general parsed entities can't have doctypes
> in the first place.
Oops. Fixed. Only elements should return internal general parsed entities;
documents should return document entities. Empty documents now raise an
exception.
> Finally, the spec lists the following as something that throws:
> A Text node whose data contains characters that are not matched by the
> XML Char production. [XML]
> But Text data is not the only case that might not match the Char
> production in XML. Comment data, CDATASection data,
> ProcessingInstruction target, and, I think, Attr value.
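
For reference, a check against the Char production that could be applied
uniformly to all of the strings listed above might look like the following
Python sketch. The names is_xml_serialisable and check_node_strings are
hypothetical, and ValueError merely stands in for whatever DOM exception the
spec requires.

```python
import re

# Complement of the XML 1.0 Char production:
#   Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
#          | [#x10000-#x10FFFF]
_NON_XML_CHAR = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def is_xml_serialisable(data: str) -> bool:
    """Return True if every character in `data` matches Char."""
    return _NON_XML_CHAR.search(data) is None


def check_node_strings(strings):
    """`strings` is an iterable of (description, value) pairs, for example
    ("Comment data", "...") or ("Attr value", "...")."""
    for description, value in strings:
        if not is_xml_serialisable(value):
            raise ValueError(f"{description} contains a non-Char character")


check_node_strings([
    ("Text data", "plain text\n"),
    ("Comment data", "supplementary \U0001047e is fine"),
])
```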
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'