[Imps] [whatwg] Standard DOM Serialization? [was: Common Subset]

Mon Dec 11 15:18:46 PST 2006

James Graham wrote:
> Sam Ruby wrote:
>> Henri Sivonen wrote:
>>> On Dec 11, 2006, at 17:29, Henri Sivonen wrote:
>>>
>>>>   * For element start, write "<" followed by the element name,
>>>> followed by attributes, followed by ">".
>>>  * For element end, write "</" followed by the element name, followed 
>>> by ">".
>>
>> Don't do that for elements which are always empty.
>>
>> Also, as Anne pointed out, there is a precise serialization defined 
>> for innerHTML.
> 
> So, does it make sense to use the innerHTML serialization, together 
> with a JSON-based format similar to the one used for the tokenizer 
> tests, for the parser unit tests? Something like:
> 
> {"tests":
> [
> 
> {"description":"test description",
> "input":"some input string",
> "errors":[list of parse errors],
> "output":"innerhtml for resulting DOM"}
> 
> ]
> }
> 
> Since different implementations will report errors differently, I would 
> imagine a common test suite would be useful for checking that the correct 
> number of errors is produced. The big disadvantage of this format is 
> that the innerHTML string will have to be both HTML-escaped and 
> JSON-escaped, leading to a high number of noise characters.

I would presume that parsing and serialization would have separate unit 
tests.  The edge cases are quite different for each.  Furthermore, there 
may be DOM trees that cannot be constructed via a parse.
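
For the parsing half, a runner for the format quoted above could look 
roughly like the sketch below.  It assumes "errors" holds a JSON list 
and that each implementation supplies its own parse function returning 
(errors, innerHTML); run_parser_tests and parse are hypothetical names, 
not part of any existing harness.  Only the number of errors is 
compared, since the messages will differ between implementations.

```python
import json


def run_parser_tests(suite_text, parse):
    """Return a list of (description, reason) for every failing test.

    `parse` is a hypothetical callable supplied by the implementation
    under test; it is assumed to return (errors, inner_html).
    """
    failures = []
    for test in json.loads(suite_text)["tests"]:
        errors, inner_html = parse(test["input"])
        # Error messages vary by implementation, so compare counts only.
        if len(errors) != len(test["errors"]):
            failures.append((test["description"], "error count"))
        elif inner_html != test["output"]:
            failures.append((test["description"], "output"))
    return failures
```

An empty list back from run_parser_tests means every test in the suite 
passed.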

Perl's XML::Parser::Style::Tree describes a JSON-like serialization of a 
DOM that might be suitable as an 'input' to a serialization test:

http://search.cpan.org/dist/XML-Parser/Parser/Style/Tree.pm
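
To make that concrete, here is a minimal sketch of a serializer driven 
by such a structure, following the start-tag and end-tag rules quoted 
above and skipping the end tag for always-empty elements.  It assumes 
the Tree layout of (tag, content) pairs, where content is an attribute 
map followed by alternating tag/content items and a tag of 0 marks 
text.  The VOID_ELEMENTS set and the escaping are my own 
simplifications, not the precise innerHTML algorithm.

```python
from html import escape

# Assumed set of always-empty elements; illustrative, not from a spec.
VOID_ELEMENTS = {"area", "base", "br", "col", "hr", "img", "input",
                 "link", "meta", "param"}


def serialize(tag, content):
    """Serialize one (tag, content) pair laid out like the Tree style."""
    attrs, children = content[0], content[1:]
    out = "<" + tag
    for name, value in attrs.items():
        out += ' %s="%s"' % (name, escape(value, quote=True))
    out += ">"
    if tag in VOID_ELEMENTS:
        return out  # always-empty element: no end tag
    # Children are flat pairs: (0, text) or (tag, content).
    for i in range(0, len(children), 2):
        child_tag, child = children[i], children[i + 1]
        if child_tag == 0:
            out += escape(child, quote=False)
        else:
            out += serialize(child_tag, child)
    return out + "</" + tag + ">"


tree = ["p", [{"class": "a"}, 0, "x < y", "br", [{}], 0, "z"]]
print(serialize(*tree))  # prints <p class="a">x &lt; y<br>z</p>
```

Because the input is a plain data structure rather than markup, a test 
can describe a tree that no parse would ever produce.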

- Sam Ruby