[whatwg] [WA1] INS/DEL and omitted </p> tags

Fri Nov 25 01:47:34 PST 2005

Ian Hickson wrote:
> On Thu, 24 Nov 2005, Simon Pieters wrote:
>> <p>foo<ins><p>bar</ins>

>>Opera/9.0 (Windows NT 5.1; U; en)
>><P>foo<INS></INS></P><P>bar</P>
>>
>>Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20051120
>>Firefox/1.6a1
>><p>foo<ins><p>bar</p></ins></p>
> 
> And Safari does what Opera does, which is why it's correct. If either 
> Opera or Safari changed to match what Mozilla does, then that would be 
> correct instead. :-)
> 
> Basically, when the parsing section gets written, it'll be written to 
> match the behaviour that the most browsers do.

I think the Mozilla behavior is easier to implement because it 
doesn't require knowledge of the DTD. The Opera behavior cannot 
be implemented without knowing that an ins element cannot contain 
a p element. Should the UA also contain knowledge about every other 
element so that it can correctly close all open inline elements when 
it sees a tag that starts a block-level element? (Such behavior would 
be closer to the W3C spec, but it would cause problems with real-world 
content.) I have to admit that Opera's method results in a 
DTD-conforming structure, whereas Mozilla puts the block-level p 
element inside the inline ins element.
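
To make the amount of built-in knowledge concrete, here is a minimal 
Python sketch of a DTD-aware approach. It is not Opera's actual 
algorithm. The names TOKEN, BLOCK_LEVEL and normalize are assumptions 
made for illustration, the tokenizer only understands tags without 
attributes, and the single rule encoded is that a block-level start 
tag closes an open p together with everything opened inside it.

```python
import re

# Hypothetical tokenizer: attribute-free tags or runs of text.
TOKEN = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)>|([^<]+)")

# Assumed, deliberately partial list of block-level elements.
# This table is the "knowledge about the DTD" the parser needs.
BLOCK_LEVEL = {"p", "div", "ul", "ol", "table"}


def normalize(source):
    """Return the markup with implied end tags written out."""
    out, stack = [], []

    def close_through(name):
        # Emit end tags for open elements up to and including `name`.
        while True:
            top = stack.pop()
            out.append(f"</{top}>")
            if top == name:
                break

    for closing, name, text in TOKEN.findall(source):
        name = name.lower()
        if text:
            out.append(text)
        elif closing:
            if name in stack:
                close_through(name)
            # A stray end tag (nothing open to match) is dropped.
        else:
            if name in BLOCK_LEVEL and "p" in stack:
                close_through("p")  # p cannot contain block content
            stack.append(name)
            out.append(f"<{name}>")

    while stack:  # EOF: close whatever is still open
        out.append(f"</{stack.pop()}>")
    return "".join(out)


print(normalize("<p>foo<ins><p>bar</ins>"))
# <p>foo<ins></ins></p><p>bar</p>
```

Under these assumptions the output has the same shape as the 
Opera result quoted above, and the </ins> in the source ends up 
as a stray end tag that is dropped.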

Mozilla's behavior could be described as giving more weight to 
closing tags than to opening tags, without any knowledge of the 
actual language used. The example
<p>foo<ins><p>bar</ins>
can be parsed *without* knowledge of the DTD as follows
(the "*" marks the parser position):
<p>foo<ins><p>bar*</ins>
The parser sees </ins> when the element stack is "p ins p". It 
implements the logic that since the stack contains an "ins" element, 
this closing tag should be matched with the topmost "ins" element in 
the stack. The inner p element has to be closed first, so an implied 
</p> tag is inserted, resulting in the stack "p ins", whose top can 
then be matched with the </ins> in the code.
<p>foo<ins><p>bar</ins>*
EOF, but the stack still contains "p". An implied "</p>" tag is 
inserted. So we get the parsed structure
<p>foo<ins><p>bar</p></ins></p>
which looks like the one Mozilla generates.
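
The steps above can be sketched in a few lines of Python. This is 
only an illustration of the stack logic described here, not 
Mozilla's actual parser. The names TOKEN, Node, parse and serialize 
are hypothetical, and the tokenizer only understands tags without 
attributes.

```python
import re

# Hypothetical tokenizer: attribute-free tags or runs of text.
TOKEN = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)>|([^<]+)")


class Node:
    def __init__(self, name):
        self.name = name
        self.children = []  # Node instances or plain strings


def parse(source):
    """Build a tree using no knowledge of which element may contain which."""
    root = Node("#root")
    stack = [root]
    for closing, name, text in TOKEN.findall(source):
        name = name.lower()
        if text:
            stack[-1].children.append(text)
        elif not closing:
            # Start tag: always nest inside the current element.
            node = Node(name)
            stack[-1].children.append(node)
            stack.append(node)
        elif any(open_el.name == name for open_el in stack[1:]):
            # End tag with a match on the stack: pop (implied end tags)
            # until the topmost element with that name has been closed.
            while stack.pop().name != name:
                pass
        # An end tag with no match on the stack is ignored.
    # EOF: elements still on the stack are closed implicitly.
    return root


def serialize(node):
    inner = "".join(
        child if isinstance(child, str) else serialize(child)
        for child in node.children
    )
    if node.name == "#root":
        return inner
    return f"<{node.name}>{inner}</{node.name}>"


print(serialize(parse("<p>foo<ins><p>bar</ins>")))
# <p>foo<ins><p>bar</p></ins></p>
```

The only per-element state is the tag name on the stack, so no 
table of block-level or inline elements is needed.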

I'd prefer the Mozilla way as the "official" one because it's simpler 
to implement, but the Opera/Safari way is more correct because it 
follows the specification (DTD) more closely. However, the 
Opera/Safari way has the problem that if the input is well-formed 
XML, it results in a different tree than an XML parser would 
produce. This could make the change from HTML to XHTML harder in 
the future.
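
As a quick check of the XML side of that claim, the sketch below 
feeds the fully tagged version of the example to Python's standard 
xml.etree.ElementTree parser. The shape helper is a hypothetical 
name used only to make the nesting visible.

```python
import xml.etree.ElementTree as ET


def shape(element):
    """Return (tag, [child shapes]) so the nesting is easy to compare."""
    return (element.tag, [shape(child) for child in element])


tree = ET.fromstring("<p>foo<ins><p>bar</p></ins></p>")
print(shape(tree))
# ('p', [('ins', [('p', [])])])
```

The XML parser keeps the inner p inside ins, which is the nesting 
Mozilla produces and not the two sibling paragraphs reported for 
Opera.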

-- 
Mikko
