[whatwg] Augmenting HTML parser to recognize new elements

Wed Jan 18 14:00:30 PST 2012

On Wed, Jan 18, 2012 at 1:55 PM, Dimitri Glazkov <dglazkov at chromium.org> wrote:
> On Wed, Jan 18, 2012 at 1:47 PM, Adam Barth <w3c at adambarth.com> wrote:
>> On Wed, Jan 18, 2012 at 1:29 PM, Dimitri Glazkov <dglazkov at chromium.org> wrote:
>>> On Wed, Jan 18, 2012 at 1:14 PM, Dimitri Glazkov <dglazkov at chromium.org> wrote:
>>>> Ah, that's a good question. This also must be specified. It should
>>>> depend on the parent of the <content> element. If the parent is a shadow
>>>> root or a <table>, then it should make <tr> the child of <content>.
>>>> Otherwise, it should use foster parenting as usual.
>>>
>>> Oops, not "foster parenting", but "ignore" as you mentioned. Still
>>> getting through the details of the parsing spec.
>>
>> There's also some subtlety w.r.t. the pending character tokens.
>>
>> More generally, I think we'd all be much more sane if the HTML parsing
>> algorithm was specified in the HTML living standard rather than
>> modified ad-hoc in a number of different documents.
>
> That makes sense, but how will we handle the fact that the elements in
> the algorithm aren't part of the HTML specification?

Through the magic of legacy support, that's already the case today!
(I'm looking at you <xmp>.)

The parsing algorithm just says how to construct a DOM.  You can have
all sorts of crazy futuristic/obsolete elements in the DOM.
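
To illustrate the point, here is a minimal sketch of a tree builder that
never consults a list of known elements. It is a toy built on Python's
html.parser tokenizer, not the HTML tree construction algorithm: it has no
insertion modes, so it does not reproduce the <table>/<tr> handling
discussed in this thread. The names Node, ToyTreeBuilder, and build are
hypothetical and exist only for this example.

```python
from html.parser import HTMLParser


class Node:
    """A minimal element node: a tag name, a parent, and ordered children."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []  # child Nodes and text strings, in document order

    def to_list(self):
        """Return a nested [tag, children] list that is easy to compare."""
        return [
            self.tag,
            [c.to_list() if isinstance(c, Node) else c for c in self.children],
        ]


class ToyTreeBuilder(HTMLParser):
    """Builds a tree from start/end tags with no notion of 'valid' elements."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        # Any tag name becomes an element; there is no vocabulary check.
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        # Keep non-whitespace text as a plain string child.
        text = data.strip()
        if text:
            self.current.children.append(text)


def build(markup):
    """Parse markup and return the resulting tree as nested lists."""
    builder = ToyTreeBuilder()
    builder.feed(markup)
    builder.close()
    return builder.root.to_list()
```

Calling build("<content><blink>hi</blink></content>") yields a tree with a
content element containing a blink element, even though this toy builder
has never heard of either name.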

Adam

>>>> On Wed, Jan 18, 2012 at 10:58 AM, Ryosuke Niwa <rniwa at webkit.org> wrote:
>>>>> What if <content> wrapped elements that the parser ignores? E.g.,
>>>>> <content><tr>hi</tr></content>
>>>>>
>>>>> What should the content element include in that case?
>>>>>
>>>>> - Ryosuke
>>>>>
>>>>> On Jan 18, 2012 10:19 AM, "Dimitri Glazkov" <dglazkov at chromium.org> wrote:
>>>>>>
>>>>>> 'sup, Whatwg!
>>>>>>
>>>>>> The new HTML elements in the shadow DOM spec
>>>>>> (http://dvcs.w3.org/hg/webcomponents/raw-file/tip/spec/shadow/index.html)
>>>>>> and the nascent HTML templates spec (see it all explained here:
>>>>>> http://dvcs.w3.org/hg/webcomponents/raw-file/tip/explainer/index.html)
>>>>>> require tweaking of the HTML parsing behavior -- mostly the tree
>>>>>> construction bits.
>>>>>>
>>>>>> A typical example would be specifying an insertion point (that's
>>>>>> the <content> element) as a child of a <table>:
>>>>>>
>>>>>> <table>
>>>>>>    <content>
>>>>>>        <tr>
>>>>>>            ...
>>>>>>        </tr>
>>>>>>    </content>
>>>>>> </table>
>>>>>>
>>>>>> Both <shadow> and <template> elements have similar use cases.
>>>>>>
>>>>>> What would be the sane way to document such changes to the HTML parser
>>>>>> behavior? A list of modifications to tree construction modes in each
>>>>>> respective spec? Some "generic insertion point element" clause in the
>>>>>> HTML spec? Give me ideas.
>>>>>>
>>>>>> Also -- what are the side effects of such a change? Surely, there's
>>>>>> something I am not thinking of.
>>>>>>
>>>>>> :DG<