[whatwg] Form-associated elements and the parser

Ryosuke Niwa rniwa at apple.com
Wed Nov 20 20:49:04 PST 2013


There is a related quirk with respect to the isindex element.

A start tag whose tag name is "isindex"
http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#isindex

Right now, the form element pointer is not null in the following example, so we end up losing the isindex element entirely.
<!DOCTYPE html>
<html><body><form><template>a<isindex></isindex>b</template>

Granted, isindex is a legacy element, but it seems better to keep its handling consistent with that of other form-associated elements.
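
For illustration only, here is a rough sketch of the same-home-subtree condition proposed in the quoted thread below, applied to a tree shaped like the example above. It is written in Python against a toy tree model, not a real DOM or parser. The names Node and should_associate are hypothetical, and modelling the template contents as a separate fragment with no parent is an assumption of the sketch.

```python
class Node:
    """Toy tree node (hypothetical); not a real DOM implementation."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    def root(self):
        # Walk up the parent chain to the topmost node of this tree.
        node = self
        while node.parent is not None:
            node = node.parent
        return node


def should_associate(form_pointer, intended_parent):
    """Sketch of the proposed check: the form element pointer must be
    non-null and share a root with the intended parent."""
    if form_pointer is None:
        return False
    return form_pointer.root() is intended_parent.root()


html = Node("html")
body = Node("body", html)
form = Node("form", body)
template = Node("template", form)
# Assumption: template contents live in their own fragment, detached
# from the main tree.
template_contents = Node("#document-fragment")

in_form = should_associate(form, form)                   # parent is the form itself
in_template = should_associate(form, template_contents)  # parent is inside the template
```

In this model the check passes for an element inserted directly under the form and fails for one inserted into the template contents, because the two intended parents have different roots.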

- R. Niwa

On Aug 13, 2013, at 7:08 AM, Adam Klein <adamk at chromium.org> wrote:

> On Tue, Aug 6, 2013 at 4:47 PM, Adam Klein <adamk at chromium.org> wrote:
>> On Tue, Aug 6, 2013 at 4:38 PM, Jonas Sicking <jonas at sicking.cc> wrote:
>>> On Tue, Aug 6, 2013 at 4:27 PM, Adam Klein <adamk at chromium.org> wrote:
>>>> On Tue, Aug 6, 2013 at 4:21 PM, Jonas Sicking <jonas at sicking.cc> wrote:
>>>>> As I recall it (it has been ages since I dealt with this), the tricky case
>>>>> that you need to handle is this one:
>>>>> 
>>>>> http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2432
>>>>> 
>>>>> In this case, web compatibility requires that the <input> is
>>>>> associated with the form. Specifically hidden <input> elements would
>>>>> often end up moved, but still had to show up in form.elements as well
>>>>> as get submitted along with the form.
>>>> 
>>>> That case definitely makes sense to me, and I think it's fine to keep
>>>> that behavior for compat. The only one I'm asking to change is the
>>>> case when the <input> and <form> end up in different trees.
>>> 
>>> Sure, as long as you come up with a formalized algorithm for when
>>> there is an association and when there isn't. Keep in mind that by the
>>> time that the input-element is inserted, the form-element might have
>>> been moved elsewhere. We likely don't need the association in that
>>> case, but detecting that that has happened sounds tricky.
>> 
>> My concrete proposal would be something like this:
>> 
>> In step 4 of http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#create-an-element-for-the-token,
>> add a requirement that "intended parent" and the "form element
>> pointer" be part of the same "home subtree" (defined at
>> http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#home-subtree).
> 
> For what it's worth, we're giving this a try in Blink
> (https://src.chromium.org/viewvc/blink?revision=155949&view=revision),
> as it's by far the safest fix for the related crashes. I'll update
> this thread if we run into any compat issues in the wild (or if we
> don't!).
> 
> - Adam
> 
>>> The way that Gecko currently works IIRC is that it creates the
>>> association any time it has seen a "<form>" without seeing a
>>> "</form>". And it breaks the association anytime an input-element's
>>> parent chain changes and the associated form-element is no longer in
>>> the parent chain.
>> 
>> This is basically the same thing Blink & WebKit do, with the caveat
>> that we also avoid associating <form>s with elements inside
>> <template>s (this is now reflected in step 4 of the algorithm, see
>> above).
>> 
>>> On a related note, when are you guys going to add a cycle collector or
>>> other not-plain-refcounting memory manager :-)
>> 
>> Yes, that would be nice :)
>> 
>> - Adam
>> 
>>> / Jonas
>>> 
>>>>> On Tue, Aug 6, 2013 at 2:01 PM, Adam Klein <adamk at chromium.org> wrote:
>>>>>> Hixie opened my eyes last week to parser-association behavior of the
>>>>>> sort found at http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2428.
>>>>>> In that case, an <input> in a detached tree is associated with a
>>>>>> <form> in the main document. This causes badness in WebKit and Blink
>>>>>> because the association between the <form> and the <input> (e.g., as
>>>>>> exposed in the HTMLFormElement.elements collection) is only weakly
>>>>>> held to avoid reference loops (and thus memory leaks). And that
>>>>>> weakness occasionally results in crashes when one of these objects is
>>>>>> collected before the other.
>>>>>> 
>>>>>> While all modern HTML parser implementations I tested seemed to agree
>>>>>> on their treatment of the above example (they all return "1" as
>>>>>> elements.length), this feature doesn't strike me as terribly useful.
>>>>>> And for what it's worth, it doesn't seem to be present in legacy IE.
>>>>>> 
>>>>>> I'm interested in what others would think about changing the parser to
>>>>>> only associate a <form> with an <input> if both are in the same "home
>>>>>> subtree" (http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#home-subtree).
>>>>>> Or is there some deep web-compat reason for this parsing oddity?
>>>>>> 
>>>>>> - Adam
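
As a loose illustration of the weakly held association described in the original message, the sketch below models a form that tracks its controls through weak references. It is Python, not WebKit or Blink code. The Form and Element classes are hypothetical, and the immediate reclamation after del relies on CPython's reference counting, which is an assumption about the runtime.

```python
import weakref


class Element:
    """Hypothetical stand-in for a form control."""

    def __init__(self, name):
        self.name = name


class Form:
    """Hypothetical form that does not keep its controls alive."""

    def __init__(self):
        self._elements = []

    def associate(self, element):
        # Store only a weak reference, so no reference loop is created.
        self._elements.append(weakref.ref(element))

    def elements(self):
        # Skip controls that have already been collected.
        live = (ref() for ref in self._elements)
        return [element for element in live if element is not None]


form = Form()
control = Element("input")
form.associate(control)
count_while_alive = len(form.elements())

del control  # drop the only strong reference to the control
count_after_drop = len(form.elements())
```

Python's weak references are cleared safely once the target is gone, so this version simply stops listing the control. A weak association kept as a raw pointer has no such safety net, which is the kind of hazard the message attributes the crashes to.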


