[whatwg] Namespaces and tag names in the HTML parser

Peter Occil poccil14 at gmail.com
Thu Aug 1 13:25:48 PDT 2013


Sec. 12.2.4 (Tokenization) doesn't contain ambiguous "so-and-so element"
wordings; it involves not elements but tag tokens, which are not yet
assigned to a namespace.

Sec. 12.2.3 (Parse state) contains definitions used only in the tree
construction stage, and some of these definitions contain phrases like
"If node is a select element" and "If node is a tbody, thead, or tfoot
element," that follow one of the patterns given in the previous message
for checking whether an element has a certain name.  Section 12.3
(Serializing HTML fragments) contains phrases like that as well (e.g.
"If the node is a template element", "If it is a style, xmp, iframe,
noembed, or noframes element"), and they are subject to the same
ambiguity issues.
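
To make the ambiguity concrete, here is a rough sketch of the two ways
a phrase like "If node is a select element" could be read.  The Element
class and both helper functions are hypothetical names of my own, not
anything defined in the spec; only the two namespace URIs are real.

```python
from dataclasses import dataclass

HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"


@dataclass(frozen=True)
class Element:
    """Hypothetical stand-in for a node on the stack of open elements."""
    namespace: str
    local_name: str


def is_named(node, name):
    """Loose reading: compare the local name only."""
    return node.local_name == name


def is_html_element(node, name):
    """Strict reading: require the HTML namespace as well."""
    return node.namespace == HTML_NS and node.local_name == name


html_select = Element(HTML_NS, "select")
svg_select = Element(SVG_NS, "select")

# The two readings agree on an HTML element...
print(is_named(html_select, "select"), is_html_element(html_select, "select"))
# ...but disagree on a same-named element in the SVG namespace.
print(is_named(svg_select, "select"), is_html_element(svg_select, "select"))
```

The two readings differ only for elements outside the HTML namespace,
which is exactly where the wording leaves the intended check open.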

As for section 12.2.6, step 3 refers to "scripts" (styled orange in the
spec), which can be either HTML scripts or SVG scripts.  SVG scripts are
apparently processed independently from HTML scripts, and only HTML
scripts are added to the "list of scripts that will execute when the
document has finished parsing", so for clarity this step ought to say
"HTML scripts" rather than just "scripts".
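
As a rough illustration of that distinction (assuming, as above, that
only HTML scripts belong on the list), the following sketch filters a
set of script elements by namespace.  The function name and the
(namespace, local name) tuple representation are my own assumptions,
not spec terminology.

```python
HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"


def html_scripts_only(elements):
    """Keep only the (namespace, local_name) pairs that are HTML scripts."""
    return [el for el in elements if el == (HTML_NS, "script")]


# Hypothetical input: one HTML script and one SVG script.
candidates = [(HTML_NS, "script"), (SVG_NS, "script")]
print(html_scripts_only(candidates))
```

Under the unqualified wording "scripts", both entries would appear to
qualify; saying "HTML scripts" rules out the second one.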

Section 12.2.8 is not normative, so ambiguity issues there are moot.

The rest of section 12.2 doesn't discuss "so-and-so elements" at all.

Section 12.1 contains requirements on "documents, authoring tools, and
markup generators", not on parsers or user agents, so while this section
contains "so-and-so element" wordings, any ambiguity issues they may have
are rather benign in my opinion.

--Peter

-----Original Message----- 
From: Ian Hickson
Sent: Thursday, August 01, 2013 3:36 PM
To: Peter Occil
Cc: WHATWG
Subject: Re: Namespaces and tag names in the HTML parser

On Thu, 1 Aug 2013, Peter Occil wrote:
>
> Many of these cases occur in the normative portion of the tree
> construction stage.  Most of them involve checking whether an element
> (as opposed to a tag token) has a certain name:
>
> Accordingly, these cases are ambiguous: [...]

Thanks for listing these.

> As you can see, it's really only a few dozen ambiguous cases, not
> thousands.

Why are the ones not in the parser part of the spec not ambiguous?

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.' 



