[whatwg] Allowed characters in attribute names (was: Re: Steps for finding one or two numbers in a string)

Simon Pieters zcorpan at gmail.com
Tue Jun 12 18:02:54 PDT 2007

On Tue, 12 Jun 2007 22:28:52 +0200, Kristof Zelechovski  
<giecrilj at stegny.2a.pl> wrote:

> The specification enumerates all accepted element attributes.  Neither of
> them transgresses ASCII boundaries.  Since it can be directly inferred from
> the text, the explicit statement about that
> <http://www.whatwg.org/specs/web-apps/current-work/#attributes0>
> technically is not needed, although it does no harm either.

"Any (namespace-less) attribute may be specified on the embed element."
   -- http://www.whatwg.org/specs/web-apps/current-work/#the-embed

Since attribute names that use characters outside ASCII aren't parse  
errors, and any attribute is allowed on the embed element, the definition  
of "Attribute names" in #writing is incorrect.

I would suggest changing the definition in #writing to say that attribute
names can consist of any characters except whitespace, =, >, / and <.
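
As a rough illustration of the suggested rule (not spec text), a check could
look like the sketch below. The function name is hypothetical, "whitespace"
is assumed to mean tab, LF, FF, CR and space, and rejecting the empty string
is an extra assumption that the suggestion above does not state.

```python
# Characters excluded by the suggested definition. The whitespace set
# (space, tab, LF, FF, CR) is an assumption made for this sketch.
FORBIDDEN = set(" \t\n\f\r=>/<")


def is_proposed_attribute_name(name: str) -> bool:
    """Hypothetical check: non-empty and free of the excluded characters."""
    # Rejecting the empty string is an assumption, not part of the suggestion.
    return len(name) > 0 and not any(ch in FORBIDDEN for ch in name)
```

Under this rule a non-ASCII name such as "données" is accepted, while "a=b"
and "=" are rejected.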

That isn't quite right either, though. The parsing section allows
attribute names to begin with =. Given the following markup:

    <a =="">

Safari, Opera and Firefox drop the attribute. IE has an attribute with the  
name being the empty string and the value being ="". The HTML5 parsing  
spec says that there should be an attribute with the name = and the value  
the empty string. The "Before attribute name state" part of the parsing  
spec might have to be revisited.
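
To make the spec behaviour described above concrete, here is a much
simplified sketch of the attribute part of the tokenizer. It is not the
spec's state machine: the function name is hypothetical, only double-quoted
and unquoted values are handled, and the whitespace set is assumed. The one
point it is meant to show is that any character other than whitespace, / or
> starts an attribute name, including =.

```python
WHITESPACE = set(" \t\n\f\r")  # assumed whitespace set for this sketch


def sketch_attributes(tag: str):
    """Return (name, value) pairs from a start tag string such as '<a =="">'."""
    n = len(tag)
    i = 1  # skip "<"
    # Skip the tag name.
    while i < n and tag[i] not in WHITESPACE and tag[i] not in "/>":
        i += 1
    attrs = []
    while i < n:
        # "Before attribute name": skip whitespace and "/".
        while i < n and (tag[i] in WHITESPACE or tag[i] == "/"):
            i += 1
        if i >= n or tag[i] == ">":
            break
        # Anything else, including "=", starts a new attribute name.
        name = tag[i]
        i += 1
        # "Attribute name": read until whitespace, "=", "/" or ">".
        while i < n and tag[i] not in WHITESPACE and tag[i] not in "=/>":
            name += tag[i]
            i += 1
        # Skip whitespace between the name and a possible "=".
        while i < n and tag[i] in WHITESPACE:
            i += 1
        value = ""
        if i < n and tag[i] == "=":
            i += 1
            while i < n and tag[i] in WHITESPACE:
                i += 1
            if i < n and tag[i] == '"':
                # Double-quoted value: read up to the closing quote.
                end = tag.find('"', i + 1)
                if end == -1:
                    end = n
                value = tag[i + 1:end]
                i = end + 1
            else:
                # Unquoted value: read until whitespace or ">".
                start = i
                while i < n and tag[i] not in WHITESPACE and tag[i] != ">":
                    i += 1
                value = tag[start:i]
        attrs.append((name, value))
    return attrs
```

With this sketch, sketch_attributes('<a =="">') yields one attribute whose
name is = and whose value is the empty string, which matches the spec
behaviour described above rather than what any of the browsers do.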

Simon Pieters
