[whatwg] Discrepancies between HTML and ES rules for parsing an integer or float
Jonas Sicking
jonas at sicking.cc
Fri Aug 5 17:12:54 PDT 2011
On Fri, Aug 5, 2011 at 8:43 AM, Aryeh Gregor <ayg at aryeh.name> wrote:
> On Fri, Aug 5, 2011 at 1:57 AM, Jonas Sicking <jonas at sicking.cc> wrote:
>> It would make sense to me to match ES here. The main concern is of
>> course website compat. Could someone detail what the differences would
>> be compared to what implementations/the HTML5 spec do now?
>
> As far as I know, the only difference between the HTML and ES
> algorithms is handling of non-ASCII whitespace: ES treats it as
> whitespace, HTML does not. Specifically, ES treats StrWhiteSpaceChar
> as leading whitespace:
>
> http://es5.github.com/#x15.1.2.2
>
> That includes any Unicode "space separator" (Zs), which in particular
> changes over time (which seems to be Hixie's main objection IIUC).
> HTML uses "skip whitespace":
>
> http://www.whatwg.org/specs/web-apps/current-work/multipage/common-microsyntaxes.html#signed-integers
>
> Which, if you follow the breadcrumbs, means only [ \t\n\r\f]. So it's
> almost never going to make any difference in practice; we're talking
> only about corner cases.
>
> I have a simple test-case at
> <http://www.w3.org/Bugs/Public/show_bug.cgi?id=12296#c4> that shows
> all browsers strip leading \x0b (vertical tab) when converting DOM
> attributes to ints, which matches ES and not HTML.
>
>> For parsing floats this would not seem like a problem though, since
>> attributes containing floats are relatively new IIRC.
>
> Yes, that's correct. There's definitely no compat issue here with
> floats, but really there's not going to be any with ints either, since
> it's going to be exceedingly rare that anyone will put Unicode
> whitespace in DOM attributes that are reflected as integers and then
> rely on them working. So it's just a question of whether we'd prefer the
> algorithms to match or not.
Sounds good. Yes, I'm in favor of such a change.
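
For anyone who wants to see the corner case concretely, here is a rough sketch of the whitespace difference described above. The function name htmlParseInteger is made up for illustration, and the sign handling is a simplification rather than the exact spec text. The only point is that HTML's "skip whitespace" step strips just [ \t\n\r\f], while ES parseInt strips every StrWhiteSpaceChar.

```javascript
// Hypothetical helper: a simplified reading of HTML's "rules for parsing
// integers". Only the leading-whitespace step is meant to be faithful;
// sign handling is an assumption made for this sketch.
const HTML_SPACE_PREFIX = /^[ \t\n\f\r]*/;

function htmlParseInteger(input) {
  // Skip only the five HTML space characters.
  const rest = input.replace(HTML_SPACE_PREFIX, "");
  // Optional sign, then one or more ASCII digits; anything after is ignored.
  const match = /^[+-]?[0-9]+/.exec(rest);
  if (match === null) {
    return null; // stands in for the spec's "return an error"
  }
  return parseInt(match[0], 10);
}

// Inputs that differ only in their leading whitespace character.
const samples = [" \t42", "\x0b42", "\u00a042"];
for (const s of samples) {
  // JSON.stringify makes the invisible characters readable in the output.
  console.log(JSON.stringify(s), htmlParseInteger(s), parseInt(s, 10));
}
```

ASCII space and tab parse the same way under both algorithms. Vertical tab (\x0b) and no-break space (\u00a0) are whitespace to parseInt but not to the HTML sketch, which reports an error for them.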
/ Jonas