[whatwg] Question about the application/x-www-form-urlencoded encoding algorithm

Ian Hickson ian at hixie.ch
Fri Jul 23 16:11:03 PDT 2010


On Sun, 21 Mar 2010, NARUSE, Yui wrote:
> (2010/01/21 16:29), NARUSE, Yui wrote:
> > In 4.10.19.4 URL-encoded form data, The
> > application/x-www-form-urlencoded encoding algorithm,
> > it says:
> > 
> > > For each character in the entry's name and value, apply the following
> > > subsubsteps:
> > > 
> > > If the character isn't in the range U+0020, U+002A, U+002D, U+002E,
> > > U+0030 to U+0039, U+0041 to U+005A, U+005F, U+0061 to U+007A
> > > then replace the character with a string formed as follows:
> > > Start with the empty string, and then, taking each byte of the character
> > > when expressed in the selected character encoding in turn,
> > > append to the string a U+0025 PERCENT SIGN character (%) followed
> > > by two characters in the ranges U+0030 DIGIT ZERO (0) to
> > > U+0039 DIGIT NINE (9) and U+0041 LATIN CAPITAL LETTER A
> > > to U+0046 LATIN CAPITAL LETTER F representing the hexadecimal value
> > > of the byte (zero-padded if necessary).
> > > 
> > > If the character is a U+0020 SPACE character, replace it with a single
> > > U+002B PLUS SIGN character (+).
> > 
> > Does this mean that U+9670, encoded as "\x89\x41" in Shift_JIS, must be
> > encoded as "%89%41",
> > and not as "%89A"?
> 
> Read literally, the spec says that
> "\x89\x41" in Shift_JIS should be encoded as "%89%41".
> But current implementations encode it as "%89A".
> (I tested IE, Firefox, Opera, Chrome)
> 
> So this appears to be a bug in the spec.

This is now fixed in the spec, by the way.
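
For anyone following along, the two readings discussed above can be compared
with a small Python sketch. This is only an illustration under stated
assumptions, not the spec's algorithm or any browser's code: the function
names encode_per_character and encode_per_byte are made up for this example,
and the safe set is copied from the character list quoted above.

```python
# Bytes left unescaped, taken from the quoted list (U+0020 is handled separately).
SAFE = set(b"*-.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz")


def encode_per_character(text, encoding):
    """Literal reading of the quoted text (hypothetical helper).

    Any character outside the safe list has every one of its encoded
    bytes percent-escaped, even bytes that look like safe ASCII.
    """
    out = []
    for ch in text:
        if ch == " ":
            out.append("+")
        elif ord(ch) in SAFE:
            out.append(ch)
        else:
            out.append("".join("%%%02X" % b for b in ch.encode(encoding)))
    return "".join(out)


def encode_per_byte(text, encoding):
    """Behaviour reported in the thread for browsers (hypothetical helper).

    The text is encoded first, then each byte is escaped on its own,
    so a trail byte such as 0x41 is emitted as a plain "A".
    """
    out = []
    for b in text.encode(encoding):
        if b == 0x20:
            out.append("+")
        elif b in SAFE:
            out.append(chr(b))
        else:
            out.append("%%%02X" % b)
    return "".join(out)


print(encode_per_character("\u9670", "shift_jis"))
print(encode_per_byte("\u9670", "shift_jis"))
```

The two functions agree on pure ASCII input and differ only when a multi-byte
character has a byte that falls inside the safe list, which is the Shift_JIS
case raised in the thread.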

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

