[whatwg] Question about the application/x-www-form-urlencoded encoding algorithm

NARUSE, Yui naruse at airemix.jp
Sun Mar 21 05:27:09 PDT 2010


(2010/01/21 16:29), NARUSE, Yui wrote:
> In the "URL-encoded form data" section, the
> application/x-www-form-urlencoded encoding algorithm
> says:
>> For each character in the entry's name and value, apply the following subsubsteps:
>> If the character isn't in the range U+0020, U+002A, U+002D, U+002E,
>> U+0030 to U+0039, U+0041 to U+005A, U+005F, U+0061 to U+007A
>> then replace the character with a string formed as follows:
>> Start with the empty string, and then, taking each byte of the character
>> when expressed in the selected character encoding in turn,
>> append to the string a U+0025 PERCENT SIGN character (%) followed
>> by two characters in the ranges U+0030 DIGIT ZERO (0) to
>> U+0039 DIGIT NINE (9) and U+0041 LATIN CAPITAL LETTER A to
>> U+0046 LATIN CAPITAL LETTER F representing the hexadecimal value
>> of the byte (zero-padded if necessary).
>> If the character is a U+0020 SPACE character, replace it with a single U+002B PLUS SIGN character (+).
> Does this mean that U+9670, which is "\x89\x41" in Shift_JIS, must be
> encoded as "%89%41",
> and not as "%89A"?

As written, the spec says that
"\x89\x41" in Shift_JIS should be encoded as "%89%41".
But current implementations encode it as "%89A".
(I tested IE, Firefox, Opera, and Chrome.)

So this appears to be a bug in the spec: the escaping decision is made
per character, while implementations make it per encoded byte.
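
To make the difference concrete, here is a minimal Python sketch of the two
readings. It is only an illustration: the function names are hypothetical, the
"per byte" variant is my assumption about what the tested browsers do (based
only on the output above), and neither function is taken from the spec or from
any implementation.

```python
# Bytes that are left unescaped: "*", "-", ".", "_", 0-9, A-Z, a-z
# (U+0020 SPACE is handled separately and becomes "+").
SAFE = (set(b"*-._")
        | set(range(0x30, 0x3A))
        | set(range(0x41, 0x5B))
        | set(range(0x61, 0x7B)))


def encode_per_character(text, encoding):
    """Reading of the spec text: decide per character, then escape
    every byte of a character that is not in the safe set."""
    out = []
    for ch in text:
        if ch == " ":
            out.append("+")
        elif ord(ch) in SAFE:
            out.append(ch)
        else:
            out.append("".join("%%%02X" % b for b in ch.encode(encoding)))
    return "".join(out)


def encode_per_byte(text, encoding):
    """Assumed browser behavior: encode the whole string first, then
    decide per byte, so a trail byte such as 0x41 stays as "A"."""
    out = []
    for b in text.encode(encoding):
        if b == 0x20:
            out.append("+")
        elif b in SAFE:
            out.append(chr(b))
        else:
            out.append("%%%02X" % b)
    return "".join(out)


print(encode_per_character("\u9670", "shift_jis"))  # %89%41
print(encode_per_byte("\u9670", "shift_jis"))       # %89A
```

The two functions agree for pure ASCII input; they differ only when a
multi-byte character contains a byte that falls in the safe set.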

NARUSE, Yui  <naruse at airemix.jp>

