[whatwg] Editorial: ASCII case-insensitive string comparison

Alex Bishop alexbishop at gmail.com
Sat May 12 11:31:27 PDT 2012

On 12/05/2012 13:47, Øistein E. Andersen wrote:
> When I read Anne van Kesteren's Encoding specification recently, I
> came across the following definition, borrowed from HTML5:
>> Comparing two strings in an ASCII case-insensitive manner means
>> comparing them exactly, code point for code point, except that the
>> characters in the range U+0041 to U+005A (i.e. LATIN CAPITAL LETTER
>> A to LATIN CAPITAL LETTER Z) and the corresponding characters in
>> the range U+0061 to U+007A (i.e. LATIN SMALL LETTER A to LATIN
>> SMALL LETTER Z) are considered to also match.
> The construction ‘are considered to also match’ seems awkward here
> since the intended meaning is clearly not that the characters match
> in addition to doing something else like in ‘I don’t just want you to
> laugh but to also sing along’ or ‘our face/tongue system allow[s] us
> to talk and eat—but also to sing and act’.

Sure they do. They match in addition to the usual matching rules (i.e. 
"comparing them exactly, code point for code point").

A pair of characters match if they have the same code point. A pair of
characters also match if one is an ASCII upper-case character and the
other is the equivalent ASCII lower-case character (or vice versa).
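
As a rough sketch of that reading (in Python; the function names are my
own and do not come from the spec), the two rules combine like this:

```python
def ascii_lower(ch: str) -> str:
    """Map U+0041..U+005A onto U+0061..U+007A; leave every other code point alone."""
    if "A" <= ch <= "Z":
        return chr(ord(ch) + 0x20)
    return ch


def ascii_case_insensitive_match(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings code point for code point."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        # Rule 1: the characters match if they have the same code point.
        # Rule 2: they also match if they differ only in ASCII case.
        if x != y and ascii_lower(x) != ascii_lower(y):
            return False
    return True
```

Under this sketch "UTF-8" and "utf-8" match, while U+212A KELVIN SIGN
and "k" do not, because only the 26 ASCII letters get the extra rule.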


Alex Bishop
alexbishop at gmail.com
