[whatwg] Editorial: ASCII case-insensitive string comparison

Øistein E. Andersen liszt at coq.no
Sat May 12 05:47:02 PDT 2012


When I read Anne van Kesteren's Encoding specification recently, I came across the following definition, borrowed from HTML5:

> Comparing two strings in an ASCII case-insensitive manner means comparing them exactly, code point for code point, except that the characters in the range U+0041 to U+005A (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z) and the corresponding characters in the range U+0061 to U+007A (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z) are considered to also match.


The construction ‘are considered to also match’ seems awkward here, since the intended meaning is clearly not that the characters match in addition to doing something else, as in ‘I don’t just want you to laugh but to also sing along’ or ‘our face/tongue system allow[s] us to talk and eat—but also to sing and act’.

The most natural place for ‘also’ is probably in front of ‘considered’ (yielding ‘are also considered to match’).

(Another solution would be to remove the need for ‘also’ by rewording the phrase, for instance as ‘except that the characters in the range U+0041 to U+005A ([...] A to [...] Z) are considered equivalent to the corresponding characters in the range U+0061 to U+007A ([... a] to [... z])’.)
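
For what it is worth, the ‘considered equivalent’ reading can be sketched in a few lines of Python. This is only an illustration of how I understand the definition, not text from either specification; the function names ascii_lower and ascii_case_insensitive_equal are my own, and folding both strings to lowercase before comparing is just one possible way to implement it.

```python
def ascii_lower(s: str) -> str:
    """Map U+0041..U+005A to U+0061..U+007A; leave every other code point alone."""
    # Hypothetical helper: deliberately avoids str.lower(), which also folds
    # non-ASCII characters (e.g. U+212A KELVIN SIGN becomes "k").
    return "".join(chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in s)


def ascii_case_insensitive_equal(a: str, b: str) -> bool:
    """Compare code point for code point, treating A-Z as equivalent to a-z."""
    return ascii_lower(a) == ascii_lower(b)
```

With this sketch, "UTF-8" and "utf-8" compare as equal, whereas "É" and "é" do not, since only the 26 ASCII letters are folded.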

Øistein E. Andersen

