[whatwg] iso-2022-jp and octets over 0x7E

Anne van Kesteren annevk at opera.com
Sun Jan 8 11:49:56 PST 2012

On Sun, 08 Jan 2012 15:32:47 +0100, Anne van Kesteren <annevk at opera.com> wrote:
> On Sun, 08 Jan 2012 01:37:14 +0100, NARUSE, Yui <naruse at airemix.jp>  
> wrote:
>> == iso-2022-jp
>> === The to Unicode algorithm
>> ==== Based on iso-2022-jp state
>> ===== ASCII state
>> ====== Based on octet:
>> ======= Otherwise
>>> If the fatal flag is set, return failure.
>>> Otherwise, emit the fallback code point.
>> Just FYI, IE and Opera show these bytes as Katakana.
>> If the octet is greater than 0xA0 and less than 0xE0, the value is
>> octet + 0xFEC0.
>> Moreover, IE shows any shift_jis characters here.
>> It seems that IE uses the same converter for both iso-2022-jp and shift_jis.
> I have filed a bug on Opera to become more strict like WebKit/Gecko. If
> there is some evidence that approach is wrong though, we can turn it  
> around.
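
For reference, the single-octet rule quoted above (octet + 0xFEC0 for octets between 0xA0 and 0xE0, exclusive) can be sketched as follows. This is only an illustration of the arithmetic, not code from any browser; the function name is hypothetical.

```python
from typing import Optional


def octet_to_halfwidth_katakana(octet: int) -> Optional[str]:
    """Apply the quoted rule: octets in 0xA1..0xDF map to octet + 0xFEC0.

    Hypothetical helper for illustration only. Returns None for octets
    outside the range described in the quoted message.
    """
    if 0xA0 < octet < 0xE0:
        # 0xA1 + 0xFEC0 = 0xFF61 and 0xDF + 0xFEC0 = 0xFF9F, which is the
        # Unicode halfwidth Katakana range.
        return chr(octet + 0xFEC0)
    return None
```

For example, octet 0xB1 gives U+FF71 (HALFWIDTH KATAKANA LETTER A), which matches what shift_jis assigns to that single byte.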

So just to be sure I checked again, and in Opera you can only get the
"special" single-octet behavior if you activate a particular state first. If
you are in the ASCII state, Opera will simply emit the octet unless it is
0x1B (ESC). So maybe there is a system font that does something special for
those characters? Or maybe you meant something else?
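
To make the difference concrete, here is a rough sketch of the two ASCII-state behaviors discussed in this thread. It is a simplification under several assumptions: the function names are hypothetical, the fallback code point is taken to be U+FFFD, the "Otherwise" branch is taken to cover octets over 0x7E (per the subject line), "emit the octet" is read as emitting the code point with the same value, and escape sequence handling is not modeled at all.

```python
from typing import Optional

ESC = 0x1B
FALLBACK = "\uFFFD"  # assumption: the fallback code point is U+FFFD


class DecodeFailure(Exception):
    """Stands in for 'return failure' in the quoted algorithm."""


def strict_ascii_state(octet: int, fatal: bool = False) -> Optional[str]:
    """Sketch of the quoted 'Otherwise' branch (hypothetical name)."""
    if octet == ESC:
        return None  # escape sequence handling is not modeled here
    if octet > 0x7E:  # assumption: boundary taken from the subject line
        if fatal:
            raise DecodeFailure(hex(octet))
        return FALLBACK
    return chr(octet)


def lenient_ascii_state(octet: int) -> Optional[str]:
    """Sketch of the Opera behavior described above (hypothetical name)."""
    if octet == ESC:
        return None  # escape sequence handling is not modeled here
    # Assumption: "emit the octet" means the code point of the same value.
    return chr(octet)
```

Under these assumptions, octet 0xB1 yields U+FFFD in the strict sketch and U+00B1 in the lenient one; neither yields Katakana.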

Anne van Kesteren
