[whatwg] Encoding: big5 and big5-hkscs

Philip Jägenstedt philipj at opera.com
Fri Apr 6 14:03:22 PDT 2012


On Fri, 06 Apr 2012 15:42:26 +0200, Philip Jägenstedt <philipj at opera.com>  
wrote:

> These are the ranges that need more investigation.

Sorry for the monologue, but investigate I did. These are the interesting  
ones:

C6CF =>
opera-hk: U+FFFD �
firefox: U+5EF4 廴
chrome: U+F6DF 
firefox-hk: U+5EF4 廴
opera: U+2F35 ⼵
chrome-hk: U+FFFD �
internetexplorer: U+F6DF 

C6D3 =>
opera-hk: U+FFFD �
firefox: U+65E0 无
chrome: U+F6E3 
firefox-hk: U+65E0 无
opera: U+2F46 ⽆
chrome-hk: U+FFFD �
internetexplorer: U+F6E3 

C6D5 =>
opera-hk: U+FFFD �
firefox: U+7676 癶
chrome: U+F6E5 
firefox-hk: U+7676 癶
opera: U+2F68 ⽨
chrome-hk: U+FFFD �
internetexplorer: U+F6E5 

C6D7 =>
opera-hk: U+FFFD �
firefox: U+96B6 隶
chrome: U+F6E7 
firefox-hk: U+96B6 隶
opera: U+2FAA ⾪
chrome-hk: U+FFFD �
internetexplorer: U+F6E7 

C6DE =>
opera-hk: U+FFFD �
firefox: U+3003 〃
chrome: U+F6EE 
firefox-hk: U+3003 〃
opera: U+3003 〃
chrome-hk: U+FFFD �
internetexplorer: U+F6EE 

C6DF =>
opera-hk: U+FFFD �
firefox: U+4EDD 仝
chrome: U+F6EF 
firefox-hk: U+4EDD 仝
opera: U+4EDD 仝
chrome-hk: U+FFFD �
internetexplorer: U+F6EF 

In the first 4, Opera uses a compatibility code point (a Kangxi radical)
instead of the canonical one. The final 2 are PUA (chrome and
internetexplorer) vs. a proper code point (firefox and opera), although
they render the same on my computer. In all 6 cases, firefox and
firefox-hk are correct.
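
For anyone who wants to re-check that classification, here is a rough
Python sketch. It assumes that "compatibility code point" can be tested
with NFKC normalization and that "PUA" means general category Co; the
helper names are made up for illustration, and the code points are the
ones listed above.

```python
import unicodedata

# (Big5 bytes, opera code point, firefox code point), copied from the list above.
OPERA_VS_FIREFOX = [
    (0xC6CF, 0x2F35, 0x5EF4),
    (0xC6D3, 0x2F46, 0x65E0),
    (0xC6D5, 0x2F68, 0x7676),
    (0xC6D7, 0x2FAA, 0x96B6),
]

# (Big5 bytes, chrome/internetexplorer code point, firefox/opera code point).
PUA_VS_PROPER = [
    (0xC6DE, 0xF6EE, 0x3003),
    (0xC6DF, 0xF6EF, 0x4EDD),
]


def is_compat_variant(variant, canonical):
    """True if NFKC normalization folds `variant` into `canonical`."""
    return unicodedata.normalize("NFKC", chr(variant)) == chr(canonical)


def is_private_use(code_point):
    """True if the code point has general category Co (private use)."""
    return unicodedata.category(chr(code_point)) == "Co"


for big5, opera, firefox in OPERA_VS_FIREFOX:
    folded = is_compat_variant(opera, firefox)
    print(f"{big5:04X}: U+{opera:04X} folds to U+{firefox:04X}: {folded}")

for big5, pua, proper in PUA_VS_PROPER:
    print(f"{big5:04X}: U+{pua:04X} is private use: {is_private_use(pua)}")
```

This only classifies the differences; it says nothing about which
mapping real pages depend on.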

I manually added the above 6 mappings and the 4 multi-code point mappings  
from HKSCS-2008 to <https://gitorious.org/whatwg/big5>.

There are 29 mappings to U+003F (?) in IE that no other browser has. The  
remaining mappings are to PUA or U+FFFD in all browsers, which appears to  
simply be an artifact of the way the mapping is done internally. Mapping  
these to U+FFFD unless anyone finds pages using these byte sequences seems  
the only sane option.
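
As a minimal sketch of that fallback rule (not the actual table or any
browser's decoder): look the byte pair up in a mapping and return U+FFFD
when nothing is defined. The table below is a toy containing only the six
mappings from this message, and decode_pair is a hypothetical helper name.

```python
REPLACEMENT = "\ufffd"

# Toy table: only the six mappings discussed in this message.
# The real table would have thousands of entries.
BIG5_TABLE = {
    0xC6CF: "\u5ef4",
    0xC6D3: "\u65e0",
    0xC6D5: "\u7676",
    0xC6D7: "\u96b6",
    0xC6DE: "\u3003",
    0xC6DF: "\u4edd",
}


def decode_pair(lead, trail, table=BIG5_TABLE):
    """Map a lead/trail byte pair to a string, or U+FFFD if undefined."""
    return table.get((lead << 8) | trail, REPLACEMENT)


print(decode_pair(0xC6, 0xCF))           # defined in the toy table
print(repr(decode_pair(0xC6, 0xD0)))     # not in the toy table, so U+FFFD
```

If pages relying on one of the undefined byte sequences do turn up,
supporting them is a matter of adding an entry to the table; the decoder
itself stays the same.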

So, <http://people.opera.com/philipj/2012/04/06/big5-foolip.txt> is the  
mapping I suggest, with 18594 defined mappings and 1188 U+FFFD.
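
A quick sanity check on those two counts, as a sketch: their sum equals
126 x 157, which is what a full grid of lead bytes 0x81-0xFE and trail
bytes 0x40-0x7E plus 0xA1-0xFE would give. Those byte ranges are an
assumption made for this check; they are not stated anywhere above.

```python
DEFINED = 18594           # defined mappings, from the message
REPLACEMENT_ONLY = 1188   # entries mapped to U+FFFD, from the message

# Assumed byte ranges (not stated in the message).
LEAD_BYTES = range(0x81, 0xFF)                                    # 0x81..0xFE
TRAIL_BYTES = list(range(0x40, 0x7F)) + list(range(0xA1, 0xFF))   # 0x40..0x7E, 0xA1..0xFE

total = DEFINED + REPLACEMENT_ONLY
grid = len(LEAD_BYTES) * len(TRAIL_BYTES)

print(total, grid, total == grid)
```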

-- 
Philip Jägenstedt
Core Developer
Opera Software


