[whatwg] Encoding: big5 and big5-hkscs

Anne van Kesteren annevk at opera.com
Wed Mar 28 06:36:35 PDT 2012

On Wed, 28 Mar 2012 12:18:41 +0200, Anne van Kesteren <annevk at opera.com> wrote:
> I'm not sure what to do with big5 and big5-hkscs. After generating all  
> possible byte sequences (lead bytes 0x81 to 0xFE, trail bytes 0x40 to  
> 0x7E and 0xA1 to 0xFE) and getting the code points for those in various  
> browsers there does not seem to be that much interoperability.
> http://html5.org/temp/big5.json has all the code points for Internet  
> Explorer ("internetexplorer", same for big5 and hkscs), Firefox  
> ("firefox" and "firefox-hk"), Opera ("opera" and "opera-hk"), and Chrome  
> ("chrome" and "chrome-hk"). "internetexplorer" and "chrome" are quite  
> close, the rest is a little further apart.
> Some help as to how best to proceed would be appreciated.
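
The enumeration described above can be sketched roughly as follows. This
is only an illustration of the stated byte ranges, not the script that
produced big5.json; the function name is made up for the example.

```python
def big5_byte_sequences():
    """Yield every two-byte sequence in the ranges quoted above."""
    # Lead bytes 0x81 to 0xFE (126 values).
    lead_bytes = range(0x81, 0xFE + 1)
    # Trail bytes 0x40 to 0x7E and 0xA1 to 0xFE (63 + 94 = 157 values).
    trail_bytes = list(range(0x40, 0x7E + 1)) + list(range(0xA1, 0xFE + 1))
    for lead in lead_bytes:
        for trail in trail_bytes:
            yield bytes([lead, trail])


sequences = list(big5_byte_sequences())
print(len(sequences))  # 126 * 157 = 19782
```

Each sequence would then be fed to a browser's decoder and the resulting
code point recorded.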

To give some more context, IE treats big5 and big5-hkscs identically. Of
the 19782 possible byte sequences, 6217 map to the Private Use Area
(PUA) in IE. Chrome does the same for big5, but has a different mapping
for big5-hkscs. To deal with HKSCS, Microsoft released this patch:
http://www.microsoft.com/hk/hkscs/ People living in the Hong Kong area
are expected to have it installed, so that the PUA code points map to
different glyphs. I'm not sure what the situation is like on Mac or
Linux, but given the market share statistics I saw, the market is pretty
heavily dominated by Microsoft.
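
A count like the PUA figure above could be reproduced with something
along these lines. The layout of big5.json is not described here, so the
flat list of integer code points (with None for unmapped sequences) is
an assumption, as are the function names.

```python
def is_private_use(code_point):
    """Return True if code_point lies in one of Unicode's Private Use Areas."""
    return (
        0xE000 <= code_point <= 0xF8FF          # BMP Private Use Area
        or 0xF0000 <= code_point <= 0xFFFFD     # Supplementary PUA-A
        or 0x100000 <= code_point <= 0x10FFFD   # Supplementary PUA-B
    )


def count_private_use(code_points):
    """Count the PUA entries in a list of code points, skipping None."""
    return sum(1 for cp in code_points if cp is not None and is_private_use(cp))


# Hypothetical sample, not data from big5.json.
sample = [0x4E00, 0xE000, 0xF8FF, None, 0xF900, 0x100000]
print(count_private_use(sample))  # 3
```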

Gecko seems to use a combination of things as documented in  
https://bugzilla.mozilla.org/show_bug.cgi?id=310299 though it is unclear  
how successful that approach is.

There are also various threads online such as  
that seem to indicate "pages in the Hong Kong area" are not using the  
big5-hkscs label and therefore rely on what IE and Chrome do for big5 and  
rely on users having the compatible fonts.

Anne van Kesteren
