[whatwg] Superset encodings [Re: ISO-8859-* and the C1 control range]

NARUSE, Yui naruse at airemix.jp
Thu Oct 22 14:33:27 PDT 2009



Øistein E. Andersen wrote:
> On 22 Oct 2009, at 17:15, NARUSE, Yui wrote:
> 
>> First, JIS-X-0208 and JIS-X-0212 are not in IANA Charsets,
> 
> I am not sure what you mean; they are both listed at
> <http://www.iana.org/assignments/character-sets>:
> 
> Name: JIS_C6226-1983                                     [RFC1345,KXS2]
> MIBenum: 63
> Source: ECMA registry
> Alias: iso-ir-87
> Alias: x0208
> Alias: JIS_X0208-1983
> Alias: csISO87JISX0208

Where does the name "JIS-X-0208" appear?

> Name: JIS_X0212-1990                                     [RFC1345,KXS2]
> MIBenum: 98
> Source: ECMA registry
> Alias: x0212
> Alias: iso-ir-159
> Alias: csISO159JISX02121990

Where does the name "JIS-X-0212" appear?
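
To make the point concrete, here is a minimal sketch that looks a label up against the two registry entries quoted above. The table contains only the names and aliases shown in this message; the function name and the case-insensitive exact comparison are my own assumptions for illustration, not a description of how any browser matches labels.

```python
# Names and aliases copied from the two registry entries quoted above.
REGISTRY = {
    "JIS_C6226-1983": ["iso-ir-87", "x0208", "JIS_X0208-1983", "csISO87JISX0208"],
    "JIS_X0212-1990": ["x0212", "iso-ir-159", "csISO159JISX02121990"],
}


def find_registered_name(label):
    """Return the registered Name whose name or alias equals label, else None.

    Assumption: labels are compared exactly, ignoring only letter case.
    """
    wanted = label.lower()
    for name, aliases in REGISTRY.items():
        if wanted in (n.lower() for n in [name, *aliases]):
            return name
    return None


if __name__ == "__main__":
    for label in ["JIS-X-0208", "JIS-X-0212", "JIS_X0208-1983", "x0212"]:
        print(label, "->", find_registered_name(label))
```

Under that assumption, the hyphenated spellings find no entry, while the registered aliases do.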

>> moreover, the correct names according to the specs are JIS X 0208 and JIS X 0212.
> 
> (Please excuse me for not always paying due attention to such details in
> e-mails. Of course, the specifications should follow either IANA or the
> official standard as appropriate, depending on what it is referring to.)

That was not directed at you; the wording appears in section 4.2.5.5 of the
current HTML5 draft. That is why I raised it.

>> Anyway, most of the charsets defined in RFC 1345 are not clearly specified.
>> A conversion table between [those charsets and] Unicode is needed.
> 
> Quite.  Anne van Kesteren, I and several others are currently trying to
> document how browsers handle different encodings at
> <http://wiki.whatwg.org/wiki/Web_Encodings>, and defining mappings to
> Unicode is one of the goals.  Your contribution would be much appreciated.

ICU has a large set of tables, which is likely to cover many MS code pages.
(Of course, they should be verified.)
http://bugs.icu-project.org/trac/browser/data/trunk/charset/data/ucm

I also have a CP51932 table generated from the .NET Framework's converter.
http://nkf.sourceforge.jp/ucm/cp51932.ucm
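
For readers unfamiliar with the .ucm format, here is a rough sketch of reading the mapping lines of such a table into a bytes-to-code-point dictionary. It assumes the usual layout of a `CHARMAP` ... `END CHARMAP` section with lines of the form `<UXXXX> \xNN... |f`, and keeps only round-trip (`|0`) entries. The function name and the sample text are made up for illustration; the sample is not an excerpt from cp51932.ucm.

```python
import re

# One mapping line inside CHARMAP: <UXXXX> \xNN[\xNN...] |f
_MAPPING = re.compile(
    r"^<U([0-9A-Fa-f]{4,6})>\s+((?:\\x[0-9A-Fa-f]{2})+)\s+\|(\d)"
)
_BYTE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def parse_ucm(text):
    """Return {byte sequence: code point} for round-trip (|0) mappings."""
    table = {}
    in_charmap = False
    for line in text.splitlines():
        line = line.strip()
        if line == "CHARMAP":
            in_charmap = True
            continue
        if line == "END CHARMAP":
            in_charmap = False
            continue
        if not in_charmap:
            continue  # header lines such as <code_set_name> are ignored
        m = _MAPPING.match(line)
        if m and m.group(3) == "0":
            raw = bytes(int(h, 16) for h in _BYTE.findall(m.group(2)))
            table[raw] = int(m.group(1), 16)
    return table


# Hypothetical sample, written by hand for this sketch.
SAMPLE = r"""
<code_set_name> "sample"
CHARMAP
<U0041> \x41 |0
<U3042> \xA4\xA2 |0
<U00A5> \x5C |1
END CHARMAP
"""

if __name__ == "__main__":
    for raw, cp in parse_ucm(SAMPLE).items():
        print(raw.hex(), "-> U+%04X" % cp)
```

Entries with a fallback indicator other than `|0` are skipped here on purpose, since they are not round-trip mappings; a real comparison of tables would need to decide how to treat them.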

-- 
NARUSE, Yui  <naruse at airemix.jp>

