[whatwg] Superset encodings [Re: ISO-8859-* and the C1 control range]
Ian Hickson
ian at hixie.ch
Tue Jul 7 01:25:58 PDT 2009
On Tue, 9 Jun 2009, Anne van Kesteren wrote:
> On Tue, 09 Jun 2009 01:42:57 +0200, Øistein E. Andersen <liszt at coq.no> wrote:
> > On 5 June 09, Anne van Kesteren wrote:
> >>
> >> Is the implication here that Shift_JIS and Shift-JIS are distinct
> >> [...]?
> >
> > No, Shift-JIS and Windows-932 are commonly used names/labels for the
> > encodings that are registered as Shift_JIS and Windows-31J
> > (respectively) in the IANA charset registry. Sorry for the confusion
> > caused.
>
> So should HTML5 mention that Windows-932 maps to Windows-31J? (It does
> not appear in the IANA registry.)
I've added this mapping too, just in case.
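
As a rough illustration of what such a label mapping amounts to, here is a minimal Python sketch. The two label/encoding pairs are the ones mentioned above; the table name, the function name, and the case-insensitive, whitespace-trimming comparison are assumptions made for the example, not something the spec text is quoted as saying.

```python
# Hypothetical lookup table from unregistered labels to IANA-registered names.
# Only these two pairs come from the thread; everything else is assumed.
UNREGISTERED_LABELS = {
    "shift-jis": "Shift_JIS",
    "windows-932": "Windows-31J",
}


def resolve_label(label: str) -> str:
    """Return the registered name for a known label, else the label itself."""
    # Assumption: labels are compared case-insensitively, ignoring
    # surrounding whitespace.
    cleaned = label.strip()
    return UNREGISTERED_LABELS.get(cleaned.lower(), cleaned)


if __name__ == "__main__":
    for candidate in ("Windows-932", " Shift-JIS ", "UTF-8"):
        print(repr(candidate), "->", resolve_label(candidate))
```

Labels that are not in the table are passed through unchanged, so the sketch only models the extra mappings and not the IANA registry itself.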
On Tue, 9 Jun 2009, Øistein E. Andersen wrote:
>
> That is an interesting question. My (apparently wrong) understanding was
> that the table was merely supposed to provide mappings between
> encodings, since such mappings are inappropriate in non-HTML contexts
> and cannot be added to the IANA registry. It might be useful to
> include a set of MIME charset strings which cannot be or have not yet
> been registered (e.g., x-x-big5, x-sjis, windows-932) as well as
> information on how CJK character sets are implemented in practice, both
> of which seem to be necessary for compatibility.
>
> Such information does not fit comfortably in the current table, though.
Added x-sjis. What are the other mappings that would be good?
On Tue, 9 Jun 2009, Øistein E. Andersen wrote:
> >
> > I believe you misunderstand the purpose of this table. The idea is to
> > give a mapping of _labels_ to encodings, not encodings to encodings.
> > I've clarified the text to this effect.
>
> You seem to have added "specified by a label" to the phrase which now
> reads "an encoding specified by a label given in the first column of the
> following table" without changing the column heading ("Input encoding")
> and without defining what a "label" actually is. The reference to
> "encoding aliasing" is also intact, which seems misleading if the table
> is not supposed to map between encodings.
I've split the table in two to avoid this issue.
Earlier, you wrote:
>
> GB2312 and GB_2312-80 technically refer to the *character set* GB
> 2312-80, [...]. GBK, on the other hand, is an encoding.
As far as I can tell, GB2312 and GB_2312-80 are two different encodings
according to IANA.
On Wed, 10 Jun 2009, Anne van Kesteren wrote:
>
> I would prefer them being added to the IANA registry.
I've noted that I should do that.
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'