[whatwg] API for encoding/decoding ArrayBuffers into text

Mon Mar 19 18:14:10 PDT 2012

On Mon, Mar 19, 2012 at 7:33 PM, Jonas Sicking <jonas at sicking.cc> wrote:

> What value are we adding, and to whom, by keeping the list the
> smallest it can be, even when that means keeping the lists of
> supported encodings different between different APIs?
>

Not needlessly extending support for legacy encodings means there's no
chance of this API inadvertently causing proliferation of those encodings.
That benefits everyone who might come in contact with that data, and
increases the odds of being able to remove some of those encodings from the
platform entirely.

> The concrete costs are that authors will have to learn which encodings
> work where, and that implementations need to keep separate lists of
> supported encodings in different APIs.
>

Authors don't need to learn that; all they care about is whether the
encoding they're trying to use works.  Nobody memorizes lists of encodings.

Keeping a list of supported encodings is a trivial cost.

Extending the list also means that browsers need to be able to encode to
each of these encodings, and encoding for all of them needs to be specified,
which I think is currently unneeded.  (Unless we go the asymmetric
encoding/decoding route, supporting only decoders for legacy charsets.  If
this is the only reason that'd all have to be specified, that's probably
another reason to consider it...)
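
To make the asymmetric route concrete, here is a rough sketch in Python.
Everything in it is an assumption for illustration: the function names, the
two lists, and the choice of encodings are hypothetical and not taken from
any proposal.  The point is only that the decode list can be wider than the
encode list.

```python
# Hypothetical policy tables; the contents are illustrative assumptions only.
DECODABLE = {"utf-8", "utf-16le", "windows-1252", "shift_jis"}
ENCODABLE = {"utf-8", "utf-16le"}  # no legacy encodings on the encode side


def decode(data: bytes, label: str) -> str:
    """Decode bytes using any encoding on the (wider) decode list."""
    name = label.strip().lower()
    if name not in DECODABLE:
        raise ValueError(f"decoding from {label!r} is not supported")
    return data.decode(name)


def encode(text: str, label: str) -> bytes:
    """Encode text using only the (narrower) encode list."""
    name = label.strip().lower()
    if name not in ENCODABLE:
        raise ValueError(f"encoding to {label!r} is not supported")
    return text.encode(name)
```

With this split, existing Shift_JIS data can still be read, but new data
can only be written as UTF-8 or UTF-16, so the API can't be used to produce
more legacy-encoded content.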

Supporting streaming decoding for modal encodings, such as ISO-2022-CN,
might also be a burden: it means implementations would be required to
support stateful, incremental decoding for that charset, which is more
complicated than most encodings (which are stateless).  Many
implementations probably do support that, but I don't think it's currently
mandatory, and it would complicate any streaming API.  Stateful encodings
need to die even more than other legacy encodings; I hope this API doesn't
have to support any of them.

-- 
Glenn Maynard