[whatwg] API for encoding/decoding ArrayBuffers into text

Fri Mar 16 12:22:13 PDT 2012

On Fri, 16 Mar 2012, Glenn Maynard wrote:

> On Fri, Mar 16, 2012 at 11:19 AM, Joshua Bell <jsbell at chromium.org> wrote:
>
>> And just to be clear, the use case is decoding data formats where string
>> fields are variable length null terminated.
>>
>
> A concrete example is ZIP central directories.
>
>> I think we want both encoding and destination to be optional. That leads us
>> to an API like:
>>
>> out_dict = stringEncoding.encode("string", opt_dict);
>>
>> .. where both out_dict and opt_dict are WebIDL Dictionaries:
>>
>> opt_dict keys: view, encoding
>>
>
>
>> out_dict keys: charactersWritten, byteWritten, output
>>
>
> The return value should just be a [NoInterfaceObject] interface.
> Dictionaries are used for input fields.
>
> Something that came up on IRC that we should spend some time thinking
> about, though: Is it actually important to be able to encode into an
> existing buffer?  This may be a premature optimization.  You can always
> encode into a new buffer, and--if needed--copy the result where you need it.
>
> If we don't support that, most of this extra stuff in encode() goes away.

Yes, I think we should focus on getting feature parity with e.g. Python
first -- i.e. not worry about decoding into existing buffers -- and add
the fancier features later if we find that there are actual use cases
where avoiding the copy is critical. This should allow us to focus on
getting the API right for the common case.
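
To make the "encode into a new buffer, then copy" approach concrete, here
is a rough sketch. It assumes an environment that provides TextEncoder
(which only produces UTF-8); encodeAndCopy is a hypothetical helper name
used for illustration, not part of any proposal in this thread.

```javascript
// Hypothetical helper: encode into a fresh buffer, then copy the bytes
// into an existing destination view at the given offset.
function encodeAndCopy(str, dest, offset = 0) {
  // TextEncoder always allocates and returns a new Uint8Array (UTF-8).
  const bytes = new TextEncoder().encode(str);
  if (offset + bytes.length > dest.length) {
    throw new RangeError("destination view is too small");
  }
  // The one extra copy that an encode-in-place API would avoid.
  dest.set(bytes, offset);
  return bytes.length; // number of bytes written
}

const existing = new Uint8Array(16);
const written = encodeAndCopy("héllo", existing, 4);
```

The cost is one additional copy of the encoded bytes, which is the
overhead under discussion.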

> If in-place decoding isn't really needed, we could have:
>
> newView = str.encode("utf-8"); // or {encoding: "utf-8"}
> str2 = newView.decode("utf-8");
> len = newView.find(0); // replaces stringLength, searching for 0 in the
> view's type; you'd use Uint16Array for UTF-16
>
> and encodedLength() would go away.

This looks like a big win to me.
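
For the null-terminated use case, the simplified shape could be
approximated along these lines. This is only a sketch: it assumes
TextEncoder and TextDecoder are available, findTerminator is a
hypothetical stand-in for the proposed newView.find(0), and
decodeCString is likewise an illustrative name.

```javascript
// Hypothetical stand-in for the proposed view.find(0): index of the first
// zero element at or after `start`, or view.length if there is none.
function findTerminator(view, start = 0) {
  const i = view.indexOf(0, start);
  return i === -1 ? view.length : i;
}

// Decode one null-terminated string field starting at `start`.
function decodeCString(view, start = 0, encoding = "utf-8") {
  const end = findTerminator(view, start);
  // subarray() creates a view on the same buffer; no bytes are copied.
  return new TextDecoder(encoding).decode(view.subarray(start, end));
}

// Two null-terminated fields packed back to back.
const bytes = new TextEncoder().encode("name\0comment\0");
const firstEnd = findTerminator(bytes);
const first = decodeCString(bytes);
const second = decodeCString(bytes, firstEnd + 1);
```

Because indexOf compares whole elements, the same search over a
Uint16Array would look for a 16-bit zero, matching the suggestion above
for UTF-16.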