[whatwg] API for encoding/decoding ArrayBuffers into text
Glenn Maynard
glenn at zewt.org
Mon Mar 26 16:12:22 PDT 2012
On Mon, Mar 26, 2012 at 4:49 PM, Joshua Bell <jsbell at chromium.org> wrote:
> * A |stream| option, per the above
>
Does this make sense when you're using stream: false to flush the stream?
It's still a streaming operation. I guess it's "close enough".
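To make the flush question concrete, here is a minimal sketch of chunked decoding. It assumes the option takes the shape of a per-call `stream` flag on `decode()`, as in the `TextDecoder` that current browsers and Node.js provide, which may not match the draft being discussed here. `decodeChunks` is a hypothetical helper name.

```javascript
// Hypothetical helper: decode UTF-8 that arrives in several chunks.
function decodeChunks(chunks) {
  const decoder = new TextDecoder('utf-8');
  let text = '';
  for (const chunk of chunks) {
    // stream: true says more input may follow, so an incomplete multi-byte
    // sequence at the end of a chunk is buffered rather than replaced.
    text += decoder.decode(chunk, { stream: true });
  }
  // A final call without stream: true flushes whatever is still buffered.
  text += decoder.decode();
  return text;
}

// U+20AC (euro sign) is E2 82 AC in UTF-8; split it across two chunks.
const streamed = decodeChunks([
  new Uint8Array([0xE2, 0x82]),
  new Uint8Array([0xAC]),
]);
```

The final call is the "stream: false" case from the question above: it is still part of the streaming operation, it just marks the end of it.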
> * A |nullTerminator| option eliminates the need for a stringLength method
> (hasta la vista, baby!)
>
I strongly disagree with this change. It's much cleaner and more generic
for the decoding algorithm to not know anything about null terminators, and
to have separate general-purpose methods to determine the length of the
string (memchr/wmemchr analogs, which we should have anyway). We made this
simplification a long time ago--why did you resurrect this?
var array = new Int8Array(myArrayBuffer);
// indexOf here is the proposed memchr analog: same semantics as String.indexOf
var length = array.indexOf(0);
if (length != -1)
    array = array.subarray(0, length);
new TextDecoder('utf-8').decode(array);
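The same split between "find the terminator" and "decode" carries over to 16-bit code units (the wmemchr case). The following is only a sketch: `indexOfElement` and `untilTerminator` are hypothetical helper names, not part of any proposal, and the search is written as a plain loop so it does not depend on typed arrays having an `indexOf` method.

```javascript
// Hypothetical memchr/wmemchr analog: index of the first element equal to
// `value` in any typed array view, or -1 if it does not occur.
function indexOfElement(view, value) {
  for (let i = 0; i < view.length; i++) {
    if (view[i] === value) return i;
  }
  return -1;
}

// Hypothetical helper: a view of everything before the first zero element.
// subarray() shares storage with `view`, so nothing is copied.
function untilTerminator(view) {
  const end = indexOfElement(view, 0);
  return end === -1 ? view : view.subarray(0, end);
}

// UTF-8: "hi", a null terminator, then trailing junk.
const utf8Bytes = new Uint8Array([0x68, 0x69, 0x00, 0x7A]);
const utf8Text = new TextDecoder('utf-8').decode(untilTerminator(utf8Bytes));

// UTF-16LE: the same string as little-endian bytes, scanned 16 bits at a time.
// Comparing against 0 gives the same answer on either host endianness.
const utf16Bytes = new Uint8Array([0x68, 0x00, 0x69, 0x00, 0x00, 0x00, 0x7A, 0x00]);
const utf16Units = new Uint16Array(utf16Bytes.buffer);
const utf16Text = new TextDecoder('utf-16le').decode(untilTerminator(utf16Units));
```

The decoder never learns about terminators; the element width of the view passed to the search is the only thing that changes between the two cases.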
> * BOM handling needs to be resolved. The Encoding spec makes the encoding
> label secondary to the BOM. With this API it's unclear if that should be
> the case. Options include having a mismatching BOM throw, treating a
> mismatching BOM as a decoding error (i.e. fallback or throw, depending on
> options), or allow the BOM to actually switch the decoder used for this
> "stream" - possibly if-and-only-if the default encoding was specified.
>
The path of fewest errors is probably to have a BOM override the specified
UTF-16 endianness, so saying "UTF-16BE" just changes the default.
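As a rough illustration of "the label just changes the default", the sketch below picks the effective UTF-16 variant from the first two bytes. `resolveUtf16` is a hypothetical helper, not proposed API, and it assumes the input is a Uint8Array.

```javascript
// Hypothetical helper: choose the UTF-16 label for `bytes`. A BOM, if
// present, overrides the caller's requested endianness; otherwise the
// requested label is used as the default.
function resolveUtf16(bytes, defaultLabel) {
  if (bytes.length >= 2) {
    if (bytes[0] === 0xFF && bytes[1] === 0xFE) {
      return { label: 'utf-16le', bomLength: 2 };
    }
    if (bytes[0] === 0xFE && bytes[1] === 0xFF) {
      return { label: 'utf-16be', bomLength: 2 };
    }
  }
  return { label: defaultLabel, bomLength: 0 };
}

// Caller asked for big-endian, but the data starts with a little-endian BOM.
const withBom = resolveUtf16(new Uint8Array([0xFF, 0xFE, 0x68, 0x00]), 'utf-16be');
// No BOM: the caller's label stands.
const withoutBom = resolveUtf16(new Uint8Array([0x00, 0x68]), 'utf-16be');
```

With this rule a mismatched BOM is never an error; the label only matters for data that carries no BOM.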
An aside:
The TypedArray constructors have a depressing design bug: new
Int8Array(someOtherView) makes a copy of the data. It's nonsensical that
view constructors create a view when passed an ArrayBuffer, but a copy when
passed another view. Creating a view should create a *view* whenever it's
passed an object that already has ArrayBuffer-based storage, and making a
copy should have been its own operation.
This means we can't say "creating a view is cheap"; we have to qualify it:
"creating a view is cheap, as long as you're careful not to call a
constructor that makes a copy".
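The difference is easy to demonstrate. In this small sketch, a real view over the same storage has to be built from the other view's buffer, byteOffset and byteLength by hand; the variable names are illustrative only.

```javascript
const buffer = new ArrayBuffer(8);
const original = new Int8Array(buffer, 2, 4); // view of bytes 2..5

// Passing a view to the constructor allocates new storage and copies into it.
const copy = new Int8Array(original);

// Passing the underlying ArrayBuffer creates a view that shares storage.
// (For element types wider than one byte, the third argument is a length in
// elements: byteLength / BYTES_PER_ELEMENT.)
const view = new Int8Array(original.buffer, original.byteOffset, original.byteLength);

// Write through the original: only the true view observes the change.
original[0] = 42;
const copySeesWrite = copy[0] === 42;
const viewSeesWrite = view[0] === 42;
```

Only the second form is the cheap one; the first costs an allocation and a copy proportional to the size of the data.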
It's frustrating that we're now stuck with a confusing, inconsistent API
like this. I'm sure it's much too late to fix this properly, but hopefully
an option can be added to fix it, so a new TypedArray(TypedArray, {view:
true}) call actually creates a view.
--
Glenn Maynard