[whatwg] BinaryEncoding for Typed Arrays using window.btoa and window.atob

Tue Aug 13 11:28:28 PDT 2013

On Mon, Aug 12, 2013 at 4:50 PM, Glenn Maynard <glenn at zewt.org> wrote:

> On Mon, Aug 12, 2013 at 12:16 PM, Joshua Bell <jsbell at google.com> wrote:
>
>> To recap history: early iterations of the Encoding API proposal did have
>> base64 but it was removed with the suggestion to extend atob()/btoa()
>> instead, and due to the confusion around the encode/decode verbs. If the
>> APIs were something like StringToBytesConverter::convert() and
>> BytesToStringConverter::convert() it would make more sense for encoding of
>> both text (use StringToBytes) and binary data (use BytesToString).
>>
>
> I thought about suggesting something like "StringToBytes", but that seems
> less obvious for the (probably) more common usage of encoding/decoding a
> String, and it's still a bit off (though not *strictly* wrong) for
> converting to UTF-16, UTF-32, etc.  I tend to think the slightly
> unintuitive names of TextEncoder and TextDecoder aren't bad enough that
> it's worth renaming them.
>

For completeness, it's also worth bringing up
https://developer.mozilla.org/en-US/docs/Code_snippets/StringView, which
started this round of discussion (over on blink-dev). It is another, more
"neutral" API design for binary/string data interop. I haven't read it
deeply, but it looks like it doesn't handle the streaming case; it does,
however, explicitly tackle base64 without overloading text encoding methods.

>
>> While we're re-opening this can of worms, there's been a request to add
>> a flush() method to the TextEncoder/TextDecoder objects, which would behave
>> the same as calling encode(null, {stream: false}) / decode(null,
>> {stream:false}) but make the code more readable. This fails the "adding a
>> new method for something that behaves exactly like something we already
>> have" test. Opinions?
>>
>
> I think you only need to say encode() and decode(), which is less of a
> win, especially since creating two ways of doing the same thing means that
> people have to learn both ways.  Otherwise, they'll see code end with
> ".encode()" and not realize that it's the same as the ".finish()" they've
> been using.
>

True. (I need to go back through this and other feedback that's trickled in
to check whether I'm misrepresenting it, and whether there's anything else
lingering.)
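
To make the point about the final argument-less call concrete, here is a
minimal sketch. It assumes the TextDecoder shape that ships in current
browsers and Node (decode(chunk, {stream: true}) plus a bare decode() to
finish), which may differ in detail from the draft under discussion; the
variable names are only illustrative.

```typescript
// Assumption: a runtime with a global TextDecoder (modern browsers, Node).
// The UTF-8 bytes for "€" (0xE2 0x82 0xAC) are split across two chunks.
const decoder = new TextDecoder("utf-8");
const chunks: Uint8Array[] = [
  new Uint8Array([0x68, 0x69, 0xe2]), // "hi" plus the first byte of "€"
  new Uint8Array([0x82, 0xac]),       // the remaining two bytes of "€"
];

let streamed = "";
for (let i = 0; i < chunks.length; i++) {
  // stream: true keeps an incomplete trailing sequence buffered.
  streamed += decoder.decode(chunks[i], { stream: true });
}
// The bare call ends the stream; this is what a flush() would alias.
streamed += decoder.decode();

// If the input ends mid-character, the final call reports it as U+FFFD.
const truncated = new TextDecoder("utf-8");
const partial = truncated.decode(new Uint8Array([0xe2]), { stream: true });
const flushed = truncated.decode();
```

Here the last decode() is the only place where a dangling partial sequence
surfaces, which is why forgetting it is easy to miss.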

>
> On Mon, Aug 12, 2013 at 6:26 PM, Jonas Sicking <jonas at sicking.cc> wrote:
>
>> I don't think that base64 encoding fits with the current
>> TextEncoder/Decoder API. Not because of names, but because base64
>> encoding is by nature opposite. I.e. the encoded format is in string
>> form, whereas the decoded format is in binary form.
>>
>
> The names are the only things that are opposite.  TextEncoder is just a
> streaming String-to-binary-blob conversion API, and TextDecoder is just a
> streaming binary-blob-to-String API, and that's precisely what base64
> encoding and decoding are.  That's the same whether you're converting
> String-to-base64 or String-to-UTF-8.  The only difference is that the names
> we've given to those ideas are reversed here.
>

Yes.
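
As a rough illustration of the direction reversal, here is a sketch of base64
for typed arrays built on today's btoa()/atob(). The helper names
bytesToBase64 and base64ToBytes are hypothetical, not part of any proposal;
the sketch assumes a runtime that exposes btoa and atob as globals.

```typescript
// Hypothetical helpers: binary -> String is the base64 *encode* step,
// the opposite direction from TextEncoder's String -> binary.
function bytesToBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    // btoa() expects a "binary string": one code unit (0..255) per byte.
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// String -> binary is the base64 *decode* step.
function base64ToBytes(b64: string): Uint8Array {
  const binary = atob(b64);
  const out = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    out[i] = binary.charCodeAt(i);
  }
  return out;
}

const helloBytes = new Uint8Array([72, 101, 108, 108, 111]); // "Hello"
const helloB64 = bytesToBase64(helloBytes);
const roundTrip = base64ToBytes(helloB64);
```

Compare with UTF-8, where the String is the decoded form and the bytes are
the encoded form; the conversion shape is the same, only the verb flips.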

>
> One thing that might need special attention is that U+FFFD error handling
> doesn't make sense for base64; errors should probably always be fatal.
>

Excellent point.
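
A small sketch of the difference, assuming the TextDecoder and atob() that
current runtimes ship (not the draft API): text decoding can substitute
U+FFFD for bad input, while base64 decoding has no sensible replacement and
atob() already throws.

```typescript
// Assumption: global TextDecoder and atob, as in modern browsers and Node.
const invalidUtf8 = new Uint8Array([0xff]); // 0xFF never appears in UTF-8

// Default (non-fatal) mode replaces the bad byte with U+FFFD.
const lenient = new TextDecoder("utf-8").decode(invalidUtf8);

// fatal: true turns the same input into an exception.
let fatalThrew = false;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(invalidUtf8);
} catch (e) {
  fatalThrew = true;
}

// atob() has no replacement-character mode; "*" is outside the base64
// alphabet, so the call throws.
let atobThrew = false;
try {
  atob("not*base64");
} catch (e) {
  atobThrew = true;
}
```

A base64 mode layered on the same objects would presumably behave like the
fatal case unconditionally.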

...

I believe we may experiment with "api-base64" and see if there are other
gotchas beyond this and the naming.

>
>
> --
> Glenn Maynard
>
>