[whatwg] Zip archives as first-class citizens

Glenn Maynard glenn at zewt.org
Thu Aug 29 08:27:16 PDT 2013

On Wed, Aug 28, 2013 at 12:25 PM, Eric Uhrhane <ericu at chromium.org> wrote:

> Broken files don't work, and I'm OK with that.  I'm saying that legal
> zips can have multiple directories, where the definitive one is last
> in the file, so it's not a good format for streaming.  If you're
> saying that you want to change the format to make an earlier directory
> definitive, that's going to break compat for the existing archives
> everywhere, and would be confusing enough that we should just go with
> a different archive format that doesn't require changes.  Or at least
> don't call it zip when you're done messing with the spec.

I'm saying that if the directories are out of sync, the filenames are going
to be broken in existing clients already.  We should try to guarantee that
files always work only when their internal data is consistent.  If their
records are out of sync, then all we should ensure is that the files work
the same in all browsers, even if some files won't work nicely as a result.

That said, we don't actually have use cases or a feature proposal for
streaming from ZIPs, so it's hard to make any further analysis.  The
feature we're discussing here doesn't need streaming, only random access.
It wouldn't read the whole ZIP; it would just read the end of the file to
grab the central directory (which gives you the information you need to
decide what to read from there).
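
To make "read the end of the file to grab the central directory" concrete,
here is a rough sketch, not a proposal for how a browser would do it.  It
assumes you already have the last chunk of the archive in memory and looks
for the end-of-central-directory record there.  The function name and the
demo archive are made up for illustration, and ZIP64 and multi-disk archives
are ignored.

```python
import io
import struct
import zipfile

# End-of-central-directory (EOCD) record: signature, disk number, disk
# holding the directory, entries on this disk, total entries, directory
# size, directory offset, comment length.
EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_FORMAT = "<4s4H2LH"
EOCD_SIZE = struct.calcsize(EOCD_FORMAT)  # 22 bytes
# The trailing comment is at most 0xFFFF bytes, so this much of the tail
# is always enough to contain the record.
MAX_TAIL = EOCD_SIZE + 0xFFFF


def find_central_directory(tail):
    """Return (entry_count, directory_size, directory_offset).

    `tail` is the final bytes of a ZIP file (hypothetical helper; no
    ZIP64 support).
    """
    pos = tail.rfind(EOCD_SIGNATURE)
    while pos != -1:
        if pos + EOCD_SIZE <= len(tail):
            fields = struct.unpack_from(EOCD_FORMAT, tail, pos)
            total, cd_size, cd_offset, comment_len = fields[4:]
            # A real record's comment must run exactly to the end of file.
            if pos + EOCD_SIZE + comment_len == len(tail):
                return total, cd_size, cd_offset
        # Signature bytes inside a comment can cause false hits; keep
        # scanning backwards.
        pos = tail.rfind(EOCD_SIGNATURE, 0, pos)
    raise ValueError("no end-of-central-directory record found")


def build_demo_archive():
    """Build a small two-file archive in memory for the example."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("a.txt", "hello")
        zf.writestr("dir/b.txt", "world")
    return buf.getvalue()


data = build_demo_archive()
total, cd_size, cd_offset = find_central_directory(data[-MAX_TAIL:])
# The directory itself is data[cd_offset:cd_offset + cd_size]; each entry
# there gives the name and local header offset of one file.
print(total, cd_size, cd_offset)
```

Only the tail and then the byte range of the directory are needed before
deciding which file data to fetch, which is the random-access pattern
described above.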

(The access pattern of having to read the central directory first isn't
ideal for optimizing away fetches, since it costs a round trip before any
file data can be requested.  A suffix range request such as
"Range: bytes=-65536" can at least fetch the last 64K of the file without
asking for the size first.  I'd rather have that than introduce a new
archive format into the wild...)

Glenn Maynard
