[whatwg] Zip archives as first-class citizens

Boris Zbarsky bzbarsky at MIT.EDU
Wed Aug 28 08:04:08 PDT 2013

On 8/28/13 9:32 AM, Anne van Kesteren wrote:
> I'm not sure we need to consider sub-scheme if zip-path can work as
> it's more complex and not very well thought out. E.g. imagine
> view-source:zip:http://www.example.org/zip!test.html.

What's the issue with that?  Gecko supports that (with jar:, not zip:).

My concerns with the zip-path approach are as follows:

1)  It requires doing the zip processing in a new layer on top of 
whatever pluggable architecture you have for schemes.  The zip: approach 
nicely encapsulates things so that the protocol handler for zip: 
delegates to the inner URI for the archive fetch and then knows how to 
process it.  It might be possible to do the zip processing by totally 
rewriting how browsers do fetch to interpose this zip-processing layer, 
but that seems like a nontrivial undertaking compared to having an 
orthogonal zip: handler that's invoked explicitly.  I would be 
interested in knowing what other implementors think about how 
implementable the two options are in their architectures.

2)  It changes semantics of existing URIs that happen to contain %!. 
I'm specifically worried about data: URIs, though Gordon points out that 
some http URIs may also be affected.

3)  We have implementation experience with the "sub-scheme" approach and 
we know it can work just fine (existence proof is jar: in Gecko).  The 
main difficulty it introduces is that computing the origin needs to be 
done via object accessors, not string-parsing...  Do we have any 
implementation experience with "zip-path"-like approaches?
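
To illustrate points 1 and 3, here is a minimal sketch of how a
sub-scheme URI could be split into an inner archive URI and an entry
path, with the origin read from the inner URI through an accessor
rather than by parsing the outer string.  Everything here is an
assumption made for illustration: the zip:<inner>!<entry> syntax is
taken from the example quoted above, splitting on the last "!" is my
choice, and ZipURI and parse_zip_uri are hypothetical names.  It is
not how Gecko implements jar:.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class ZipURI:
    """Hypothetical parsed form of zip:<inner-uri>!<entry-path>."""
    inner: str  # URI used to fetch the archive itself
    entry: str  # path of the entry inside the archive

    @property
    def origin(self) -> str:
        # The origin is derived from the inner URI, not from the
        # outer "zip:..." string.
        parts = urlsplit(self.inner)
        return f"{parts.scheme}://{parts.netloc}"


def parse_zip_uri(uri: str) -> ZipURI:
    prefix = "zip:"
    if not uri.startswith(prefix):
        raise ValueError("not a zip: URI")
    # Assumption: the last "!" separates the archive from the entry.
    inner, sep, entry = uri[len(prefix):].rpartition("!")
    if not sep:
        raise ValueError("missing '!' separator")
    return ZipURI(inner=inner, entry=entry)


uri = "zip:http://www.example.org/zip!test.html"
parsed = parse_zip_uri(uri)
print(parsed.inner, parsed.entry, parsed.origin)
# Generic parsing of the outer string finds no host at all:
print(repr(urlsplit(uri).netloc))
```

A handler built this way would hand parsed.inner to whatever handler
already exists for that scheme and only deal with the archive bytes
that come back.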

> As for nested zip archives. Andrea suggested we should support this,
> but that would require zip-path to be a sequence of paths. I think we
> never want to allow relative URLs to escape the top-most zip archive.
> But I suppose we could support it in a way that
>    %!test.zip!test.html
> goes one level deeper. And "../image.gif" in test.html looks in the
> enclosing zip.

I don't think relative URIs should ever escape a zip archive (though I 
do appreciate the way that would let someone replace directories with 
zipped-up versions of those directories).  The reason for that is that 
allowing it sometimes but not others seems really weird to me, and it 
seems like we don't want to allow it for toplevel zip archives.
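
A rough sketch of what "never escape" could mean when resolving a
relative reference against an entry inside an archive.  It assumes
that ".." at the archive root is simply dropped (the way ordinary URL
resolution treats ".." at the root of a path); rejecting such
references would be another option.  resolve_in_archive is a
hypothetical helper, not an existing API.

```python
def resolve_in_archive(base_entry: str, ref: str) -> str:
    """Resolve ref against base_entry, staying inside the archive."""
    # Start from the directory that contains the base entry.
    segments = base_entry.split("/")[:-1]
    if ref.startswith("/"):
        # A leading "/" means the archive root, not the server root.
        segments = []
    for seg in ref.split("/"):
        if seg in ("", "."):
            continue
        if seg == "..":
            if segments:
                segments.pop()
            # At the archive root, ".." is dropped instead of escaping.
            continue
        segments.append(seg)
    return "/".join(segments)


print(resolve_in_archive("docs/test.html", "../image.gif"))
print(resolve_in_archive("test.html", "../image.gif"))
```

Both calls resolve to image.gif at the archive root; the second one
does not reach into whatever encloses the archive.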

