[whatwg] Web Archives
ian at hixie.ch
Tue May 13 02:55:53 PDT 2008
On Wed, 11 Apr 2007, Tyler Keating wrote:
> I apologize if I've missed this in the specification or mailing
> archives, but I have a suggestion related to standardizing web
> "archives" in HTML5. Currently, I know that Firefox uses Mozilla Archive
> Format (.maf), Internet Explorer and Opera use MIME HTML (.mht) and
> Safari uses its own format (.webarchive) for saving a web page and all
> of its resources into a single file. So clearly a standard would be
> beneficial in ensuring "archive" compatibility between browsers and I
> think it's suitable for that standard to reside in HTML5.
> I don't believe this would be very difficult to standardize and the
> solution may be nothing more than a collection of random files wrapped
> into a ZIP compressed archive with a unique extension similar to a JAR
> or ODF file. The unique extension would be recognized by browsers,
> email clients and editors, which could then extract and display the root
> file directly (ex. index.html). The root file would obviously contain
> relative links to the other files in the archive, so the internal
> structure may not be important and the
> browser would not need any new rules to interpret individual files once
> it has uncompressed the archive into memory. This would facilitate
> passing HTML based documents around that could be viewed with any
> browser, yet appear as a small single file.
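As a rough illustration of the ZIP-based idea quoted above, here is a
minimal Python sketch. It assumes a root file named index.html, as in the
proposal; the helper names (pack_page, read_root) and the sample file
contents are hypothetical and not part of any specification.

```python
import io
import zipfile


def pack_page(files: dict[str, bytes]) -> bytes:
    """Pack a page and its resources into one in-memory ZIP archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def read_root(archive: bytes, root: str = "index.html") -> bytes:
    """Return the root document without extracting anything to disk."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return zf.read(root)


# Hypothetical page: the root file refers to its resource by relative path.
page = {
    "index.html": b'<!DOCTYPE html><img src="images/logo.png" alt="logo">',
    "images/logo.png": b"placeholder image bytes",
}
archive = pack_page(page)
root_html = read_root(archive)
```

Because the root file uses relative paths, a reader only needs to know
which entry to open first; the rest of the layout is up to the author.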
There are some specifications for this kind of thing already, e.g.
multipart/related (RFC2387), and the derivative MHTML (RFC2557).
In HTML5, this can be achieved to some extent using the offline
application cache feature, with a cache manifest. But the right solution,
IMHO, is to develop a new RFC that addresses the problems with MHTML.
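For reference, the multipart/related packaging mentioned above can be
sketched with Python's standard email package. This is only an
illustration of the general shape, not a conforming MHTML writer: the
function name build_related, the example.com URL, and the sample content
are assumptions, and details such as the "type" parameter that RFC 2387
expects on the multipart/related header are not handled here.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_related(html: str, image: bytes, image_url: str) -> bytes:
    """Bundle an HTML document and one image as multipart/related."""
    msg = EmailMessage()
    msg["Subject"] = "Saved page"
    # The HTML document becomes the first (root) part.
    msg.set_content(html, subtype="html")
    # The image is attached as a related part, labelled with the URL
    # the HTML uses to refer to it (Content-Location).
    msg.add_related(
        image,
        maintype="image",
        subtype="png",
        headers=[f"Content-Location: {image_url}"],
    )
    return msg.as_bytes()


# Hypothetical inputs.
url = "http://example.com/logo.png"
html = f'<!DOCTYPE html><img src="{url}" alt="logo">'
raw = build_related(html, b"placeholder image bytes", url)

# Reading it back: the first part is the root document, the rest are
# resources that can be looked up by Content-Location.
parsed = message_from_bytes(raw, policy=policy.default)
parts = list(parsed.iter_parts())
```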
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'