[whatwg] Web Documents off the Web (was Web Archives)

Tyler Keating tylerkeating at mac.com
Mon Apr 16 13:39:56 PDT 2007


I'm bringing this up again with a different tack, because the more
that I think about it, the more I believe it has the ability to  
significantly change the perception and application of HTML and I  
would really like to keep the discussion alive.  In the previous  
thread, I proposed a standard for archiving web sites into a single  
ZIP archive with a unique file extension and although it didn't get  
any outright negative feedback, it didn't drum up too much excitement  
either.  If you can bear with me, I'd like to describe the idea again  
in a slightly different light.

Take, for example, web-based presentations vs. PowerPoint from an
average user's point-of-view.  I can create an incredibly dynamic  
presentation based on HTML, JavaScript, CSS, SVG, etc., but I can't  
easily share it with anyone unless it is served (I can't easily send  
it to them).  On the other hand, I can create an incredibly dynamic  
presentation using PowerPoint, but I can't easily share it with  
anyone unless I send them the file and they also have PowerPoint (I  
can't easily serve it).*

For another example, which relates to my modest experience, I've  
created a simple Quotes/Sales/Invoices web app for a friend and have  
come across similar issues trying to resolve the served file model  
with the local file model.  Without going into too much detail,  
assume that there is sufficient reason why a file copy of the web  
page is needed (in this case because my friend's customers can't use  
the app directly).  How should the user get copies of web documents  
to be sent or saved to disk?  Instead of describing all of the  
various options, such as saving it to some kind of browser-proprietary
archive, sending HTML email, creating an HTML-to-PDF converter or
some other time-consuming, user-unfriendly method, let's look at an
ideal solution.

Imagine this:  An HTML-based document ZIP-compressed into a single
file could be uploaded as is to the server.  Clicking on a link to  
the file would probably download, decompress and open the file in the  
browser seamlessly and, even better, right-clicking on the link  
instead and choosing "Download Linked File" would download the same  
nice small single file.**  Double-clicking that file would open it in
any browser, identical to the served version.  The identical
format and behaviour of the web document and the file document  
presents the best user experience.  Instead of saving a  
representation of the web document, you are saving THE web document.
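
To make the idea concrete, here is a minimal sketch in Python of the
packing and unpacking steps described above.  It is only an
illustration under assumptions of my own: the ".webdoc" extension and
the "index.html" entry point are hypothetical placeholders, since
neither has been specified, and a real browser would do the unpacking
internally rather than extracting to a directory.

```python
import zipfile
from pathlib import Path

# Hypothetical extension; the proposal does not name one.
ARCHIVE_SUFFIX = ".webdoc"
# Assumed entry point; the proposal does not define one.
ENTRY_POINT = "index.html"


def pack(source_dir: Path, archive: Path) -> Path:
    """Compress every file under source_dir into one ZIP archive."""
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the document root so that
                # relative links inside the HTML keep working.
                zf.write(path, path.relative_to(source_dir).as_posix())
    return archive


def unpack(archive: Path, target_dir: Path) -> Path:
    """Extract the archive and return the path of its entry page."""
    with zipfile.ZipFile(archive) as zf:
        if ENTRY_POINT not in zf.namelist():
            raise ValueError(f"{archive} has no {ENTRY_POINT}")
        zf.extractall(target_dir)
    return target_dir / ENTRY_POINT
```

The path returned by unpack() could then be handed to any browser as a
file URI, for example with webbrowser.open(entry.resolve().as_uri()).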

The question is, why do we only think of HTML with respect to the web  
and why are HTML-based documents constrained to being served?  This  
is the meat of my argument.  Browsers have no issue opening a file  
URI, but humans have an issue dealing with a directory of .html files  
versus, say, a single .ppt file.  Humans will soon also have issues  
viewing and serving ODF and OOXML files, I might add, but still won't  
have issues viewing and serving HTML files.  After the little bit of  
discussion from the first thread, I believe that the solution is  
indeed a near clone and more complete version of the Widgets 1.0
specification ( http://www.w3.org/TR/WAPF-REQ/ ), but defined separately
and as part of HTML, specifying how to package entire web
documents as ZIP-compressed archives using a unique file extension.
In reality, compared to all of the other work being done on HTML, I  
believe this would be very simple to specify and should be very  
simple to implement.

Please give this some thought.  I appreciate your comments.

Tyler Keating
CEO Concept Digital Inc.  -- don't be impressed, it's just me

* I could export an HTML version to be served, but I can't share both  
ways with the same file and this means I have two versions of the  
same presentation to work with.  Again, the average user (my mom)
isn't going to be serving files created on her desktop any time
soon, since she has only just grasped email attachments.
** Containing any number of HTML, XHTML, CSS, image or other files
inside it, invisible to the average user.
