[whatwg] NoDatabase databases

Mon Aug 5 17:01:02 PDT 2013

On Thu, 2 May 2013, Brett Zamir wrote:
> 
> I wanted to propose (if work has not already been done in this area) 
> creating an HTTP extension to allow querying for retrieval and updating 
> of portions of HTML (or XML) documents where the server is so capable 
> and enabled, obviating the need for a separate database (or more 
> accurately, bringing the database to the web server layer).

> 1) Allowing one-off queries to be made by (privileged) user agents. This 
> avoids the need for websites willing to share their data to create their 
> own database APIs and overhead while allowing both the client and server 
> the opportunity to avoid delivering content which is not of interest to 
> the user. Possible query languages might include CSS selectors, XPath, 
> XQuery, or JavaScript.
> 
> 2) Allowing third-party websites the ability to make such queries of 
> other sites as in #1 but requiring user permission. I seem to recall 
> seeing some discussions apparently reviving the possibility for 
> JavaScript APIs to make cross-domain requests with user permission 
> regardless of the target site giving permission.
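
To make the kind of query in #1 concrete, here is a minimal sketch of a server returning only the matching portion of a document. It uses the limited XPath subset in Python's standard `xml.etree.ElementTree`, and it assumes well-formed (XHTML-like) markup. The document contents and the `query_fragment` name are hypothetical. The proposal does not specify a query language or wire format.

```python
import xml.etree.ElementTree as ET

# Hypothetical document held by the server (must be well-formed XML).
DOCUMENT = """<html><body>
<ul id="fruits"><li>apple</li><li>banana</li><li>cherry</li></ul>
<p class="note">unrelated paragraph</p>
</body></html>"""


def query_fragment(document: str, path: str) -> list[str]:
    """Return the serialized elements of `document` that match `path`.

    `path` is limited to the XPath subset that ElementTree supports.
    """
    root = ET.fromstring(document)
    # strip() drops the whitespace that follows an element in the source.
    return [ET.tostring(el, encoding="unicode").strip()
            for el in root.findall(path)]


# Only the list items are returned, not the whole document.
matches = query_fragment(DOCUMENT, ".//ul[@id='fruits']/li")
```

A real deployment would also need to decide how the query travels over HTTP (a header, a URL parameter, or a new method), which this sketch leaves open.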

I don't really understand these use cases. When would you want to do this?

A use case really should be drawn all the way back to a user need, if 
possible. What user-facing application would you envisage where this 
feature would be needed to achieve a good experience?

> 3) The ability for user agents to allow the user to provide intelligent 
> defaults for navigating a subset of potentially large data documents, 
> potentially with the assistance of website mark-up, but without the need 
> for website scripting. This could reduce development time and costs, 
> while ensuring that powerful capabilities were enabled for users by 
> default on all websites (at least those that opted in by a simple 
> server-side configuration option). It could also avoid unnecessary 
> demands on the server and wait-time for the client (benefiting energy 
> usage, access in developing countries, wait-times anywhere for large 
> documents, etc.), while conversely facilitating exposure by sites of 
> large data-sets for users wishing to download a large data set. 
> Web-based IDEs, moreover, could similarly allow querying and editing of 
> these documents without needing to load and display the full data set 
> during editing. Some concrete examples include:
> 
>     a) Allowing ordered or unordered lists or definition/dialogue lists 
> or any hierarchical markup to be navigated upon user demand. The client 
> and server might, for example, negotiate the number of list items from a 
> list to be initially loaded and shown such that the entire list would 
> not be displayed or loaded but instead would load say only the first and 
> last 5 items in the list and give the user a chance to manually load the 
> rest if they were interested in viewing all of that data. Hierarchical 
> lists, moreover, could allow Ajax-like drill-down capabilities (or if 
> the user so configured their user agent, to automatically expand to a 
> certain depth), all without the author needing to provide any scripting, 
> letting them focus on content. Even non-list markup, like paragraphs, 
> could be drilled into, as well as providing ellipses when the child 
> content was determined to be above a given memory size or if the element 
> was conventionally used to provide larger amounts of data (e.g., a 
> textarea). (Form submission would probably need to be disabled though 
> until all child content was loaded, and again, in order to avoid usage 
> against the site's intended design, such navigation might require 
> opt-in.)
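
The "first and last 5 items" behaviour described in (a) could be sketched roughly as follows. The function name `truncate_items` and the `edge` parameter are hypothetical, and how client and server would negotiate the count is not covered here.

```python
def truncate_items(items, edge=5):
    """Split `items` into (head, omitted_count, tail).

    Lists with at most 2 * edge items are returned whole, with nothing
    omitted and an empty tail.
    """
    if edge < 1:
        raise ValueError("edge must be at least 1")
    if len(items) <= 2 * edge:
        return list(items), 0, []
    return list(items[:edge]), len(items) - 2 * edge, list(items[-edge:])


# Hypothetical 20-item list: the middle 10 items are held back until the
# user asks for them.
items = [f"item {i}" for i in range(1, 21)]
head, omitted, tail = truncate_items(items)
```

A user agent could render `head`, an ellipsis standing for `omitted` items, and then `tail`.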

For this to be a good feature, the page would have to be very large (many 
megabytes, on modern connections), or the user's connection would have to 
be really constrained (low bandwidth, though for sanity the latency would 
still have to be low).
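
As a rough illustration of that trade-off, the arithmetic can be sketched as below. The page size and link speeds are hypothetical figures chosen for the example, not measurements, and latency and protocol overhead are ignored.

```python
def transfer_seconds(size_bytes, bits_per_second):
    """Idealised transfer time: payload size in bits over link speed."""
    return size_bytes * 8 / bits_per_second


# Hypothetical 5 MB page over two assumed link speeds.
fast_link = transfer_seconds(5_000_000, 10_000_000)  # 10 Mbit/s
slow_link = transfer_seconds(5_000_000, 56_000)      # 56 kbit/s
```

Under these assumptions the fast link needs a few seconds while the slow link needs several minutes, which is the regime where loading only part of the document would matter.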

What such cases exist, where scripting wouldn't be desirable for other 
reasons?

>     b) Tables would merit special treatment as a hierarchical type as 
> one may typically wish to ensure that all cells in a given row were 
> shown by default (though even here, ellipses could be added when the 
> data size was determined to be large), with pagination being the 
> well-used norm of table-based widgets. Having markup specified on column 
> headers (if not full-blown schemas) to indicate data types would be 
> useful in this regard (markup on the top level of a list might similarly 
> be useful); if the user agent were, for example, made aware of the fact 
> that a table column consisted exclusively of dates, it would provide a 
> search option to allow the user to display records between a given date 
> range (as well as better handling sorting).
> 
> Rows could, moreover, be auto-numbered by the agent with an option to 
> choose a range of numbers (similarly ranges could be provided for other 
> elements, like paragraph or list item numbering, etc.). The shift to the 
> user agent might also encourage the ability to reorder or remove 
> columns.
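
The date-range search described in (b) might look roughly like this once a column is known to hold dates. The sketch assumes the cells contain ISO 8601 dates (`YYYY-MM-DD`). The row data and the `rows_in_range` name are hypothetical, and the proposal does not say how the column type would be declared.

```python
from datetime import date


def rows_in_range(rows, column, start, end):
    """Return rows whose date in `column` lies in [start, end], oldest first.

    Each row is a list of cell strings; the cell at `column` must be an
    ISO 8601 date such as "2013-05-02".
    """
    def cell_date(row):
        return date.fromisoformat(row[column])

    selected = [row for row in rows if start <= cell_date(row) <= end]
    return sorted(selected, key=cell_date)


# Hypothetical table body: (subject, date posted).
rows = [
    ["reply", "2013-08-05"],
    ["proposal", "2013-05-02"],
    ["unrelated", "2012-01-15"],
]
in_2013 = rows_in_range(rows, 1, date(2013, 1, 1), date(2013, 12, 31))
```

Sorting by the parsed date rather than the raw string is what a typed column would buy for formats that do not sort lexically.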

I think that in practice we're going to reach practical limits in user 
agents rendering tables long before we're going to reach practical limits 
of document size. A multimegabyte table is going to cause layout problems 
before it takes appreciable time to download.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'