[whatwg] Client-side includes proposal

Shannon shannon at arc.net.au
Sun Aug 17 23:34:23 PDT 2008


The discussion on seamless iframes reminded me of something I've felt 
was missing from HTML: a client-side equivalent of the server-side 
includes provided by PHP, ColdFusion and SSI. With server-side includes, 
a document generated from parts appears as a single entity rather than 
nested frames. In other words, the source code seen by the UA is 
indistinguishable from a non-frames HTML page in every way.

iframes are good for some things but they can be really messy when 
you're trying to build a single seamless page with shared styles and 
scripts from multiple files. It makes code reuse a real pain without 
relying on a server-side dynamic language. The seamless iframes proposal 
doesn't really address this well because you'll have more than one HTML 
and BODY element, causing strange behaviour or requiring complex 
exceptions in seamless CSS.

The other issue with iframes is that for many page snippets a title, 
meta tags and other headers either make no sense or simply repeat what 
was in the main document. More often than not the <head> section is 
meaningless yet must still be included for the frame to be 
"well-formed" or indexed by spiders.

The proposal would work like this:

--- Master Document ---
<html>
    <head>
        <title>Include Example</title>
        <meta name="includes" content="allow">
        <include src="global_head.ihtml">
    </head>
    <body>
        <include src="header.ihtml">
        <include src="http://www.pagelets.com/foo.ihtml">
        <include src="footer.ihtml">
    </body>
</html>

--- header.ihtml ---
<div id="header">
    <h1>Header</h1>
</div>


With this proposal seamless CSS would work perfectly because child 
selectors won't see an intervening <body> element between sections.
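
As a rough illustration (not part of the proposal itself), the 
substitution can be sketched in Python as plain text replacement done 
before parsing, which is also what would let start and end tags be split 
across includes. The FRAGMENTS table, the footer markup, the 
expand_includes name and the max_depth nesting limit are all assumptions 
made for this sketch; a real UA would fetch each src over HTTP.

```python
import re

# Hypothetical in-memory stand-in for fetching fragments over HTTP.
FRAGMENTS = {
    "header.ihtml": '<div id="header"><h1>Header</h1></div>',
    "footer.ihtml": '<div id="footer">Footer</div>',  # invented for the demo
}

# Matches <include src="..."> (optionally self-closed) as literal text.
INCLUDE_TAG = re.compile(r'<include\s+src="([^"]*)"\s*/?>', re.IGNORECASE)


def expand_includes(markup, fetch, depth=0, max_depth=8):
    """Replace each <include src="..."> with the text returned by fetch(src).

    Fragments are expanded recursively; max_depth is an assumed guard
    against include loops, which the proposal does not discuss.
    """
    if depth > max_depth:
        raise RecursionError("include nesting too deep")

    def substitute(match):
        return expand_includes(fetch(match.group(1)), fetch, depth + 1, max_depth)

    return INCLUDE_TAG.sub(substitute, markup)


master = '<body><include src="header.ihtml"><include src="footer.ihtml"></body>'
flat = expand_includes(master, FRAGMENTS.__getitem__)
print(flat)
```

The flattened result contains a single <body> with the header and footer 
divs as direct children, which is why child selectors behave as they 
would in a hand-written page.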

Includes should allow any HTML segments except the initial <doctype> and 
<head> (for reasons explained below) and should allow start and end tags 
to be split across includes. The one restriction is that an include may 
not appear inside a tag itself (eg, <body <include 
src="body_attributes.ihtml">>). Many server-side include systems allow 
this but it breaks the syntax of HTML/XML.

Includes must respect their own HTTP headers but inherit all other 
properties, styles and scripts from the surrounding page. If an include 
is not set to expire immediately, the browser should reuse it from 
memory; otherwise it should retrieve it once for each include. Each 
behaviour has its own merits depending on the application.
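
The reuse-or-refetch rule above might look something like the following 
Python sketch. The IncludeCache class, the fetch callback returning a 
(body, max_age_seconds) pair, and the file names are assumptions for 
illustration only; a real UA would derive freshness from the include's 
own HTTP caching headers.

```python
import time


class IncludeCache:
    """Keeps fetched fragments in memory until their expiry time passes."""

    def __init__(self, fetch, clock=time.monotonic):
        self._fetch = fetch    # assumed shape: fetch(src) -> (body, max_age_seconds)
        self._clock = clock
        self._entries = {}     # src -> (body, expires_at)

    def get(self, src):
        now = self._clock()
        entry = self._entries.get(src)
        if entry is not None and now < entry[1]:
            return entry[0]    # still fresh: reuse from memory
        body, max_age = self._fetch(src)
        if max_age > 0:
            self._entries[src] = (body, now + max_age)
        else:
            # Expires immediately: never stored, so every include refetches.
            self._entries.pop(src, None)
        return body


calls = []

def fake_fetch(src):
    """Demo fetcher: ticker.ihtml expires at once, everything else in 60s."""
    calls.append(src)
    max_age = 0 if src == "ticker.ihtml" else 60
    return "<!-- " + src + " -->", max_age


cache = IncludeCache(fake_fetch)
for src in ["footer.ihtml", "footer.ihtml", "ticker.ihtml", "ticker.ihtml"]:
    cache.get(src)
print(calls)
```

Here the cacheable footer is fetched once for two includes, while the 
immediately expiring ticker is fetched once per include.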

The standard would recommend (but not require) includes to use an .ihtml 
extension. This will make it easier for authors, UAs and logging systems 
to distinguish partial and complete pages (eg, so that includes are not 
counted towards page views in a stats package).
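
For example, a stats package could apply the extension convention along 
these lines. This is only a sketch: counts_as_page_view and the request 
list are hypothetical, and since the extension is merely recommended a 
real tool could not rely on it alone.

```python
from urllib.parse import urlsplit


def counts_as_page_view(url):
    """Treat any request whose path ends in .ihtml as a partial page."""
    path = urlsplit(url).path
    return not path.lower().endswith(".ihtml")


# Hypothetical request log entries.
requests = [
    "/index.html",
    "/header.ihtml",
    "http://www.pagelets.com/foo.ihtml?x=1",
    "/about.html",
]
page_views = [u for u in requests if counts_as_page_view(u)]
print(page_views)
```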

UAs or UA extensions like the Mozilla-based "Web Developer" should allow 
the user to view the actual source and the "final" source (with all 
includes substituted).

HTTP 1.1 pipelining should remove any performance concerns that includes 
would have over traditional SSI since the retrieval process only 
requires the sending of a few more bytes of request and response 
headers. In some ways it is actually better because UAs and proxies can 
cache the static includes and only fetch the dynamic parts.

The only real issue with this proposal is security for untrusted content 
like MySpace profiles. Traditional sanitisers would be unfamiliar with 
<include> and may allow it through, providing a backdoor for malicious 
code. For this reason it is necessary that includes be opt-in. The 
simplest mechanism is to use a meta tag in the head of the master document:

<meta name="includes" content="allow">

I would consider any content system that allowed untrusted users to 
write their own head tags to be incurably insecure; however, this 
requirement should ensure that the majority do not suddenly experience a 
wave of new exploits in HTML5 browsers.
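
A minimal sketch of the opt-in check, using Python's standard 
html.parser module: includes are honoured only if the meta tag appears 
inside the head of the master document. The IncludeOptIn class and 
includes_allowed function are names invented here, and the head tracking 
is simplified (it relies on explicit <head> and <body> tags rather than 
the implied elements a real HTML parser would create).

```python
from html.parser import HTMLParser


class IncludeOptIn(HTMLParser):
    """Records whether <meta name="includes" content="allow"> is in <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.allowed = False

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "body":
            self.in_head = False   # an unclosed <head> ends where <body> starts
        elif tag == "meta" and self.in_head:
            a = dict(attrs)
            name = (a.get("name") or "").lower()
            content = (a.get("content") or "").lower()
            if name == "includes" and content == "allow":
                self.allowed = True

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def includes_allowed(markup):
    """Return True only if the opt-in meta tag appears inside <head>."""
    parser = IncludeOptIn()
    parser.feed(markup)
    parser.close()
    return parser.allowed


trusted = '<html><head><meta name="includes" content="allow"></head><body></body></html>'
# A meta tag smuggled into user-supplied body content must not count.
untrusted = ('<html><head><title>x</title></head><body>'
             '<meta name="includes" content="allow">'
             '<include src="evil.ihtml"></body></html>')
print(includes_allowed(trusted), includes_allowed(untrusted))
```

A page that lets untrusted users write body content but not head content 
would therefore keep includes disabled unless its own template opted in.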

Shannon


