[whatwg] XSLT and DOCTYPES

Ian Hickson ian at hixie.ch
Thu Dec 25 03:24:30 PST 2008


I haven't made any changes in response to the comments below. The 
XSLT-compat feature is available due to a number of requests, and I don't 
really see any harm in keeping it, even if not everybody needs it.

I haven't changed the keyword to something else, mostly because I haven't 
found a better name yet. This is still an open issue. ("XSLT-compat" as a 
name is convenient, especially for non-XSLT purposes, precisely because it 
grates on people's sensibilities. Something "neater" would have less of a 
discouraging effect. So maybe we should keep the name as is regardless.)

On Thu, 18 Dec 2008, Elliotte Harold wrote:
>
> I managed to miss this one when it went around the first time, but I really
> have to speak up now. The second half of 8.1.1 is unnecessary noise and
> complexity:
> 
> For the purposes of XSLT generators that cannot output HTML markup without a
> DOCTYPE, a DOCTYPE legacy string may be inserted into the DOCTYPE (in the
> position defined above). This string must consist of:
> 
>    1. One or more space characters.
>    2. A string that is an ASCII case-insensitive match for the string
> "PUBLIC".
>    3. One or more space characters.
>    4. A U+0022 QUOTATION MARK or U+0027 APOSTROPHE character (the quote mark).
>    5. The literal string "XSLT-compat".
>    6. A matching U+0022 QUOTATION MARK or U+0027 APOSTROPHE character (i.e.
> the same character as in the earlier step marked quote mark).
> 
> In other words, <!DOCTYPE HTML PUBLIC "XSLT-compat"> or <!DOCTYPE HTML PUBLIC
> 'XSLT-compat'>, case-insensitively except for the bit in quotes.
> 
> Since XSLT 1.0 can generate well-formed XHTML without any problems, there
> really is no need for this at all. Documents generated by XSLT that need to be
> conforming should simply be XHTML.
> 
> Furthermore, it is false that XSLT cannot generate an HTML 5 conforming
> DOCTYPE in HTML mode. As proof I present this stylesheet:
> 
> <?xml version="1.0"?>
> <xsl:stylesheet version="1.0"
>   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> 
>   <xsl:output indent="yes" method="html"/>
> 
>   <xsl:template match="/">
>     <xsl:text disable-output-escaping='yes'>&lt;!DOCTYPE HTML></xsl:text>
>      <html>
>      </html>
>   </xsl:template>
> 
> </xsl:stylesheet>
> 
> 
> and the following output:
> 
> $ xsltproc test.xsl http://www.cafeconleche.org/
> <!DOCTYPE HTML><html></html>
> 
> 
> This should work in any scenario in which the XSLT processor itself is
> serializing the output. If it's merely generating some sort of DOM or tree to
> pass to another process, then all bets are off. However in that scenario,
> other means of producing DOCTYPES are also not guaranteed since the DOCTYPE is
> not part of the XPath 1.0 data model. XSLT can promise a DOCTYPE only when it
> controls the serialization path, regardless of the technique you use to create
> it.
> 
> Most importantly, does it really make sense to add ever more cruft to the
> spec to support every legacy tool and language out there? What if we discover
> that K&R C won't do Unicode? or that some old versions of Java require tags to
> be upper cased? A spec like this should not be making special allowances for
> the languages that may be used to generate it.
> 
> This time I will request a specific action: delete this section completely. It
> has no place in the spec.
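
As a rough illustration of the six steps quoted above, the DOCTYPE legacy
string could be checked with a regular expression along these lines. This is
only a sketch in Python, not text from the spec: the names SPACE,
LEGACY_STRING and is_legacy_string are made up for the example, and the set
of "space characters" (space, tab, LF, FF, CR) is assumed from HTML's
definition of the term rather than taken from the text quoted here.

```python
import re

# Assumption: HTML "space characters" are space, tab, LF, FF and CR.
SPACE = r"[ \t\n\f\r]"

LEGACY_STRING = re.compile(
    rf"{SPACE}+"                  # 1. one or more space characters
    r"[Pp][Uu][Bb][Ll][Ii][Cc]"   # 2. "PUBLIC", ASCII case-insensitive
    rf"{SPACE}+"                  # 3. one or more space characters
    r"(?P<quote>[\"'])"           # 4. the quote mark (" or ')
    r"XSLT-compat"                # 5. the literal string, case-sensitive
    r"(?P=quote)"                 # 6. the same quote character again
)


def is_legacy_string(s: str) -> bool:
    """Return True if s is exactly a DOCTYPE legacy string (hypothetical helper)."""
    return LEGACY_STRING.fullmatch(s) is not None
```

With this, ' PUBLIC "XSLT-compat"' and " public 'XSLT-compat'" are accepted,
while mismatched quotes or a lowercased "xslt-compat" are rejected.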

On Thu, 18 Dec 2008, Julian Reschke wrote:
>
> Elliotte Harold wrote:
> > ...
> > Since XSLT 1.0 can generate well-formed XHTML without any problems, there
> > really is no need for this at all. Documents generated by XSLT that need to
> > be conforming should simply be XHTML.
> > ...
> 
> Now if you can persuade Microsoft to implement XHTML, that might fly.
> 
> > Furthermore, it is false that XSLT cannot generate an HTML 5 conforming
> > DOCTYPE in HTML mode. As proof I present this stylesheet:
> > 
> > <?xml version="1.0"?>
> > <xsl:stylesheet version="1.0"
> >   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> > 
> >   <xsl:output indent="yes" method="html"/>
> > 
> >   <xsl:template match="/">
> >     <xsl:text disable-output-escaping='yes'>&lt;!DOCTYPE HTML></xsl:text>
> >      <html>
> >      </html>
> >   </xsl:template>
> > 
> > </xsl:stylesheet>
> > 
> > 
> > and the following output:
> > 
> > $ xsltproc test.xsl http://www.cafeconleche.org/
> > <!DOCTYPE HTML><html></html>
> 
> This doesn't work with Firefox's built-in XSLT engine, which ignores
> disable-output-escaping (and is allowed to do so).
> 
> > ...
> > Most importantly, does it really make sense to add ever more cruft to the
> > spec to support every legacy tool and language out there? What if we
> > discover that K&R C won't do Unicode? or that some old versions of Java
> > require tags to be upper cased? A spec like this should not be making
> > special allowances for the languages that may be used to generate it.
> > 
> > This time I will request a specific action: delete this section completely.
> > It has no place in the spec.
> > ...
> 
> I totally disagree.
> 
> The spec also fails to mention that there are more use cases than XSLT;
> several HTML serialization methods share this restriction with XSLT's HTML
> output mode. Thus, the spec should continue to allow this, but pick a more
> correct name.

On Thu, 18 Dec 2008, Jonas Sicking wrote:
> >
> > This should work in any scenario in which the XSLT processor itself is
> > serializing the output. If it's merely generating some sort of DOM or tree
> > to pass to another process, then all bets are off. However in that scenario,
> > other means of producing DOCTYPES are also not guaranteed since the DOCTYPE
> > is not part of the XPath 1.0 data model. XSLT can promise a DOCTYPE only
> > when it controls the serialization path, regardless of the technique you use
> > to create it.
> 
> Why does this matter at all if your XSLT outputs a DOM? In such a case
> all you need to do is to ensure that the right type of nodes are
> created. Adding a DOCTYPE of any sort isn't going to affect anything
> as far as I can tell, at least not from an HTML perspective.
> 
> The XSLT spec (at least the 1.0 one) is actually very vague on what
> the result from the transformation is when the output isn't a
> serialized sequence of bytes. That is, the only mention of how to output
> HTML is in section 16, which is about serialized results, so it
> technically doesn't apply when the result is a DOM. In Firefox we do
> however use the HTML output mode to create HTML DOM nodes.
> 
> In any event, this would seem like something that needs to be defined
> in the producer of the DOM, not in HTML in general. Although we could
> call out some specific producers if needed.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


