[whatwg] XSLT and DOCTYPES

Elliotte Harold elharo at metalab.unc.edu
Thu Dec 18 08:03:35 PST 2008

I managed to miss this one when it went around the first time, but I 
really have to speak up now. The second half of 8.1.1 is unnecessary 
noise and complexity:

For the purposes of XSLT generators that cannot output HTML markup 
without a DOCTYPE, a DOCTYPE legacy string may be inserted into the 
DOCTYPE (in the position defined above). This string must consist of:

    1. One or more space characters.
    2. A string that is an ASCII case-insensitive match for the string
       "PUBLIC".
    3. One or more space characters.
    4. A U+0022 QUOTATION MARK or U+0027 APOSTROPHE character (the quote
       mark).
    5. The literal string "XSLT-compat".
    6. A matching U+0022 QUOTATION MARK or U+0027 APOSTROPHE character
       (i.e. the same character as in the earlier step marked quote mark).

In other words, <!DOCTYPE HTML PUBLIC "XSLT-compat"> or <!DOCTYPE HTML 
PUBLIC 'XSLT-compat'>, case-insensitively except for the bit in quotes.
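
To make the six steps concrete, here is a minimal Python sketch of a checker for the legacy string alone (not the whole DOCTYPE). The name is_legacy_string is hypothetical, and the set of "space characters" (space, tab, LF, FF, CR) is my assumption about what the spec means; the keyword PUBLIC is taken from the two example DOCTYPEs above.

```python
import re

# One alternative per numbered step. The space-character class is an
# assumption: space, tab, line feed, form feed, carriage return.
LEGACY_STRING = re.compile(r"""
    [ \t\n\f\r]+                 # 1. one or more space characters
    [Pp][Uu][Bb][Ll][Ii][Cc]     # 2. ASCII case-insensitive PUBLIC
    [ \t\n\f\r]+                 # 3. one or more space characters
    (["'])                       # 4. opening quote mark, captured
    XSLT-compat                  # 5. the literal string, case-sensitive
    \1                           # 6. the same quote mark again
""", re.VERBOSE)


def is_legacy_string(s: str) -> bool:
    """Return True if s is exactly a DOCTYPE legacy string (hypothetical helper)."""
    return LEGACY_STRING.fullmatch(s) is not None


if __name__ == "__main__":
    for candidate in (' PUBLIC "XSLT-compat"', " public 'XSLT-compat'",
                      ' PUBLIC "xslt-compat"', ' PUBLIC "XSLT-compat\''):
        print(repr(candidate), is_legacy_string(candidate))
```

The explicit [Pp][Uu]... classes stand in for a case-insensitive flag so that only ASCII letters match, which is what "ASCII case-insensitive" asks for.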

Since XSLT 1.0 can generate well-formed XHTML without any problems, 
there really is no need for this at all. Documents generated by XSLT 
that need to be conforming should simply be XHTML.

Furthermore, it is false that XSLT cannot generate an HTML 5 conforming 
DOCTYPE in HTML mode. As proof I present this stylesheet:

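<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output indent="yes" method="html"/>

   <xsl:template match="/">
     <xsl:text disable-output-escaping='yes'>&lt;!DOCTYPE HTML&gt;</xsl:text>
     <html></html>
   </xsl:template>

</xsl:stylesheet>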

and the following output:

$ xsltproc test.xsl http://www.cafeconleche.org/
<!DOCTYPE HTML><html></html>

This should work in any scenario in which the XSLT processor itself is 
serializing the output. If it's merely generating some sort of DOM or 
tree to pass to another process, then all bets are off. However, in that 
scenario, other means of producing DOCTYPES are also not guaranteed 
since the DOCTYPE is not part of the XPath 1.0 data model. XSLT can 
promise a DOCTYPE only when it controls the serialization path, 
regardless of the technique you use to create it.
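
The tree-versus-serializer point can be seen by analogy outside XSLT. The sketch below uses Python's xml.etree.ElementTree, which is not an XPath 1.0 implementation but whose tree likewise has no node for the DOCTYPE; the function name roundtrip is hypothetical. Once the document has passed through the tree, whoever serializes it decides whether a DOCTYPE appears.

```python
import xml.etree.ElementTree as ET


def roundtrip(source: str) -> str:
    """Parse source into an element tree and serialize it again.

    Hypothetical helper: the tree holds elements only, so the DOCTYPE
    declaration in the input has nowhere to live and is not re-emitted.
    """
    root = ET.fromstring(source)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    original = "<!DOCTYPE HTML><html></html>"
    # The serialized result contains the html element but no DOCTYPE.
    print(roundtrip(original))
```

A downstream process that wants the DOCTYPE back has to write it itself, which is the same position an XSLT processor is in when it hands off a tree instead of serializing.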

Most importantly, does it really make sense to add ever more cruft to 
the spec to support every legacy tool and language out there? What if we 
discover that K&R C won't do Unicode, or that some old versions of Java 
require tags to be uppercase? A spec like this should not be making 
special allowances for the languages that may be used to generate it.

This time I will request a specific action: delete this section 
completely. It has no place in the spec.

Elliotte Rusty Harold  elharo at metalab.unc.edu
Refactoring HTML Just Published!
