[Imps] HTML 4.01 compatibility modes for an HTML5 parser

Henri Sivonen hsivonen at iki.fi
Thu Jun 28 05:53:33 PDT 2007


Recognizing that there are people who want to treat the HTML 4.01
doctypes as non-errors for the time being, my old prototype parser
had four modes for handling the parts of the HTML 4.01 legacy that
still matter to users today. To avoid regressing on functionality in
my current replacement parser project, I've been thinking that I
should retain those four modes.

The modes are now drafted as follows:

     /**
      * Be a pure HTML5 parser.
      */
     HTML5,

     /**
      * Require the HTML 4.01 Transitional public id. Turn on
      * HTML4-specific additional errors regardless of doctype.
      */
     HTML401_TRANSITIONAL,

     /**
      * Require the HTML 4.01 Transitional public id and a system id.
      * Turn on HTML4-specific additional errors regardless of doctype.
      */
     HTML401_STRICT,

     /**
      * Treat the HTML5 doctype, doctypes with the HTML 4.01 Strict
      * public id and doctypes with the HTML 4.01 Transitional public id
      * and a system id as non-errors. Turn on HTML4-specific additional
      * errors if the public id is the HTML 4.01 Strict or Transitional
      * public id.
      */
     AUTO
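
As a rough illustration of how the mode descriptions above could translate into checks, here is a minimal sketch. It is not taken from the parser itself. The class name DoctypeModes, the enum name Mode, and both method names are hypothetical. It also assumes that a missing id is represented as null, that public ids are compared by exact string match, and that "pure HTML5" means the HTML4-specific errors stay off. Only the AUTO doctype rule is sketched, since it is the one spelled out in full.

```java
final class DoctypeModes {
    // Hypothetical enum wrapper around the four drafted constants.
    enum Mode { HTML5, HTML401_TRANSITIONAL, HTML401_STRICT, AUTO }

    // Public identifiers defined by the HTML 4.01 specification.
    static final String STRICT_PUBLIC_ID = "-//W3C//DTD HTML 4.01//EN";
    static final String TRANSITIONAL_PUBLIC_ID =
            "-//W3C//DTD HTML 4.01 Transitional//EN";

    /**
     * Whether AUTO mode would treat a doctype as a non-error.
     * Assumption: null means the id is absent from the doctype.
     */
    static boolean autoAccepts(String publicId, String systemId) {
        if (publicId == null) {
            // Plain <!DOCTYPE html> has neither a public nor a system id.
            return systemId == null;
        }
        if (STRICT_PUBLIC_ID.equals(publicId)) {
            // The Strict public id is accepted with or without a system id.
            return true;
        }
        // The Transitional public id is accepted only with a system id.
        return TRANSITIONAL_PUBLIC_ID.equals(publicId) && systemId != null;
    }

    /** Whether the HTML4-specific additional errors are switched on. */
    static boolean html4ErrorsEnabled(Mode mode, String publicId) {
        switch (mode) {
            case HTML5:
                // Assumption: a pure HTML5 parser never reports them.
                return false;
            case HTML401_TRANSITIONAL:
            case HTML401_STRICT:
                // Fixed HTML4 modes: on regardless of doctype.
                return true;
            case AUTO:
                // AUTO: on only when the public id is an HTML 4.01 one.
                return STRICT_PUBLIC_ID.equals(publicId)
                        || TRANSITIONAL_PUBLIC_ID.equals(publicId);
            default:
                throw new AssertionError(mode);
        }
    }
}
```

Under these assumptions, a document with the Transitional public id but no system id is a doctype error in AUTO mode, yet it still gets the HTML4-specific errors because that switch depends on the public id alone.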

Does this seem reasonable? Are there additional modes that would be
such low-hanging fruit that I should offer them as well? On the other
hand, is there something wrong with offering these modes?

Note that not providing modes for Appendix C checking is a deliberate  
choice to better manage how I use my time.

-- 
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/




