[whatwg] Forbidden characters in text/html

Ian Hickson ian at hixie.ch
Fri Mar 10 17:21:41 PST 2006


On Sat, 25 Feb 2006, Henri Sivonen wrote:
>
> On Feb 25, 2006, at 02:02, Ian Hickson wrote:
> 
> > On Sat, 23 Jul 2005, Henri Sivonen wrote:
> > > 
> > > Which characters should a text/html HTML5 conformance checker consider
> > > forbidden? The same characters that are forbidden in XML 1.0 (\0, FF,
> > > etc.)? Or some other set?
> > 
> > In what context?
> 
> In the pre-parse Unicode character stream on one hand and in the 
> post-parse (that is, NCRs expanded) character data and attribute values 
> on the other. IIRC, in XML 1.0 (but not 1.1) the restrictions are the 
> same in both cases.

Well, the spec says to drop U+0000, and to handle U+000D such that it 
never appears in the parse stream; the post-parse data is just the DOM.
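
As a rough illustration of that pre-parse step, here is a minimal Python 
sketch. The function name preprocess is hypothetical, and the handling of 
U+000D (folding CRLF and lone CR into LF) is an assumption: the text above 
only says that U+000D must never reach the parse stream, not how.

```python
def preprocess(stream: str) -> str:
    """Sketch of a pre-parse cleanup pass over decoded characters."""
    # Drop U+0000 entirely, as described above.
    stream = stream.replace("\u0000", "")
    # ASSUMPTION: normalize CRLF pairs first, then any remaining lone CR,
    # to LF, so that U+000D never appears in the output.
    stream = stream.replace("\r\n", "\n").replace("\r", "\n")
    return stream
```

Dropping U+0000 before normalizing newlines means a CR and LF separated 
only by a NUL collapse into a single LF; a different ordering would be an 
equally plausible reading.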

Does that answer your question?

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


