[whatwg] Forbidden characters in text/html
Ian Hickson
ian at hixie.ch
Fri Mar 10 17:21:41 PST 2006
On Sat, 25 Feb 2006, Henri Sivonen wrote:
>
> On Feb 25, 2006, at 02:02, Ian Hickson wrote:
>
> > On Sat, 23 Jul 2005, Henri Sivonen wrote:
> > >
> > > Which characters should a text/html HTML5 conformance checker consider
> > > forbidden? The same characters that are forbidden in XML 1.0 (\0, FF,
> > > etc.)? Or some other set?
> >
> > In what context?
>
> In the pre-parse Unicode character stream on one hand and in the
> post-parse (that is NCRs expanded) character data and attribute values
> on the other. IIRC, in XML 1.0 (but not 1.1) the restrictions are the
> same in both cases.

Well, the spec says to drop U+0000, and do something with U+000D such that
U+000D never appears in the parse stream; the post-parse is just the DOM.
Does that answer your question?
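
As a rough illustration of the pre-parse step described above, here is a minimal sketch in Python. The function name preprocess is hypothetical, and the handling of U+000D is an assumption: the message only says the spec does "something" with it, so this sketch assumes CRLF pairs and lone CRs are each replaced by a single U+000A. Only the dropping of U+0000 is taken directly from the text.

```python
def preprocess(stream: str) -> str:
    """Hypothetical pre-parse filter for a decoded text/html character stream."""
    # Stated in the message: U+0000 is dropped from the stream.
    stream = stream.replace("\u0000", "")
    # Assumption (not stated in the message): CRLF and lone CR both
    # become LF, so U+000D never reaches the parser.
    stream = stream.replace("\r\n", "\n").replace("\r", "\n")
    return stream


if __name__ == "__main__":
    cleaned = preprocess("a\u0000b\r\nc\rd")
    print(repr(cleaned))
```

Under these assumptions, neither U+0000 nor U+000D can appear in the string handed to the parser. Characters produced later by expanding NCRs are not touched by this step; they end up in the DOM.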
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'