[whatwg] several messages about handling encodings in HTML

Ian Hickson ian at hixie.ch
Thu May 22 20:20:20 PDT 2008

On Tue, 4 Mar 2008, Øistein E. Andersen wrote:
> On Fri, 29 Feb 2008 01:21:20 +0000 (UTC), Ian Hickson wrote:
> > (I've made the characters not allowed in XML also not allowed in HTML, 
> > with the exception of some of the space characters which we need to 
> > have allowed for legacy reasons.)
> The C1 character U+0085 NEXT LINE (NEL) is also a Unicode space 
> character, and this one is neither disallowed nor discouraged in XML as 
> far as I can tell.  I am not sure if we really want to support this 
> character, though; Opera, Safari and Firefox do not seem to recognise it 
> at all, and one IE7 installation seems to treat it as a non-breakable 
> wide space, but this may well be font-dependent.  (Allowing this 
> character could be confusing given that … does not refer to U+0085, 
> but rather to an ellipsis for compatibility with Windows-1252.)

I consciously excluded it.
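
For readers who want to see the Windows-1252 point from the quoted paragraph in practice, here is a minimal sketch. It assumes the reference is spelled "&#x85;" (the archived text above no longer shows the original spelling), and it uses Python's html.unescape only as a convenient stand-in for an HTML5-style parser, not as a statement about any particular browser.

```python
import html

# U+0085 NEXT LINE (NEL), written as a literal character.
nel = "\u0085"

# Assumption: "&#x85;" is the spelling of the character reference.
# html.unescape applies the HTML5 compatibility remapping for 0x80-0x9F,
# so this reference does not yield U+0085.
decoded = html.unescape("&#x85;")

# Byte 0x85 decoded as Windows-1252, for comparison.
from_cp1252 = b"\x85".decode("cp1252")

print(f"literal NEL: U+{ord(nel):04X}")
print(f"&#x85;:      U+{ord(decoded):04X}")
print(f"cp1252 0x85: U+{ord(from_cp1252):04X}")
```

The reference and the Windows-1252 byte decode to the same code point, which differs from the literal NEL; that mismatch is the source of the confusion mentioned above.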

> More importantly, the current draft seems to allow C0 (not only white 
> space) controls and delete, as well as U+FDD0 to U+FDDF and the 
> non-characters *FE and *FF when these are expressed as character 
> references.  Would it be possible to (dis)allow the same set of 
> characters in both cases?

Yeah, this was fixed yesterday.
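
As an illustration only, the set of code points named in the quoted paragraph can be sketched as a single predicate that could be applied both to literal characters and to character references. The function name and the choice of which C0 controls count as white space (tab, LF, FF, CR) are assumptions made for this sketch; the ranges are taken from the message as written, not from the spec text.

```python
# Assumption: these C0 controls are the "white space" ones that stay allowed.
ASSUMED_C0_WHITE_SPACE = {0x09, 0x0A, 0x0C, 0x0D}


def named_in_message(cp: int) -> bool:
    """Return True if cp falls in one of the groups listed in the message."""
    if cp <= 0x1F and cp not in ASSUMED_C0_WHITE_SPACE:
        return True  # C0 controls other than white space
    if cp == 0x7F:
        return True  # delete
    if 0xFDD0 <= cp <= 0xFDDF:
        return True  # range exactly as quoted in the message
    if (cp & 0xFFFF) in (0xFFFE, 0xFFFF):
        return True  # the *FE and *FF non-characters in every plane
    return False


def check_text(text: str) -> list[int]:
    """Collect the offending code points found in a decoded string."""
    return [ord(ch) for ch in text if named_in_message(ord(ch))]
```

Using one predicate for both the raw input stream and decoded character references is one way to keep the two cases consistent, which is what the question above asks for.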

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
