[whatwg] several messages about handling encodings in HTML

Øistein E. Andersen html5 at xn--istein-9xa.com
Mon Mar 3 16:53:55 PST 2008

On Fri, 29 Feb 2008 01:21:20 +0000 (UTC), Ian Hickson wrote:

> (I've made the characters not allowed in XML also not allowed in HTML, 
> with the exception of some of the space characters which we need to have 
> allowed for legacy reasons.)

The C1 character U+0085 NEXT LINE (NEL) is also a Unicode space character,
and this one is neither disallowed nor discouraged in XML as far as
I can tell.  I am not sure if we really want to support this character, though;
Opera, Safari and Firefox do not seem to recognise it at all, and one IE7
installation seems to treat it as a non-breakable wide space, but this may well
be font-dependent.  (Allowing this character could be confusing given that
&#x85; does not refer to U+0085, but rather to an ellipsis for compatibility
with Windows-1252.)
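
A minimal sketch of that mismatch, assuming Python's standard library is an
acceptable stand-in for browser behaviour (it is not taken from the draft
itself): byte 0x85 decodes under Windows-1252 to U+2026, and html.unescape,
which applies HTML5-style remapping of numeric references, does the same for
the reference form.  The variable names are illustrative only.

```python
import html
import unicodedata

NEL = "\u0085"  # NEXT LINE (NEL), a C1 control character

# U+0085 has general category Cc, yet Python's str.isspace() counts it as
# white space because of its Unicode bidirectional class.
nel_category = unicodedata.category(NEL)
nel_is_space = NEL.isspace()

# Windows-1252 assigns byte 0x85 to HORIZONTAL ELLIPSIS, not to NEL.
cp1252_char = bytes([0x85]).decode("cp1252")

# html.unescape remaps the numeric reference for 0x85 in the same way,
# so the reference does not yield U+0085.
ref_char = html.unescape("&#x85;")

print(nel_category, nel_is_space)
print(unicodedata.name(cp1252_char))
print(ref_char == cp1252_char, ref_char == NEL)
```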

More importantly, the current draft seems to allow C0 controls (not only the
white-space ones) and DELETE, as well as U+FDD0 to U+FDDF and the non-characters
*FFFE and *FFFF, when these are expressed as character references.  Would it be
possible to (dis)allow the same set of characters in both cases, i.e. whether
they appear literally or as character references?
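
To make the request concrete, here is a rough sketch of one predicate applied
to both literal characters and numeric character references.  Everything in
it is an assumption for illustration: the names is_questionable and
questionable_code_points are hypothetical, the exempt white-space controls
(tab, LF, FF, CR) are a guess at the "legacy" set, and the U+FDD0 to U+FDDF
range is copied from this message rather than from the draft.

```python
import re

# Assumed exemptions: the C0 controls treated as white space (hypothetical set).
C0_WHITESPACE = {0x09, 0x0A, 0x0C, 0x0D}

# Matches hexadecimal (&#x..;) and decimal (&#..;) numeric character references.
NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def is_questionable(cp: int) -> bool:
    """Return True for code points in the sets discussed in this message."""
    if cp < 0x20 and cp not in C0_WHITESPACE:
        return True                      # C0 controls other than white space
    if cp == 0x7F:
        return True                      # DELETE
    if 0xFDD0 <= cp <= 0xFDDF:
        return True                      # range as given in this message
    if (cp & 0xFFFE) == 0xFFFE:
        return True                      # last two code points of each plane
    return False


def questionable_code_points(markup: str) -> list[int]:
    """Collect questionable code points, whether literal or referenced."""
    found = set()
    for ch in markup:
        if is_questionable(ord(ch)):
            found.add(ord(ch))
    for match in NUMERIC_REF.finditer(markup):
        hex_digits, dec_digits = match.groups()
        cp = int(hex_digits, 16) if hex_digits else int(dec_digits)
        if is_questionable(cp):
            found.add(cp)
    return sorted(found)


sample = "a\x01b&#x7F;&#65534;\t"
print([hex(cp) for cp in questionable_code_points(sample)])
```

Using a single predicate for both paths is the point of the sketch: a
character rejected in literal form is then rejected as a reference too.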

Øistein E. Andersen
