[whatwg] several messages about handling encodings in HTML

Øistein E. Andersen html5 at xn--istein-9xa.com
Mon Mar 3 16:53:55 PST 2008

On Fri, 29 Feb 2008 01:21:20 +0000 (UTC), Ian Hickson wrote:

> (I've made the characters not allowed in XML also not allowed in HTML, 
> with the exception of some of the space characters which we need to have 
> allowed for legacy reasons.)

The C1 character U+0085 NEXT LINE (NEL) is also a Unicode space character,
and this one is neither disallowed nor discouraged in XML as far as
I can tell.  I am not sure if we really want to support this character, though;
Opera, Safari and Firefox do not seem to recognise it at all, and one IE7
installation seems to treat it as a non-breakable wide space, but this may well
be font-dependent.  (Allowing this character could be confusing given that
the character reference &#x85; does not refer to U+0085, but rather to an
ellipsis for compatibility with Windows-1252.)
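
The two behaviours described above can be checked with a short Python sketch.
It assumes that Python's standard html.unescape applies the same
Windows-1252 remapping to numeric character references that browsers use, and
it writes the reference to code point 0x85 as &#x85;. The variable names are
illustrative only.

```python
import html
import unicodedata

# U+0085 NEXT LINE (NEL), written literally.
nel = "\u0085"

# NEL is a C1 control (general category "Cc") that Unicode also treats as
# white space, so str.isspace() reports True for it.
nel_category = unicodedata.category(nel)
nel_is_space = nel.isspace()

# A numeric character reference to 0x85 is not decoded to U+0085: it is
# remapped to U+2026 HORIZONTAL ELLIPSIS, the character at byte 0x85 in
# Windows-1252.
decoded = html.unescape("&#x85;")
decoded_name = unicodedata.name(decoded)

print(nel_category, nel_is_space, f"U+{ord(decoded):04X}", decoded_name)
```

The literal character and the character reference therefore denote different
code points, which is the source of the possible confusion.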

More importantly, the current draft seems to allow C0 controls (not only the
white-space ones) and DELETE, as well as U+FDD0 to U+FDDF and the non-characters
*FFFE and *FFFF, when these are expressed as character references.  Would it be
possible to allow or disallow the same set of characters in both cases, i.e.
whether they appear literally or as character references?
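
As a rough illustration of what "the same set in both cases" could mean, the
Python sketch below applies one predicate both to literal characters and to
numeric character references. The chosen set is an assumption made for the
example, not what the draft specifies: the XML 1.0 Char production minus the
Unicode noncharacters. Note that the Unicode Standard defines the contiguous
noncharacter block as U+FDD0 to U+FDEF, which is wider than the range quoted
above, and that XML 1.0 allows (but discourages) DELETE. All function names
are hypothetical.

```python
import re

def is_xml10_char(cp: int) -> bool:
    """True if cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )

def is_noncharacter(cp: int) -> bool:
    """True for U+FDD0..U+FDEF and for code points ending in FFFE or FFFF."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE

def is_allowed(cp: int) -> bool:
    # Assumed policy for this sketch: one set, used for both cases.
    return is_xml10_char(cp) and not is_noncharacter(cp)

# Decimal or hexadecimal numeric character references, e.g. &#133; or &#x85;
CHARREF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")

def disallowed_code_points(markup: str) -> list[int]:
    """Return disallowed code points, whether literal or referenced."""
    bad = []
    pos = 0
    for m in CHARREF.finditer(markup):
        # Literal characters between the previous reference and this one.
        bad += [ord(c) for c in markup[pos:m.start()] if not is_allowed(ord(c))]
        # The code point named by the reference itself.
        cp = int(m.group(1), 16) if m.group(1) else int(m.group(2))
        if not is_allowed(cp):
            bad.append(cp)
        pos = m.end()
    bad += [ord(c) for c in markup[pos:] if not is_allowed(ord(c))]
    return bad

sample = "a\x01b&#x1;c&#xFDD0;d&#x85;"
print([f"U+{cp:04X}" for cp in disallowed_code_points(sample)])
```

With this policy, U+0001 is reported both in its literal form and as &#x1;,
while a reference to 0x85 is accepted because U+0085 is a valid XML 1.0 Char.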

Øistein E. Andersen

