[whatwg] 9.2.2: replacement characters. How many?
elharo at metalab.unc.edu
Fri Nov 3 03:52:17 PST 2006
Section 9.2.2 of the current Web Apps 1.0 draft states:
> Bytes or sequences of bytes in the original byte stream that could not
> be converted to Unicode characters must be converted to U+FFFD
> REPLACEMENT CHARACTER code points.
I'm concerned about the "or". For example, suppose there are six high
surrogates (the upper halves of Unicode surrogate pairs) in a row and no
low surrogates. Does that turn into six replacement characters or one?
Both interpretations seem plausible.
I suppose I prefer six rather than one, but I don't care a great deal as
long as this is locked down one way or the other.
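
To make the two readings concrete, here is a minimal Python sketch. It
assumes the byte stream is big-endian UTF-16, which the draft text does not
specify; the function name decode_utf16be and the collapse_runs flag are
hypothetical, invented for illustration only.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def decode_utf16be(data: bytes, collapse_runs: bool = False) -> str:
    """Decode big-endian UTF-16, replacing unpaired surrogates with U+FFFD.

    collapse_runs=False: one U+FFFD per unpaired surrogate ("bytes").
    collapse_runs=True:  one U+FFFD per maximal run of unpaired
                         surrogates ("sequences of bytes").
    """
    # Split into 16-bit code units; a trailing odd byte is ignored in this sketch.
    units = [int.from_bytes(data[i:i + 2], "big")
             for i in range(0, len(data) - 1, 2)]
    out = []
    in_bad_run = False
    i = 0
    while i < len(units):
        u = units[i]
        nxt = units[i + 1] if i + 1 < len(units) else None
        if 0xD800 <= u <= 0xDBFF and nxt is not None and 0xDC00 <= nxt <= 0xDFFF:
            # Well-formed pair: combine into one supplementary code point.
            out.append(chr(0x10000 + ((u - 0xD800) << 10) + (nxt - 0xDC00)))
            in_bad_run = False
            i += 2
        elif 0xD800 <= u <= 0xDFFF:
            # Unpaired surrogate: emit U+FFFD unless we are collapsing
            # and already emitted one for the current run.
            if not (collapse_runs and in_bad_run):
                out.append(REPLACEMENT)
            in_bad_run = True
            i += 1
        else:
            # Ordinary BMP code unit.
            out.append(chr(u))
            in_bad_run = False
            i += 1
    return "".join(out)


six_high = b"\xd8\x00" * 6  # six high surrogates, no low surrogates
print(len(decode_utf16be(six_high)))                      # per-surrogate reading
print(len(decode_utf16be(six_high, collapse_runs=True)))  # per-run reading
```

Under the first reading the six surrogates become six replacement
characters; under the second they become one. Both functions of the flag
satisfy the quoted sentence as written, which is the ambiguity in question.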
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!