[whatwg] CR "entities" and LFCR
Henri Sivonen
hsivonen at iki.fi
Fri Jun 8 05:17:31 PDT 2007
On Jun 7, 2007, at 15:00, Anne van Kesteren wrote:
> These should be converted to LF too. One thing that might be
> interesting to look into is the handling of LFCR in browsers (as
> opposed to CRLF). I haven't done that yet... Some browsers (just
> tested Opera) also normalize two newline entities following each
> other (CRLF pair).
Converting escaped CRs to LF requires more code. I haven't analyzed
the perf impact, but intuitively it requires either a naïve and
inefficient buffer retraversal in the tree builder or additional
complexity in the tokenizer's buffer management (assuming the
tokenizer is doing efficient buffering to begin with).
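For comparison, normalizing literal CR and CRLF on the input stream is cheap: it needs one bit of state carried across buffers. The sketch below illustrates that baseline. The class and method names are made up for illustration and are not taken from any real parser. Character references are resolved later, inside the tokenizer, so they never pass through this step; that is where the extra code would have to go.

```python
class NewlineNormalizer:
    """Streaming CR / CRLF -> LF normalizer for literal input characters.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        # True when the last character seen was CR, so that an LF
        # arriving next (possibly in the next buffer) is dropped.
        self.prev_was_cr = False

    def feed(self, chunk: str) -> str:
        out = []
        for ch in chunk:
            if ch == "\r":
                # Every CR becomes LF immediately.
                out.append("\n")
                self.prev_was_cr = True
            elif ch == "\n":
                # An LF right after a CR is the second half of CRLF.
                if not self.prev_was_cr:
                    out.append("\n")
                self.prev_was_cr = False
            else:
                out.append(ch)
                self.prev_was_cr = False
        return "".join(out)
```

Note that with this scheme LFCR is not a pair: the LF and the CR each produce their own LF. Collapsing LFCR would be additional special handling on top of this.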
You can't protect the DOM from getting CRs if someone insists on
putting them there using JS or XML. Is it worthwhile to prevent
escaped CRs from ending up in the DOM as CRs in HTML? Is special
handling required for compat?
I'd try doing exactly what XML does here unless compat requires
otherwise.
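To make "what XML does" concrete: XML normalizes literal CRLF and lone CR to LF before parsing, but a CR written as a numeric character reference is expanded afterwards and stays a CR. A rough sketch of that ordering follows. The function name is hypothetical, and the regex only covers numeric character references, not a full XML parser.

```python
import re

# Numeric character references as XML spells them: &#13; or &#xD;
_NUMERIC_REF = re.compile(r"&#(x[0-9A-Fa-f]+|[0-9]+);")


def xml_like_text(source: str) -> str:
    """Illustrative only: line-end normalization, then reference expansion."""
    # Step 1: normalize literal line ends (CRLF and lone CR become LF).
    normalized = source.replace("\r\n", "\n").replace("\r", "\n")

    # Step 2: expand numeric character references. The results are NOT
    # run through step 1 again, so an escaped CR survives as a CR.
    def expand(match):
        body = match.group(1)
        code = int(body[1:], 16) if body.startswith("x") else int(body)
        return chr(code)

    return _NUMERIC_REF.sub(expand, normalized)
```

With this ordering, a literal CRLF in the source yields a single LF, while `&#13;&#10;` yields CR followed by LF in the resulting text.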
--
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/