[whatwg] CR "entities" and LFCR

Michael A. Puls II shadow2531 at gmail.com
Fri Jun 8 11:53:54 PDT 2007


On 6/8/07, Anne van Kesteren <annevk at opera.com> wrote:
> On Thu, 07 Jun 2007 23:12:38 +0200, Michael A. Puls II
> <shadow2531 at gmail.com> wrote:
> > On 6/7/07, Anne van Kesteren <annevk at opera.com> wrote:
> >> These should be converted to LF too. One thing that might be interesting
> >> to look into is the handling of LFCR in browsers (as opposed to CRLF). I
> >> haven't done that yet... Some browsers (just tested Opera) also
> >> normalize
> >> two newline entities following each other (CRLF pair).
> >
> > Not sure if it'll help, but whenever I do newline normalization to LF, I:
> >
> > Convert all CR + LF pairs to LF.
> > Then, I convert any CRs left over to LF.
>
> Sure, that's what the specification says to do as well. I was wondering if
> some user agents do something special for LFCR. For instance, if I
> remember correctly using \n\r in JavaScript gives a single newline in
> Firefox and two in Opera.

I believe Boris told me that for FF, newline normalization (including
entities) is only done when parsing into the DOM, and that setting a
string property from JS does no newline normalization at all. So, if
you set \n\r, \n\r is stored as-is (which is visually equivalent to
having 2 newlines), and if any normalization is needed, it has to be
done by the author of the JS code.
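
For an author who does need to normalize in script, the two-pass
approach quoted earlier (CR + LF pairs first, then leftover CRs) could
be sketched roughly like this. The function name normalizeNewlines is
made up for illustration and is not part of any browser API.

```javascript
// Hypothetical helper (name is an assumption): normalize newlines to LF.
function normalizeNewlines(text) {
  // Pass 1: collapse every CR+LF pair into a single LF.
  const withoutPairs = text.replace(/\r\n/g, "\n");
  // Pass 2: convert any CR that is still left over into LF.
  return withoutPairs.replace(/\r/g, "\n");
}

// CRLF collapses to one LF; LFCR is not a pair, so it ends up as two LFs.
console.log(JSON.stringify(normalizeNewlines("a\r\nb"))); // "a\nb"
console.log(JSON.stringify(normalizeNewlines("a\n\rb"))); // "a\n\nb"
```

Note that with this order LFCR always yields two newlines, since only
CR followed by LF is treated as a pair.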

As a side note, when checking how newlines are stored in JS, I usually
do alert(encodeURIComponent(element.nodeValue)), for example, so I
can see for sure which newline characters are present.
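
A small sketch of that check outside the browser, using plain strings
in place of element.nodeValue and console.log in place of alert. The
helper name showNewlines is an assumption for illustration only.

```javascript
// Hypothetical helper (name is an assumption): percent-encode a string
// so that CR and LF show up as visible %0D and %0A sequences.
function showNewlines(value) {
  return encodeURIComponent(value);
}

console.log(showNewlines("a\n\rb")); // a%0A%0Db  (LF then CR)
console.log(showNewlines("a\r\nb")); // a%0D%0Ab  (CR then LF)
```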

-- 
Michael


