[whatwg] CR "entities" and LFCR

Kristof Zelechovski giecrilj at stegny.2a.pl
Thu Jun 7 23:24:09 PDT 2007


Reading a file in text mode ignores all carriage return control characters.
Stray carriage returns are ignored as well.  
I do not think Macintosh text files should be allowed on the Web without
encoding.
Chris

-----Original Message-----
From: whatwg-bounces at lists.whatwg.org
[mailto:whatwg-bounces at lists.whatwg.org] On Behalf Of Michel Fortin
Sent: Friday, June 08, 2007 3:19 AM
To: WHATWG List
Subject: Re: [whatwg] CR "entities" and LFCR

On 2007-06-07 at 17:12, Michael A. Puls II wrote:

> Not sure if it'll help, but whenever I do newline normalization to  
> LF, I:
>
> Convert all CR + LF pairs to LF.
> Then, I convert any CRs left over to LF.
>
> Examples:
>
> LF + CR + LF + CR -> LF + LF + LF.
>
> CR + CR + LF -> LF + LF.

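I think that's the standard way of doing it. To quote the Markdown
source code, and some Perl code found on Wikipedia [1]:

     s/(\r\n|\n|\r)/\n/g

It does exactly that: because \r\n comes first in the alternation, a
CR+LF pair is replaced as a unit, and any CR or LF left on its own is
replaced individually.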

  [1]: http://en.wikipedia.org/wiki/Newline#Conversion_utilities
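
For anyone who wants to try the examples quoted above, here is a minimal
sketch of the same normalization in Python rather than Perl. The function
names are made up for illustration and do not come from Markdown or
Wikipedia; the first mirrors the single substitution, the second mirrors
the two-step description (CR+LF pairs first, then leftover CRs).

```python
import re


def normalize_newlines(text: str) -> str:
    """Single-pass version (hypothetical name), mirroring the Perl substitution.

    \r\n is listed first so that a CR+LF pair is consumed as one newline
    before the lone-CR and lone-LF alternatives are tried.
    """
    return re.sub(r"\r\n|\n|\r", "\n", text)


def normalize_newlines_two_pass(text: str) -> str:
    """Two-pass version (hypothetical name): CR+LF pairs first, then stray CRs."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


if __name__ == "__main__":
    # LF + CR + LF + CR -> LF + LF + LF
    print(repr(normalize_newlines("\n\r\n\r")))
    # CR + CR + LF -> LF + LF
    print(repr(normalize_newlines("\r\r\n")))
```

Both versions should agree on any input, since a lone LF is left as LF
either way. Note that neither treats LF+CR as a unit: it comes out as two
newlines.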

Windows uses CR+LF, UNIX uses LF, legacy Mac applications still use
CR; but I'm not aware of any system using LF+CR (and there is none on  
Wikipedia) and I don't think it's useful to give a meaning to it.


Michel Fortin
michel.fortin at michelf.com
http://www.michelf.com/




