[whatwg] U+FEFF (BOM) stripping in UTF-16BE and UTF-16LE

Ian Hickson ian at hixie.ch
Mon Sep 14 20:42:31 PDT 2009


On Wed, 9 Sep 2009, Øistein E. Andersen wrote:
>
> § 9.2.2.2 "Preprocessing the input stream" requires that a leading 
> U+FEFF (byte order mark) be stripped irrespective of encoding, contra 
> Unicode, which says that a leading U+FEFF is part of the document when 
> the byte order is already established by other means.  This is probably 
> harmless and potentially useful to deal with mislabelled documents, but 
> it might be worth adding an explanatory note.

Fixed.
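
For illustration only, the behaviour described in the quoted text (drop a 
leading U+FEFF whatever the encoding) could be sketched roughly as below. 
This is a minimal sketch and not the spec's wording: it assumes the bytes 
have already been decoded to a string, and the helper name 
strip_leading_bom is hypothetical.

```python
BOM = "\ufeff"  # U+FEFF, byte order mark / zero width no-break space


def strip_leading_bom(text: str) -> str:
    """Drop one leading U+FEFF, regardless of the encoding the text came from."""
    return text[1:] if text.startswith(BOM) else text


# A byte stream explicitly labelled UTF-16BE whose first code unit is U+FEFF.
raw = b"\xfe\xff\x00h\x00i"

# Python's "utf-16-be" codec treats the byte order as already known,
# so it keeps the U+FEFF as part of the decoded text.
decoded = raw.decode("utf-16-be")

# The preprocessing step removes it anyway.
stripped = strip_leading_bom(decoded)
```

Only the first U+FEFF is removed here; any later occurrence is left in the 
text.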

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

