[whatwg] U+FEFF (BOM) stripping in UTF-16BE and UTF-16LE

Øistein E. Andersen liszt at coq.no
Tue Sep 8 16:09:09 PDT 2009


§ 9.2.2.2 "Preprocessing the input stream" requires that a leading
U+FEFF (byte order mark) be stripped irrespective of encoding, contra
Unicode, which says that a leading U+FEFF is part of the document when
the byte order is already established by other means.  This is
probably harmless and potentially useful for dealing with mislabelled
documents, but it might be worth adding an explanatory note.
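
To make the difference concrete, here is a minimal sketch in Python.
The function names are hypothetical and used only for illustration.
It relies on Python's "utf-16-be" codec, which treats the byte order
as already established and so keeps a leading U+FEFF as an ordinary
character; the second function then strips it unconditionally, which
is roughly what the preprocessing step asks for.

```python
BOM = "\ufeff"

def decode_keeping_bom(data: bytes, encoding: str) -> str:
    # Unicode-style: with an explicit UTF-16BE/LE label the byte order
    # is known, so a leading U+FEFF stays in the decoded text.
    return data.decode(encoding)

def decode_stripping_bom(data: bytes, encoding: str) -> str:
    # Sketch of the behaviour described above: drop one leading U+FEFF
    # after decoding, whatever the encoding was.
    text = data.decode(encoding)
    return text[1:] if text.startswith(BOM) else text

# "hi" in UTF-16BE, preceded by the bytes FE FF.
sample = b"\xfe\xff\x00h\x00i"

kept = decode_keeping_bom(sample, "utf-16-be")
stripped = decode_stripping_bom(sample, "utf-16-be")
print(len(kept), len(stripped))
```

The two results differ by exactly one character, the leading U+FEFF,
which is the discrepancy a note in the spec could explain.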

-- 
Øistein E. Andersen

