[html5] r1274 - /

Thu Feb 28 15:26:36 PST 2008

Author: ianh
Date: 2008-02-28 15:26:32 -0800 (Thu, 28 Feb 2008)
New Revision: 1274

Modified:
   index
   source
Log:
[e] (0) Remove 'BOM' from the table of encoding names. Add a note saying that encoding errors are still errors.
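
For readers unfamiliar with the step this table belongs to, here is a minimal sketch of how the byte prefixes map to encoding names. It is an illustration only, not text from the spec: the names BOM_TABLE and sniff_bom are hypothetical, and only the rows that are not commented out in the table are included.

```python
# Hypothetical illustration of the BOM lookup; not part of the spec text.
# Rows mirror the uncommented entries of the table: (byte prefix, encoding).
BOM_TABLE = [
    (b"\xFE\xFF", "UTF-16BE"),
    (b"\xFF\xFE", "UTF-16LE"),
    (b"\xEF\xBB\xBF", "UTF-8"),
]


def sniff_bom(data: bytes):
    """Return the encoding named by a leading BOM, or None if none matches."""
    for prefix, encoding in BOM_TABLE:
        if data.startswith(prefix):
            return encoding
    return None
```

Because the UTF-32 rows are commented out in the table, a stream starting with FF FE 00 00 matches the UTF-16LE row in this sketch.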

Modified: index
===================================================================

--- index	2008-02-28 22:52:47 UTC (rev 1273)
+++ index	2008-02-28 23:26:32 UTC (rev 1274)
@@ -38012,31 +38012,31 @@
       <tr>
        <th>Bytes in Hexadecimal
 
-       <th>Description
+       <th>Encoding
 
      <tbody><!-- nobody uses this
       <tr>
        <td>00 00 FE FF
-       <td>UTF-32BE BOM
+       <td>UTF-32BE
       <tr>
        <td>FF FE 00 00
-       <td>UTF-32LE BOM
+       <td>UTF-32LE
 -->
 
       <tr>
        <td>FE FF
 
-       <td>UTF-16BE BOM
+       <td>UTF-16BE
 
       <tr>
        <td>FF FE
 
-       <td>UTF-16LE BOM
+       <td>UTF-16LE
 
       <tr>
        <td>EF BB BF
 
-       <td>UTF-8 BOM <!-- nobody uses this
+       <td>UTF-8 <!-- nobody uses this
       <tr>
        <td>DD 73 66 73
        <td>UTF-EBCDIC
@@ -38044,6 +38044,8 @@
         
     </table>
 
+    <p class=note>This step looks for Unicode Byte Order Marks (BOMs).
+
    <li>
     <p>Otherwise, the user agent will have to search for explicit character
      encoding information in the file itself. This should proceed as follows:
@@ -38421,6 +38423,11 @@
    be converted to Unicode characters must be converted to U+FFFD REPLACEMENT
    CHARACTER code points.
 
+  <p class=note>Bytes or sequences of bytes in the original byte stream that
+   did not conform to the encoding specification (e.g. invalid UTF-8 byte
+   sequences in a UTF-8 input stream) are errors that conformance checkers
+   are expected to report.
+
   <p>One leading U+FEFF BYTE ORDER MARK character must be ignored if any are
    present.
 

Modified: source
===================================================================
--- source	2008-02-28 22:52:47 UTC (rev 1273)
+++ source	2008-02-28 23:26:32 UTC (rev 1274)
@@ -35544,25 +35544,25 @@
      <thead>
       <tr>
        <th>Bytes in Hexadecimal
-       <th>Description
+       <th>Encoding
      <tbody>
 <!-- nobody uses this
       <tr>
        <td>00 00 FE FF
-       <td>UTF-32BE BOM
+       <td>UTF-32BE
       <tr>
        <td>FF FE 00 00
-       <td>UTF-32LE BOM
+       <td>UTF-32LE
 -->
       <tr>
        <td>FE FF
-       <td>UTF-16BE BOM
+       <td>UTF-16BE
       <tr>
        <td>FF FE
-       <td>UTF-16LE BOM
+       <td>UTF-16LE
       <tr>
        <td>EF BB BF
-       <td>UTF-8 BOM
+       <td>UTF-8
 <!-- nobody uses this
       <tr>
        <td>DD 73 66 73
@@ -35570,6 +35570,9 @@
 -->
     </table>
 
+   <p class="note">This step looks for Unicode Byte Order Marks
+   (BOMs).</p></li>
+
    <li><p>Otherwise, the user agent will have to search for explicit
    character encoding information in the file itself. This should
    proceed as follows:
@@ -35979,6 +35982,11 @@
   could not be converted to Unicode characters must be converted to
   U+FFFD REPLACEMENT CHARACTER code points.</p>
 
+  <p class="note">Bytes or sequences of bytes in the original byte
+  stream that did not conform to the encoding specification
+  (e.g. invalid UTF-8 byte sequences in a UTF-8 input stream) are
+  errors that conformance checkers are expected to report.</p>
+
   <p>One leading U+FEFF BYTE ORDER MARK character must be ignored if
   any are present.</p>
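
The behaviour described in the last hunk (undecodable bytes become U+FFFD, the malformed input is still an error to report, and one leading U+FEFF is dropped) can be sketched roughly as below. This is an assumption-laden illustration using Python's built-in codecs: the function name decode_stream and its return shape are hypothetical, and how many U+FFFD characters Python emits per malformed sequence is Python's choice, not something this revision specifies.

```python
# Hypothetical sketch; decode_stream is not a name from the spec.
def decode_stream(data: bytes, encoding: str = "utf-8"):
    """Decode data, returning (text, had_errors)."""
    # errors="replace" substitutes U+FFFD for bytes that cannot be decoded.
    text = data.decode(encoding, errors="replace")

    # A strict decode of the same bytes tells us whether the input was
    # malformed, which a conformance checker would be expected to report.
    try:
        data.decode(encoding, errors="strict")
        had_errors = False
    except UnicodeDecodeError:
        had_errors = True

    # Ignore one leading U+FEFF BYTE ORDER MARK, and only one.
    if text.startswith("\ufeff"):
        text = text[1:]
    return text, had_errors
```

With the default "utf-8" codec, Python keeps a leading BOM as U+FEFF in the decoded text, which is why the sketch strips it explicitly; a second U+FEFF, if present, is left in place.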