[html5] r2802 - [] (0) Support BOMs in <script src=''> JS files. (credit: mp)

Thu Feb 12 02:46:18 PST 2009

Author: ianh
Date: 2009-02-12 02:46:18 -0800 (Thu, 12 Feb 2009)
New Revision: 2802

Modified:
   index
   source
Log:
[] (0) Support BOMs in <script src=''> JS files. (credit: mp)
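
For readers of this log, the BOM check added in this revision can be sketched roughly as follows. This is an illustrative sketch only, not text from the spec: the names BOM_TABLE and sniff_script_encoding are hypothetical, and mapping the spec's encoding labels to Python codec names ("utf-16-be", "utf-16-le", "utf-8") is an assumption. Only the three rows that are not commented out in the table are included.

```python
# Hypothetical sketch of the BOM step described in this revision.
# Rows are in the same order as the spec table; the UTF-32 and
# UTF-EBCDIC rows are commented out in the spec and omitted here.
BOM_TABLE = [
    (b"\xFE\xFF", "utf-16-be"),
    (b"\xFF\xFE", "utf-16-le"),
    (b"\xEF\xBB\xBF", "utf-8"),
]


def sniff_script_encoding(data: bytes, declared: str) -> str:
    """Return the encoding to use for an external script file.

    `declared` stands in for the script block's character encoding
    as it was before this step (an assumption of this sketch).
    """
    encoding = declared
    for bom, bom_encoding in BOM_TABLE:
        # The file needs at least as many bytes as the row's first
        # column, and its first bytes must match them.
        if len(data) >= len(bom) and data.startswith(bom):
            # A match overrides any previous value.
            encoding = bom_encoding
    return encoding


if __name__ == "__main__":
    print(sniff_script_encoding(b"\xEF\xBB\xBFvar x = 1;", "windows-1252"))
    print(sniff_script_encoding(b"var x = 1;", "windows-1252"))
```

The three byte sequences start with different bytes, so at most one row can match and the top-to-bottom order does not change the result in this sketch. What happens to the BOM character itself after conversion is not covered by the text in this revision, so the sketch stops at choosing the encoding.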

Modified: index
===================================================================

--- index	2009-02-12 10:36:07 UTC (rev 2801)
+++ index	2009-02-12 10:46:18 UTC (rev 2802)
@@ -5061,6 +5061,7 @@
     <p>If <var title="">n</var> is 4 or more, and the first bytes of
     the resource match one of the following byte sets:</p>
 
+    <!-- this table is present in several forms in this file; keep them in sync -->
     <table><thead><tr><th>Bytes in Hexadecimal
        <th>Description
      <tbody><tr><td>FE FF
@@ -10288,8 +10289,40 @@
         <p>The contents of that file, interpreted as string of
         Unicode characters, are the script source.</p>
 
-        <p>The file must be converted to Unicode using the character
-        encoding given by <var><a href="#the-script-block's-character-encoding">the script block's character
+        <p>For each of the rows in the following table, starting with
+        the first one and going down, if the file has as many or more
+        bytes available than the number of bytes in the first column,
+        and the first bytes of the file match the bytes given in the
+        first column, then set <var><a href="#the-script-block's-character-encoding">the script block's character
+        encoding</a></var> to the encoding given in the cell in the second
+        column of that row, irrespective of any previous value:</p>
+
+        <!-- this table is present in several forms in this file; keep them in sync -->
+        <table><thead><tr><th>Bytes in Hexadecimal
+           <th>Encoding
+         <tbody><!-- nobody uses this
+          <tr>
+           <td>00 00 FE FF
+           <td>UTF-32BE
+          <tr>
+           <td>FF FE 00 00
+           <td>UTF-32LE
+--><tr><td>FE FF
+           <td>UTF-16BE
+          <tr><td>FF FE
+           <td>UTF-16LE
+          <tr><td>EF BB BF
+           <td>UTF-8
+<!-- nobody uses this
+          <tr>
+           <td>DD 73 66 73
+           <td>UTF-EBCDIC
+-->
+        </table><p class=note>This step looks for Unicode Byte Order Marks
+        (BOMs).</p>
+
+        <p>The file must then be converted to Unicode using the
+        character encoding given by <var><a href="#the-script-block's-character-encoding">the script block's character
         encoding</a></var>.</p>
 
        </dd>
@@ -47971,6 +48004,7 @@
    that row, with the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a>
    <i>certain</i>, and abort these steps:</p>
 
+    <!-- this table is present in several forms in this file; keep them in sync -->
     <table><thead><tr><th>Bytes in Hexadecimal
        <th>Encoding
      <tbody><!-- nobody uses this

Modified: source
===================================================================
--- source	2009-02-12 10:36:07 UTC (rev 2801)
+++ source	2009-02-12 10:46:18 UTC (rev 2802)
@@ -4749,6 +4749,7 @@
     <p>If <var title="">n</var> is 4 or more, and the first bytes of
     the resource match one of the following byte sets:</p>
 
+    <!-- this table is present in several forms in this file; keep them in sync -->
     <table>
      <thead>
       <tr>
@@ -10831,8 +10832,50 @@
         <p>The contents of that file, interpreted as string of
         Unicode characters, are the script source.</p>
 
-        <p>The file must be converted to Unicode using the character
-        encoding given by <var>the script block's character
+        <p>For each of the rows in the following table, starting with
+        the first one and going down, if the file has as many or more
+        bytes available than the number of bytes in the first column,
+        and the first bytes of the file match the bytes given in the
+        first column, then set <var>the script block's character
+        encoding</var> to the encoding given in the cell in the second
+        column of that row, irrespective of any previous value:</p>
+
+        <!-- this table is present in several forms in this file; keep them in sync -->
+        <table>
+         <thead>
+          <tr>
+           <th>Bytes in Hexadecimal
+           <th>Encoding
+         <tbody>
+<!-- nobody uses this
+          <tr>
+           <td>00 00 FE FF
+           <td>UTF-32BE
+          <tr>
+           <td>FF FE 00 00
+           <td>UTF-32LE
+-->
+          <tr>
+           <td>FE FF
+           <td>UTF-16BE
+          <tr>
+           <td>FF FE
+           <td>UTF-16LE
+          <tr>
+           <td>EF BB BF
+           <td>UTF-8
+<!-- nobody uses this
+          <tr>
+           <td>DD 73 66 73
+           <td>UTF-EBCDIC
+-->
+        </table>
+
+        <p class="note">This step looks for Unicode Byte Order Marks
+        (BOMs).</p>
+
+        <p>The file must then be converted to Unicode using the
+        character encoding given by <var>the script block's character
         encoding</var>.</p>
 
        </dd>
@@ -54791,6 +54834,7 @@
    title="concept-encoding-confidence">confidence</span>
    <i>certain</i>, and abort these steps:</p>
 
+    <!-- this table is present in several forms in this file; keep them in sync -->
     <table>
      <thead>
       <tr>