[html5] r1016 - /

Tue Aug 21 17:11:21 PDT 2007

Author: ianh
Date: 2007-08-21 17:11:21 -0700 (Tue, 21 Aug 2007)
New Revision: 1016

Modified:
   index
   source
Log:
[] (0) Make the patterns in the sniffing section support whitespace instead of having asterisks. Also add some notes in other parts of the spec.

Modified: index
===================================================================

--- index	2007-08-21 07:37:43 UTC (rev 1015)
+++ index	2007-08-22 00:11:21 UTC (rev 1016)
@@ -22,7 +22,7 @@
 
    <h1 id=html-5>HTML 5</h1>
 
-   <h2 class="no-num no-toc" id=working>Working Draft — 21 August 2007</h2>
+   <h2 class="no-num no-toc" id=working>Working Draft — 22 August 2007</h2>
 
    <p>You can take part in this work. <a
     href="http://www.whatwg.org/mailing-list">Join the working group's
@@ -3431,6 +3431,9 @@
    that have all the classes specified in that array. If the array is empty,
    then the method must return an empty <code>NodeList</code>.
 
+  <p class=big-issue>getElementsByClassName() will probably be changed to
+   take a space-separated list of tokens as a single string, not an array.
+
   <p>HTML, XHTML, SVG and MathML elements define which classes they are in by
    having an attribute in the per-element partition with the name <code
    title="">class</code> containing a space-separated list of classes to
@@ -26154,7 +26157,11 @@
     </table>
 
     <p>...then the sniffed type of the resource is "text/plain".
+  </ol>
 
+  <p class=big-issue>Should we remove UTF-32 from the above?
+
+  <ul>
    <li>
     <p>Otherwise, if any of the first <var title="">n</var> bytes of the
      resource are in one of the following byte ranges:</p>
@@ -26173,11 +26180,14 @@
     </ul>
 
     <p>...then the sniffed type of the resource is
-     "application/octet-stream".
+     "application/octet-stream".</p>
 
+    <p class=big-issue>maybe we should invoke the "Content-Type sniffing:
+     image" section now, falling back on "application/octet-stream".</p>
+
    <li>
     <p>Otherwise, the sniffed type of the resource is "text/plain".
-  </ol>
+  </ul>
 
   <h4 id=content-type1><span class=secno>4.7.2. </span><dfn
    id=content-type5>Content-Type sniffing: unknown type</dfn></h4>
@@ -26195,7 +26205,7 @@
     <p>For each row in the table below:</p>
 
     <dl class=switch>
-     <dt>If the row has no bytes with a trailing asterisk:
+     <dt>If the row has no "<em>WS</em>" bytes:
 
      <dd>
       <ol>
@@ -26217,7 +26227,7 @@
         steps.
       </ol>
 
-     <dt>If the row has an asterisk after one of the bytes:
+     <dt>If the row has a "<em>WS</em>" byte:
 
      <dd>
       <ol>
@@ -26240,7 +26250,7 @@
 
         <dl class=switch>
          <dt>If the <var title="">index<sub>stream</sub></var>th byte of the
-          pattern does not have an asterisk after:
+          pattern is a normal hexadecimal byte and not a "<em>WS</em>" byte:
 
          <dd>
           <p>If the "and" operator, applied to the <var
@@ -26256,21 +26266,21 @@
            stream.</p>
 
         <dt>Otherwise, if the <var title="">index<sub>stream</sub></var>th
-          byte of the pattern <em>does</em> have an asterisk after it:
+          byte of the pattern is a "<em>WS</em>" byte:
 
          <dd>
-          <p>If the "and" operator, applied to the <var
-           title="">index<sub>stream</sub></var>th byte of the stream and the
-           <var title="">index<sub>pattern</sub></var>th byte of the mask,
-           yield a value different that the <var
-           title="">index<sub>pattern</sub></var>th byte of the pattern, then
-           increment only the <var title="">index<sub>pattern</sub></var> to
-           the next byte in the mask and pattern and jump back to the
-           <em>loop</em> step in this algorithm.</p>
+          <p>"<em>WS</em>" means "whitespace", and allows insignificant
+           whitespace to be skipped when sniffing for a type signature.</p>
 
+          <p>If the <var title="">index<sub>stream</sub></var>th byte of the
+           stream is one of 0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0B (ASCII
+           VT), 0x0C (ASCII FF), 0x0D (ASCII CR), or 0x20 (ASCII space), then
+           increment only the <var title="">index<sub>stream</sub></var> to
+           the next byte in the byte stream.</p>
+
           <p>Otherwise, increment only the <var
-           title="">index<sub>stream</sub></var> to the next byte in the byte
-           stream.</p>
+           title="">index<sub>pattern</sub></var> to the next byte in the
+           mask and pattern.</p>
         </dl>
 
        <li>
@@ -26317,9 +26327,10 @@
       compatible encodings, case-insensitively.
 
     <tr>
-     <td>FF* FF DF DF DF DF
+     <td>FF FF DF DF DF DF
 
-     <td>20* 3C 48 54 4D 4C <!-- "<HTML" --> <!-- common in static data -->
+     <td><em>WS</em> 3C 48 54 4D 4C <!-- "<HTML" -->
+      <!-- common in static data -->
 
      <td>text/html
 
@@ -26328,9 +26339,10 @@
       
 
     <tr>
-     <td>FF* FF DF DF DF DF
+     <td>FF FF DF DF DF DF
 
-     <td>20* 3C 48 45 41 44 <!-- "<HEAD" --> <!-- common in static data -->
+     <td><em>WS</em> 3C 48 45 41 44 <!-- "<HEAD" -->
+      <!-- common in static data -->
 
      <td>text/html
 
@@ -26339,9 +26351,9 @@
       
 
     <tr>
-     <td>FF* FF DF DF DF DF DF DF
+     <td>FF FF DF DF DF DF DF DF
 
-     <td>20* 3C 53 43 52 49 50 54 <!-- "<SCRIPT" -->
+     <td><em>WS</em> 3C 53 43 52 49 50 54 <!-- "<SCRIPT" -->
       <!-- common in dynamic data -->
 
      <td>text/html
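
Note: the whitespace-skipping loop described in the hunks above can be
sketched in Python as follows. This is a minimal, non-normative sketch under
stated assumptions, not the spec's algorithm verbatim. The names WS,
WHITESPACE, row_matches, HTML_MASK and HTML_PATTERN are hypothetical, a "WS"
entry is represented by None, and treating a stream that ends before the
pattern is exhausted as "no match" is an assumption, since those steps are
not shown in this diff.

```python
# Hypothetical marker standing in for a "WS" entry in a pattern row.
WS = None

# The six bytes the revised text treats as whitespace:
# TAB, LF, VT, FF, CR and space.
WHITESPACE = frozenset({0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x20})


def row_matches(stream, mask, pattern):
    """Return True if the start of `stream` matches one table row.

    `mask` and `pattern` are equal-length lists; a pattern entry is either
    an int (a normal hexadecimal byte) or WS. The mask entry that lines up
    with a WS entry is never consulted.
    """
    index_stream = 0
    index_pattern = 0
    while index_pattern < len(pattern):
        if index_stream >= len(stream):
            # Assumption: running out of bytes means the row does not match.
            return False
        byte = stream[index_stream]
        if pattern[index_pattern] is WS:
            if byte in WHITESPACE:
                # Skip insignificant whitespace: advance the stream only.
                index_stream += 1
            else:
                # Whitespace run is over: advance the pattern only.
                index_pattern += 1
        else:
            # Normal byte: compare after applying the mask.
            if byte & mask[index_pattern] != pattern[index_pattern]:
                return False
            index_stream += 1
            index_pattern += 1
    return True


# The "<HTML" row from the table: mask FF FF DF DF DF DF,
# pattern WS 3C 48 54 4D 4C. The 0xDF mask clears bit 0x20, which makes
# the comparison of ASCII letters case-insensitive.
HTML_MASK = [0xFF, 0xFF, 0xDF, 0xDF, 0xDF, 0xDF]
HTML_PATTERN = [WS, 0x3C, 0x48, 0x54, 0x4D, 0x4C]
```

With this row, row_matches(b"  \n<html>", HTML_MASK, HTML_PATTERN) is true,
while b"<head>" is rejected because 0x65 & 0xDF yields 0x45, not 0x54.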

Modified: source
===================================================================
--- source	2007-08-21 07:37:43 UTC (rev 1015)
+++ source	2007-08-22 00:11:21 UTC (rev 1016)
@@ -1960,6 +1960,10 @@
   array is empty, then the method must return an empty
   <code>NodeList</code>.</p>
 
+  <p class="big-issue">getElementsByClassName() will probably be
+  changed to take a space-separated list of tokens as a single string,
+  not an array.</p>
+
   <p>HTML, XHTML, SVG and MathML elements define which classes they
   are in by having an attribute in the per-element partition with the
   name <code title="">class</code> containing a space-separated list
@@ -23662,6 +23666,8 @@
 
    <p>...then the sniffed type of the resource is "text/plain".</p></li>
 
+   <p class="big-issue">Should we remove UTF-32 from the above?</p>
+
    <li><p>Otherwise, if any of the first <var title="">n</var> bytes
    of the resource are in one of the following byte ranges:</p>
 
@@ -23678,8 +23684,14 @@
     </ul>
 
    <p>...then the sniffed type of the resource is
-   "application/octet-stream".</p></li>
+   "application/octet-stream".</p>
 
+   <p class="big-issue">maybe we should invoke the "Content-Type
+   sniffing: image" section now, falling back on
+   "application/octet-stream".</p>
+
+   </li>
+
    <li><p>Otherwise, the sniffed type of the resource is
    "text/plain".</p></li>
 
@@ -23700,7 +23712,7 @@
 
     <dl class="switch">
 
-     <dt>If the row has no bytes with a trailing asterisk:</dt>
+     <dt>If the row has no "<em>WS</em>" bytes:</dt>
 
      <dd>
 
@@ -23727,7 +23739,7 @@
 
      </dd>
 
-     <dt>If the row has an asterisk after one of the bytes:</dt>
+     <dt>If the row has a "<em>WS</em>" byte:</dt>
 
      <dd>
 
@@ -23752,8 +23764,9 @@
 
         <dl class="switch">
 
-         <dt>If the <var title="">index<sub>stream</sub></var>th byte of
-         the pattern does not have an asterisk after:</dt>
+         <dt>If the <var title="">index<sub>stream</sub></var>th byte
+         of the pattern is a normal hexadecimal byte and not a "<em>WS</em>"
+         byte:</dt>
 
          <dd>
 
@@ -23774,24 +23787,25 @@
 
         <dt>Otherwise, if the <var
          title="">index<sub>stream</sub></var>th byte of the pattern
-         <em>does</em> have an asterisk after it:</dt>
+         is a "<em>WS</em>" byte:</dt>
 
          <dd>
 
-          <p>If the "and" operator, applied to the <var
-          title="">index<sub>stream</sub></var>th byte of the stream
-          and the <var title="">index<sub>pattern</sub></var>th byte
-          of the mask, yield a value different that the <var
-          title="">index<sub>pattern</sub></var>th byte of the
-          pattern, then increment only the <var
-          title="">index<sub>pattern</sub></var> to the next byte in
-          the mask and pattern and jump back to the <em>loop</em> step
-          in this algorithm.</p>
+          <p>"<em>WS</em>" means "whitespace", and allows insignificant
+          whitespace to be skipped when sniffing for a type
+          signature.</p>
 
-          <p>Otherwise, increment only the <var
+          <p>If the <var title="">index<sub>stream</sub></var>th byte
+          of the stream is one of 0x09 (ASCII TAB), 0x0A (ASCII LF),
+          0x0B (ASCII VT), 0x0C (ASCII FF), 0x0D (ASCII CR), or 0x20
+          (ASCII space), then increment only the <var
           title="">index<sub>stream</sub></var> to the next byte in
           the byte stream.</p>
 
+          <p>Otherwise, increment only the <var
+          title="">index<sub>pattern</sub></var> to the next byte in
+          the mask and pattern.</p>
+
          </dd>
 
         </dl>
@@ -23836,18 +23850,18 @@
      <td>text/html
      <td>The string "<code title=""><!DOCTYPE HTML</code>" in US-ASCII or compatible encodings, case-insensitively.
     <tr>
-     <td>FF* FF DF DF DF DF
-     <td>20* 3C 48 54 4D 4C <!-- "<HTML" --> <!-- common in static data -->
+     <td>FF FF DF DF DF DF
+     <td><em>WS</em> 3C 48 54 4D 4C <!-- "<HTML" --> <!-- common in static data -->
      <td>text/html
      <td>The string "<code title=""><HTML</code>" in US-ASCII or compatible encodings, case-insensitively, possibly with leading spaces.
     <tr>
-     <td>FF* FF DF DF DF DF
-     <td>20* 3C 48 45 41 44 <!-- "<HEAD" --> <!-- common in static data -->
+     <td>FF FF DF DF DF DF
+     <td><em>WS</em> 3C 48 45 41 44 <!-- "<HEAD" --> <!-- common in static data -->
      <td>text/html
      <td>The string "<code title=""><HEAD</code>" in US-ASCII or compatible encodings, case-insensitively, possibly with leading spaces.
     <tr>
-     <td>FF* FF DF DF DF DF DF DF
-     <td>20* 3C 53 43 52 49 50 54 <!-- "<SCRIPT" --> <!-- common in dynamic data -->
+     <td>FF FF DF DF DF DF DF DF
+     <td><em>WS</em> 3C 53 43 52 49 50 54 <!-- "<SCRIPT" --> <!-- common in dynamic data -->
      <td>text/html
      <td>The string "<code title=""><SCRIPT</code>" in US-ASCII or compatible encodings, case-insensitively, possibly with leading spaces.
     <tr>