[html5] r5860 - [c] (0) Change the limit for where charsets should be given to the first 1024 by [...]

Tue Feb 8 16:02:07 PST 2011

Author: ianh
Date: 2011-02-08 16:02:05 -0800 (Tue, 08 Feb 2011)
New Revision: 5860

Modified:
   complete.html
   index
   source
Log:
[c] (0) Change the limit for where charsets should be given to the first 1024 bytes.
Fixing http://www.w3.org/Bugs/Public/show_bug.cgi?id=11426
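
Illustration (not part of the commit): a minimal sketch of how an authoring
tool might check the new 1024-byte requirement. The names CHARSET_LIMIT,
META_CHARSET and charset_declared_in_limit are hypothetical, and the regular
expression is a rough approximation of a charset <meta> element, not the
preparse algorithm the specification defines.

```python
import re

# Hypothetical constant name; the value 1024 is the limit set by this commit.
CHARSET_LIMIT = 1024

# Simplified pattern: a <meta ...> tag whose attributes mention "charset".
# This is only an approximation and NOT the spec's preparse algorithm.
META_CHARSET = re.compile(rb"<meta\b[^>]*\bcharset\b[^>]*>", re.IGNORECASE)


def charset_declared_in_limit(document: bytes, limit: int = CHARSET_LIMIT) -> bool:
    """Return True if the first charset <meta> element ends within `limit` bytes."""
    match = META_CHARSET.search(document)
    if match is None:
        return False
    # match.end() is the offset just past the closing ">", so the element is
    # serialized completely within the first `limit` bytes iff end <= limit.
    return match.end() <= limit
```

Passing limit=512 reproduces the check for the old requirement, which shows
which documents are conforming only under the new limit.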

Modified: complete.html
===================================================================

--- complete.html	2011-02-08 22:54:39 UTC (rev 5859)
+++ complete.html	2011-02-09 00:02:05 UTC (rev 5860)
@@ -14253,9 +14253,10 @@
    the use of <a href=#syntax-charref title=syntax-charref>character references</a>
    or character escapes of any kind.</li>
 
-   <li id=charset512>The element containing the character encoding
-   declaration must be serialized completely within the first 512
-   bytes of the document.</li>
+   <li id=charset1024><span id=charset512 title="">The element
+   containing the character encoding declaration must be serialized
+   completely within the first 1024 bytes of the document.</span></li>
+   <!-- span is for historical reasons, to keep an old ID alive -->
 
    <li>There can only be one character encoding declaration in the
    document.</li> <!-- conformance criteria for this one are given in
@@ -76963,17 +76964,27 @@
    supported, return that encoding with the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a>
    <i>certain</i>, and abort these steps.</li>
 
-   <li><p>The user agent may wait for more bytes of the resource to be
-   available, either in this step or at any later step in this
-   algorithm. For instance, a user agent might wait 500ms or 512
-   bytes, whichever came first. In general preparsing the source to
-   find the encoding improves performance, as it reduces the need to
-   throw away the data structures used when parsing upon finding the
-   encoding information. However, if the user agent delays too long to
-   obtain data to determine the encoding, then the cost of the delay
-   could outweigh any performance improvements from the
-   preparse.</li>
+   <li>
 
+    <p>The user agent may wait for more bytes of the resource to be
+    available, either in this step or at any later step in this
+    algorithm. For instance, a user agent might wait 500ms or 1024
+    bytes, whichever came first. In general preparsing the source to
+    find the encoding improves performance, as it reduces the need to
+    throw away the data structures used when parsing upon finding the
+    encoding information. However, if the user agent delays too long
+    to obtain data to determine the encoding, then the cost of the
+    delay could outweigh any performance improvements from the
+    preparse.</p>
+
+    <p class=note>The authoring conformance requirements for
+    character encoding declarations limit them to only appearing <a href=#charset1024>in the first 1024 bytes</a>. User agents are
+    therefore encouraged to use the preparse algorithm below (part of
+    these steps) on the first 1024 bytes, but not to stall beyond
+    that.</p>
+
+   </li>
+
    <li><p>For each of the rows in the following table, starting with
    the first one and going down, if there are as many or more bytes
    available than the number of bytes in the first column, and the

Modified: index
===================================================================
--- index	2011-02-08 22:54:39 UTC (rev 5859)
+++ index	2011-02-09 00:02:05 UTC (rev 5860)
@@ -14233,9 +14233,10 @@
    the use of <a href=#syntax-charref title=syntax-charref>character references</a>
    or character escapes of any kind.</li>
 
-   <li id=charset512>The element containing the character encoding
-   declaration must be serialized completely within the first 512
-   bytes of the document.</li>
+   <li id=charset1024><span id=charset512 title="">The element
+   containing the character encoding declaration must be serialized
+   completely within the first 1024 bytes of the document.</span></li>
+   <!-- span is for historical reasons, to keep an old ID alive -->
 
    <li>There can only be one character encoding declaration in the
    document.</li> <!-- conformance criteria for this one are given in
@@ -72934,17 +72935,27 @@
    supported, return that encoding with the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a>
    <i>certain</i>, and abort these steps.</li>
 
-   <li><p>The user agent may wait for more bytes of the resource to be
-   available, either in this step or at any later step in this
-   algorithm. For instance, a user agent might wait 500ms or 512
-   bytes, whichever came first. In general preparsing the source to
-   find the encoding improves performance, as it reduces the need to
-   throw away the data structures used when parsing upon finding the
-   encoding information. However, if the user agent delays too long to
-   obtain data to determine the encoding, then the cost of the delay
-   could outweigh any performance improvements from the
-   preparse.</li>
+   <li>
 
+    <p>The user agent may wait for more bytes of the resource to be
+    available, either in this step or at any later step in this
+    algorithm. For instance, a user agent might wait 500ms or 1024
+    bytes, whichever came first. In general preparsing the source to
+    find the encoding improves performance, as it reduces the need to
+    throw away the data structures used when parsing upon finding the
+    encoding information. However, if the user agent delays too long
+    to obtain data to determine the encoding, then the cost of the
+    delay could outweigh any performance improvements from the
+    preparse.</p>
+
+    <p class=note>The authoring conformance requirements for
+    character encoding declarations limit them to only appearing <a href=#charset1024>in the first 1024 bytes</a>. User agents are
+    therefore encouraged to use the preparse algorithm below (part of
+    these steps) on the first 1024 bytes, but not to stall beyond
+    that.</p>
+
+   </li>
+
    <li><p>For each of the rows in the following table, starting with
    the first one and going down, if there are as many or more bytes
    available than the number of bytes in the first column, and the

Modified: source
===================================================================
--- source	2011-02-08 22:54:39 UTC (rev 5859)
+++ source	2011-02-09 00:02:05 UTC (rev 5860)
@@ -15061,9 +15061,10 @@
    the use of <span title="syntax-charref">character references</span>
    or character escapes of any kind.</li>
 
-   <li id="charset512">The element containing the character encoding
-   declaration must be serialized completely within the first 512
-   bytes of the document.</li>
+   <li id="charset1024"><span title="" id="charset512">The element
+   containing the character encoding declaration must be serialized
+   completely within the first 1024 bytes of the document.</span></li>
+   <!-- span is for historical reasons, to keep an old ID alive -->
 
    <li>There can only be one character encoding declaration in the
    document.</li> <!-- conformance criteria for this one are given in
@@ -87094,17 +87095,28 @@
    title="concept-encoding-confidence">confidence</span>
    <i>certain</i>, and abort these steps.</p></li>
 
-   <li><p>The user agent may wait for more bytes of the resource to be
-   available, either in this step or at any later step in this
-   algorithm. For instance, a user agent might wait 500ms or 512
-   bytes, whichever came first. In general preparsing the source to
-   find the encoding improves performance, as it reduces the need to
-   throw away the data structures used when parsing upon finding the
-   encoding information. However, if the user agent delays too long to
-   obtain data to determine the encoding, then the cost of the delay
-   could outweigh any performance improvements from the
-   preparse.</p></li>
+   <li>
 
+    <p>The user agent may wait for more bytes of the resource to be
+    available, either in this step or at any later step in this
+    algorithm. For instance, a user agent might wait 500ms or 1024
+    bytes, whichever came first. In general preparsing the source to
+    find the encoding improves performance, as it reduces the need to
+    throw away the data structures used when parsing upon finding the
+    encoding information. However, if the user agent delays too long
+    to obtain data to determine the encoding, then the cost of the
+    delay could outweigh any performance improvements from the
+    preparse.</p>
+
+    <p class="note">The authoring conformance requirements for
+    character encoding declarations limit them to only appearing <a
+    href="#charset1024">in the first 1024 bytes</a>. User agents are
+    therefore encouraged to use the preparse algorithm below (part of
+    these steps) on the first 1024 bytes, but not to stall beyond
+    that.</p>
+
+   </li>
+
    <li><p>For each of the rows in the following table, starting with
    the first one and going down, if there are as many or more bytes
    available than the number of bytes in the first column, and the