[html5] r6992 - [e] (0) Move a section so that the character encoding requirements are closer to [...]

whatwg at whatwg.org whatwg at whatwg.org
Mon Feb 13 14:50:12 PST 2012


Author: ianh
Date: 2012-02-13 14:50:10 -0800 (Mon, 13 Feb 2012)
New Revision: 6992

Modified:
   complete.html
   index
   source
Log:
[e] (0) Move a section so that the character encoding requirements are closer together.
Affected topics: HTML Syntax and Parsing

Modified: complete.html
===================================================================
--- complete.html	2012-02-13 22:48:10 UTC (rev 6991)
+++ complete.html	2012-02-13 22:50:10 UTC (rev 6992)
@@ -1119,8 +1119,8 @@
       <ol>
        <li><a href=#determining-the-character-encoding><span class=secno>12.2.2.1 </span>Determining the character encoding</a></li>
        <li><a href=#character-encodings-0><span class=secno>12.2.2.2 </span>Character encodings</a></li>
-       <li><a href=#preprocessing-the-input-stream><span class=secno>12.2.2.3 </span>Preprocessing the input stream</a></li>
-       <li><a href=#changing-the-encoding-while-parsing><span class=secno>12.2.2.4 </span>Changing the encoding while parsing</a></ol></li>
+       <li><a href=#changing-the-encoding-while-parsing><span class=secno>12.2.2.3 </span>Changing the encoding while parsing</a></li>
+       <li><a href=#preprocessing-the-input-stream><span class=secno>12.2.2.4 </span>Preprocessing the input stream</a></ol></li>
      <li><a href=#parse-state><span class=secno>12.2.3 </span>Parse state</a>
       <ol>
        <li><a href=#the-insertion-mode><span class=secno>12.2.3.1 </span>The insertion mode</a></li>
@@ -81878,8 +81878,60 @@
 
 
 
-  <h5 id=preprocessing-the-input-stream><span class=secno>12.2.2.3 </span>Preprocessing the input stream</h5>
+  <h5 id=changing-the-encoding-while-parsing><span class=secno>12.2.2.3 </span>Changing the encoding while parsing</h5>
 
+  <p>When the parser requires the user agent to <dfn id=change-the-encoding>change the
+  encoding</dfn>, it must run the following steps. This might happen
+  if the <a href=#encoding-sniffing-algorithm>encoding sniffing algorithm</a> described above
+  failed to find an encoding, or if it found an encoding that was not
+  the actual encoding of the file.</p>
+
+  <ol><li>If the encoding that is already being used to interpret the
+   input stream is <a href=#a-utf-16-encoding>a UTF-16 encoding</a>, then set the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
+   <i>certain</i> and abort these steps. The new encoding is ignored;
+   if it was anything but the same encoding, then it would be clearly
+   incorrect.</li>
+
+   <li>If the new encoding is <a href=#a-utf-16-encoding>a UTF-16 encoding</a>, change
+   it to UTF-8.</li>
+
+   <li>If the new encoding is identical or equivalent to the encoding
+   that is already being used to interpret the input stream, then set
+   the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
+   <i>certain</i> and abort these steps. This happens when the
+   encoding information found in the file matches what the
+   <a href=#encoding-sniffing-algorithm>encoding sniffing algorithm</a> determined to be the
+   encoding, and in the second pass through the parser if the first
+   pass found that the encoding sniffing algorithm described in the
+   earlier section failed to find the right encoding.</li>
+
+   <li>If all the bytes up to the last byte converted by the current
+   decoder have the same Unicode interpretations in both the current
+   encoding and the new encoding, and if the user agent supports
+   changing the converter on the fly, then the user agent may change
+   to the new converter for the encoding on the fly. Set the
+   <a href="#document's-character-encoding">document's character encoding</a> and the encoding used to
+   convert the input stream to the new encoding, set the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
+   <i>certain</i>, and abort these steps.</li>
+
+   <li>Otherwise, <a href=#navigate>navigate</a><!--DONAV reparse--> to the
+   document again, with <a href=#replacement-enabled>replacement enabled</a>, and using
+   the same <a href=#source-browsing-context>source browsing context</a>, but this time skip
+   the <a href=#encoding-sniffing-algorithm>encoding sniffing algorithm</a> and instead just set
+   the encoding to the new encoding and the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
+   <i>certain</i>. Whenever possible, this should be done without
+   actually contacting the network layer (the bytes should be
+   re-parsed from memory), even if, e.g., the document is marked as
+   not being cacheable. If this is not possible and contacting the
+   network layer would involve repeating a request that uses a method
+   other than HTTP GET (<a href=#concept-http-equivalent-get title=concept-http-equivalent-get>or
+   equivalent</a> for non-HTTP URLs), then instead set the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
+   <i>certain</i> and ignore the new encoding. The resource will be
+   misinterpreted. User agents may notify the user of the situation,
+   to aid in application development.</li>
+
+  </ol><h5 id=preprocessing-the-input-stream><span class=secno>12.2.2.4 </span>Preprocessing the input stream</h5>
+
   <p>The <dfn id=input-stream>input stream</dfn> consists of the characters pushed
   into it as the <a href=#the-input-byte-stream>input byte stream</a> is decoded or from the
   various APIs that directly manipulate the input stream.</p>
@@ -81936,62 +81988,9 @@
   consumed. Otherwise, the "EOF" character is not a real character in
   the stream, but rather the lack of any further characters.</p>
 
+  </div>
 
-  <h5 id=changing-the-encoding-while-parsing><span class=secno>12.2.2.4 </span>Changing the encoding while parsing</h5>
 
-  <p>When the parser requires the user agent to <dfn id=change-the-encoding>change the
-  encoding</dfn>, it must run the following steps. This might happen
-  if the <a href=#encoding-sniffing-algorithm>encoding sniffing algorithm</a> described above
-  failed to find an encoding, or if it found an encoding that was not
-  the actual encoding of the file.</p>
-
-  <ol><li>If the encoding that is already being used to interpret the
-   input stream is <a href=#a-utf-16-encoding>a UTF-16 encoding</a>, then set the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
-   <i>certain</i> and abort these steps. The new encoding is ignored;
-   if it was anything but the same encoding, then it would be clearly
-   incorrect.</li>
-
-   <li>If the new encoding is <a href=#a-utf-16-encoding>a UTF-16 encoding</a>, change
-   it to UTF-8.</li>
-
-   <li>If the new encoding is identical or equivalent to the encoding
-   that is already being used to interpret the input stream, then set
-   the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
-   <i>certain</i> and abort these steps. This happens when the
-   encoding information found in the file matches what the
-   <a href=#encoding-sniffing-algorithm>encoding sniffing algorithm</a> determined to be the
-   encoding, and in the second pass through the parser if the first
-   pass found that the encoding sniffing algorithm described in the
-   earlier section failed to find the right encoding.</li>
-
-   <li>If all the bytes up to the last byte converted by the current
-   decoder have the same Unicode interpretations in both the current
-   encoding and the new encoding, and if the user agent supports
-   changing the converter on the fly, then the user agent may change
-   to the new converter for the encoding on the fly. Set the
-   <a href="#document's-character-encoding">document's character encoding</a> and the encoding used to
-   convert the input stream to the new encoding, set the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
-   <i>certain</i>, and abort these steps.</li>
-
-   <li>Otherwise, <a href=#navigate>navigate</a><!--DONAV reparse--> to the
-   document again, with <a href=#replacement-enabled>replacement enabled</a>, and using
-   the same <a href=#source-browsing-context>source browsing context</a>, but this time skip
-   the <a href=#encoding-sniffing-algorithm>encoding sniffing algorithm</a> and instead just set
-   the encoding to the new encoding and the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
-   <i>certain</i>. Whenever possible, this should be done without
-   actually contacting the network layer (the bytes should be
-   re-parsed from memory), even if, e.g., the document is marked as
-   not being cacheable. If this is not possible and contacting the
-   network layer would involve repeating a request that uses a method
-   other than HTTP GET (<a href=#concept-http-equivalent-get title=concept-http-equivalent-get>or
-   equivalent</a> for non-HTTP URLs), then instead set the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
-   <i>certain</i> and ignore the new encoding. The resource will be
-   misinterpreted. User agents may notify the user of the situation,
-   to aid in application development.</li>
-
-  </ol></div>
-
-
   <div class=impl>
 
   <h4 id=parse-state><span class=secno>12.2.3 </span>Parse state</h4>

Modified: index
===================================================================
--- index	2012-02-13 22:48:10 UTC (rev 6991)
+++ index	2012-02-13 22:50:10 UTC (rev 6992)
@@ -1119,8 +1119,8 @@
       <ol>
        <li><a href=#determining-the-character-encoding><span class=secno>12.2.2.1 </span>Determining the character encoding</a></li>
        <li><a href=#character-encodings-0><span class=secno>12.2.2.2 </span>Character encodings</a></li>
-       <li><a href=#preprocessing-the-input-stream><span class=secno>12.2.2.3 </span>Preprocessing the input stream</a></li>
-       <li><a href=#changing-the-encoding-while-parsing><span class=secno>12.2.2.4 </span>Changing the encoding while parsing</a></ol></li>
+       <li><a href=#changing-the-encoding-while-parsing><span class=secno>12.2.2.3 </span>Changing the encoding while parsing</a></li>
+       <li><a href=#preprocessing-the-input-stream><span class=secno>12.2.2.4 </span>Preprocessing the input stream</a></ol></li>
      <li><a href=#parse-state><span class=secno>12.2.3 </span>Parse state</a>
       <ol>
        <li><a href=#the-insertion-mode><span class=secno>12.2.3.1 </span>The insertion mode</a></li>
@@ -81878,8 +81878,60 @@
 
 
 
-  <h5 id=preprocessing-the-input-stream><span class=secno>12.2.2.3 </span>Preprocessing the input stream</h5>
+  <h5 id=changing-the-encoding-while-parsing><span class=secno>12.2.2.3 </span>Changing the encoding while parsing</h5>
 
+  <p>When the parser requires the user agent to <dfn id=change-the-encoding>change the
+  encoding</dfn>, it must run the following steps. This might happen
+  if the <a href=#encoding-sniffing-algorithm>encoding sniffing algorithm</a> described above
+  failed to find an encoding, or if it found an encoding that was not
+  the actual encoding of the file.</p>
+
+  <ol><li>If the encoding that is already being used to interpret the
+   input stream is <a href=#a-utf-16-encoding>a UTF-16 encoding</a>, then set the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
+   <i>certain</i> and abort these steps. The new encoding is ignored;
+   if it was anything but the same encoding, then it would be clearly
+   incorrect.</li>
+
+   <li>If the new encoding is <a href=#a-utf-16-encoding>a UTF-16 encoding</a>, change
+   it to UTF-8.</li>
+
+   <li>If the new encoding is identical or equivalent to the encoding
+   that is already being used to interpret the input stream, then set
+   the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
+   <i>certain</i> and abort these steps. This happens when the
+   encoding information found in the file matches what the
+   <a href=#encoding-sniffing-algorithm>encoding sniffing algorithm</a> determined to be the
+   encoding, and in the second pass through the parser if the first
+   pass found that the encoding sniffing algorithm described in the
+   earlier section failed to find the right encoding.</li>
+
+   <li>If all the bytes up to the last byte converted by the current
+   decoder have the same Unicode interpretations in both the current
+   encoding and the new encoding, and if the user agent supports
+   changing the converter on the fly, then the user agent may change
+   to the new converter for the encoding on the fly. Set the
+   <a href="#document's-character-encoding">document's character encoding</a> and the encoding used to
+   convert the input stream to the new encoding, set the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
+   <i>certain</i>, and abort these steps.</li>
+
+   <li>Otherwise, <a href=#navigate>navigate</a><!--DONAV reparse--> to the
+   document again, with <a href=#replacement-enabled>replacement enabled</a>, and using
+   the same <a href=#source-browsing-context>source browsing context</a>, but this time skip
+   the <a href=#encoding-sniffing-algorithm>encoding sniffing algorithm</a> and instead just set
+   the encoding to the new encoding and the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
+   <i>certain</i>. Whenever possible, this should be done without
+   actually contacting the network layer (the bytes should be
+   re-parsed from memory), even if, e.g., the document is marked as
+   not being cacheable. If this is not possible and contacting the
+   network layer would involve repeating a request that uses a method
+   other than HTTP GET (<a href=#concept-http-equivalent-get title=concept-http-equivalent-get>or
+   equivalent</a> for non-HTTP URLs), then instead set the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
+   <i>certain</i> and ignore the new encoding. The resource will be
+   misinterpreted. User agents may notify the user of the situation,
+   to aid in application development.</li>
+
+  </ol><h5 id=preprocessing-the-input-stream><span class=secno>12.2.2.4 </span>Preprocessing the input stream</h5>
+
   <p>The <dfn id=input-stream>input stream</dfn> consists of the characters pushed
   into it as the <a href=#the-input-byte-stream>input byte stream</a> is decoded or from the
   various APIs that directly manipulate the input stream.</p>
@@ -81936,62 +81988,9 @@
   consumed. Otherwise, the "EOF" character is not a real character in
   the stream, but rather the lack of any further characters.</p>
 
+  </div>
 
-  <h5 id=changing-the-encoding-while-parsing><span class=secno>12.2.2.4 </span>Changing the encoding while parsing</h5>
 
-  <p>When the parser requires the user agent to <dfn id=change-the-encoding>change the
-  encoding</dfn>, it must run the following steps. This might happen
-  if the <a href=#encoding-sniffing-algorithm>encoding sniffing algorithm</a> described above
-  failed to find an encoding, or if it found an encoding that was not
-  the actual encoding of the file.</p>
-
-  <ol><li>If the encoding that is already being used to interpret the
-   input stream is <a href=#a-utf-16-encoding>a UTF-16 encoding</a>, then set the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
-   <i>certain</i> and abort these steps. The new encoding is ignored;
-   if it was anything but the same encoding, then it would be clearly
-   incorrect.</li>
-
-   <li>If the new encoding is <a href=#a-utf-16-encoding>a UTF-16 encoding</a>, change
-   it to UTF-8.</li>
-
-   <li>If the new encoding is identical or equivalent to the encoding
-   that is already being used to interpret the input stream, then set
-   the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
-   <i>certain</i> and abort these steps. This happens when the
-   encoding information found in the file matches what the
-   <a href=#encoding-sniffing-algorithm>encoding sniffing algorithm</a> determined to be the
-   encoding, and in the second pass through the parser if the first
-   pass found that the encoding sniffing algorithm described in the
-   earlier section failed to find the right encoding.</li>
-
-   <li>If all the bytes up to the last byte converted by the current
-   decoder have the same Unicode interpretations in both the current
-   encoding and the new encoding, and if the user agent supports
-   changing the converter on the fly, then the user agent may change
-   to the new converter for the encoding on the fly. Set the
-   <a href="#document's-character-encoding">document's character encoding</a> and the encoding used to
-   convert the input stream to the new encoding, set the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
-   <i>certain</i>, and abort these steps.</li>
-
-   <li>Otherwise, <a href=#navigate>navigate</a><!--DONAV reparse--> to the
-   document again, with <a href=#replacement-enabled>replacement enabled</a>, and using
-   the same <a href=#source-browsing-context>source browsing context</a>, but this time skip
-   the <a href=#encoding-sniffing-algorithm>encoding sniffing algorithm</a> and instead just set
-   the encoding to the new encoding and the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
-   <i>certain</i>. Whenever possible, this should be done without
-   actually contacting the network layer (the bytes should be
-   re-parsed from memory), even if, e.g., the document is marked as
-   not being cacheable. If this is not possible and contacting the
-   network layer would involve repeating a request that uses a method
-   other than HTTP GET (<a href=#concept-http-equivalent-get title=concept-http-equivalent-get>or
-   equivalent</a> for non-HTTP URLs), then instead set the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a> to
-   <i>certain</i> and ignore the new encoding. The resource will be
-   misinterpreted. User agents may notify the user of the situation,
-   to aid in application development.</li>
-
-  </ol></div>
-
-
   <div class=impl>
 
   <h4 id=parse-state><span class=secno>12.2.3 </span>Parse state</h4>

Modified: source
===================================================================
--- source	2012-02-13 22:48:10 UTC (rev 6991)
+++ source	2012-02-13 22:50:10 UTC (rev 6992)
@@ -94997,6 +94997,67 @@
 
 
 
+  <h5>Changing the encoding while parsing</h5>
+
+  <p>When the parser requires the user agent to <dfn>change the
+  encoding</dfn>, it must run the following steps. This might happen
+  if the <span>encoding sniffing algorithm</span> described above
+  failed to find an encoding, or if it found an encoding that was not
+  the actual encoding of the file.</p>
+
+  <ol>
+
+   <li>If the encoding that is already being used to interpret the
+   input stream is <span>a UTF-16 encoding</span>, then set the <span
+   title="concept-encoding-confidence">confidence</span> to
+   <i>certain</i> and abort these steps. The new encoding is ignored;
+   if it was anything but the same encoding, then it would be clearly
+   incorrect.</li>
+
+   <li>If the new encoding is <span>a UTF-16 encoding</span>, change
+   it to UTF-8.</li>
+
+   <li>If the new encoding is identical or equivalent to the encoding
+   that is already being used to interpret the input stream, then set
+   the <span title="concept-encoding-confidence">confidence</span> to
+   <i>certain</i> and abort these steps. This happens when the
+   encoding information found in the file matches what the
+   <span>encoding sniffing algorithm</span> determined to be the
+   encoding, and in the second pass through the parser if the first
+   pass found that the encoding sniffing algorithm described in the
+   earlier section failed to find the right encoding.</li>
+
+   <li>If all the bytes up to the last byte converted by the current
+   decoder have the same Unicode interpretations in both the current
+   encoding and the new encoding, and if the user agent supports
+   changing the converter on the fly, then the user agent may change
+   to the new converter for the encoding on the fly. Set the
+   <span>document's character encoding</span> and the encoding used to
+   convert the input stream to the new encoding, set the <span
+   title="concept-encoding-confidence">confidence</span> to
+   <i>certain</i>, and abort these steps.</li>
+
+   <li>Otherwise, <span>navigate</span><!--DONAV reparse--> to the
+   document again, with <span>replacement enabled</span>, and using
+   the same <span>source browsing context</span>, but this time skip
+   the <span>encoding sniffing algorithm</span> and instead just set
+   the encoding to the new encoding and the <span
+   title="concept-encoding-confidence">confidence</span> to
+   <i>certain</i>. Whenever possible, this should be done without
+   actually contacting the network layer (the bytes should be
+   re-parsed from memory), even if, e.g., the document is marked as
+   not being cacheable. If this is not possible and contacting the
+   network layer would involve repeating a request that uses a method
+   other than HTTP GET (<span title="concept-http-equivalent-get">or
+   equivalent</span> for non-HTTP URLs), then instead set the <span
+   title="concept-encoding-confidence">confidence</span> to
+   <i>certain</i> and ignore the new encoding. The resource will be
+   misinterpreted. User agents may notify the user of the situation,
+   to aid in application development.</li>
+
+  </ol>
+
+
   <h5>Preprocessing the input stream</h5>
 
   <p>The <dfn>input stream</dfn> consists of the characters pushed
@@ -95057,67 +95118,6 @@
   consumed. Otherwise, the "EOF" character is not a real character in
   the stream, but rather the lack of any further characters.</p>
 
-
-  <h5>Changing the encoding while parsing</h5>
-
-  <p>When the parser requires the user agent to <dfn>change the
-  encoding</dfn>, it must run the following steps. This might happen
-  if the <span>encoding sniffing algorithm</span> described above
-  failed to find an encoding, or if it found an encoding that was not
-  the actual encoding of the file.</p>
-
-  <ol>
-
-   <li>If the encoding that is already being used to interpret the
-   input stream is <span>a UTF-16 encoding</span>, then set the <span
-   title="concept-encoding-confidence">confidence</span> to
-   <i>certain</i> and abort these steps. The new encoding is ignored;
-   if it was anything but the same encoding, then it would be clearly
-   incorrect.</li>
-
-   <li>If the new encoding is <span>a UTF-16 encoding</span>, change
-   it to UTF-8.</li>
-
-   <li>If the new encoding is identical or equivalent to the encoding
-   that is already being used to interpret the input stream, then set
-   the <span title="concept-encoding-confidence">confidence</span> to
-   <i>certain</i> and abort these steps. This happens when the
-   encoding information found in the file matches what the
-   <span>encoding sniffing algorithm</span> determined to be the
-   encoding, and in the second pass through the parser if the first
-   pass found that the encoding sniffing algorithm described in the
-   earlier section failed to find the right encoding.</li>
-
-   <li>If all the bytes up to the last byte converted by the current
-   decoder have the same Unicode interpretations in both the current
-   encoding and the new encoding, and if the user agent supports
-   changing the converter on the fly, then the user agent may change
-   to the new converter for the encoding on the fly. Set the
-   <span>document's character encoding</span> and the encoding used to
-   convert the input stream to the new encoding, set the <span
-   title="concept-encoding-confidence">confidence</span> to
-   <i>certain</i>, and abort these steps.</li>
-
-   <li>Otherwise, <span>navigate</span><!--DONAV reparse--> to the
-   document again, with <span>replacement enabled</span>, and using
-   the same <span>source browsing context</span>, but this time skip
-   the <span>encoding sniffing algorithm</span> and instead just set
-   the encoding to the new encoding and the <span
-   title="concept-encoding-confidence">confidence</span> to
-   <i>certain</i>. Whenever possible, this should be done without
-   actually contacting the network layer (the bytes should be
-   re-parsed from memory), even if, e.g., the document is marked as
-   not being cacheable. If this is not possible and contacting the
-   network layer would involve repeating a request that uses a method
-   other than HTTP GET (<span title="concept-http-equivalent-get">or
-   equivalent</span> for non-HTTP URLs), then instead set the <span
-   title="concept-encoding-confidence">confidence</span> to
-   <i>certain</i> and ignore the new encoding. The resource will be
-   misinterpreted. User agents may notify the user of the situation,
-   to aid in application development.</li>
-
-  </ol>
-
   </div>
 
 
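Illustrative note (not part of the commit): the five "change the encoding" steps moved by this revision can be sketched roughly as below. This is only a minimal sketch under stated assumptions, not the specification's or any user agent's implementation. The names ParserState, canonical, same_interpretation and change_the_encoding are hypothetical, Python codec names stand in for the spec's encoding labels, and "identical or equivalent" is approximated by comparing the codec names returned by codecs.lookup.

```python
import codecs
from dataclasses import dataclass

# Assumption: Python's canonical codec names stand in for "a UTF-16 encoding".
UTF16_ENCODINGS = {"utf-16", "utf-16-le", "utf-16-be"}


def canonical(label: str) -> str:
    """Map an encoding label to Python's canonical codec name."""
    return codecs.lookup(label).name


@dataclass
class ParserState:
    """Hypothetical holder for the state the five steps read and write."""
    encoding: str                         # encoding currently used for the input stream
    confidence: str = "tentative"         # "tentative" or "certain"
    consumed: bytes = b""                 # bytes already converted by the current decoder
    supports_on_the_fly: bool = True      # can the converter be swapped mid-stream?
    can_reparse_from_memory: bool = True  # can the bytes be re-parsed without the network?
    request_method: str = "GET"           # method of the request that fetched the document


def same_interpretation(data: bytes, old: str, new: str) -> bool:
    """True if the bytes decode to the same characters under both encodings."""
    try:
        return data.decode(old) == data.decode(new)
    except UnicodeDecodeError:
        return False


def change_the_encoding(state: ParserState, new_label: str) -> str:
    """Run the five steps; return a short description of the outcome."""
    current = canonical(state.encoding)
    new = canonical(new_label)

    # Step 1: a UTF-16 stream keeps its encoding; the new encoding is ignored.
    if current in UTF16_ENCODINGS:
        state.confidence = "certain"
        return "kept"

    # Step 2: a declared UTF-16 encoding is treated as UTF-8.
    if new in UTF16_ENCODINGS:
        new = "utf-8"

    # Step 3: nothing to do if the encodings already match.
    if new == current:
        state.confidence = "certain"
        return "kept"

    # Step 4 (optional for user agents): switch converters on the fly when the
    # bytes decoded so far mean the same thing in both encodings.
    if state.supports_on_the_fly and same_interpretation(state.consumed, current, new):
        state.encoding = new
        state.confidence = "certain"
        return "switched"

    # Step 5: otherwise reparse with the new encoding, unless that would mean
    # repeating a non-GET request over the network.
    state.confidence = "certain"
    if state.can_reparse_from_memory or state.request_method == "GET":
        state.encoding = new
        return "reparse"
    return "ignored"
```

For instance, under these assumptions a windows-1252 stream that has only consumed ASCII bytes and then encounters a UTF-8 declaration would take the step 4 path, while the same stream after consuming the byte 0xE9 would fall through to step 5.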



