[html5] r5042 - [e] (0) Move the Content-Type encoding parsing hack of an algorithm back into HT [...]

whatwg at whatwg.org
Tue Apr 13 20:06:55 PDT 2010


Author: ianh
Date: 2010-04-13 20:06:54 -0700 (Tue, 13 Apr 2010)
New Revision: 5042

Modified:
   complete.html
   index
   source
Log:
[e] (0) Move the Content-Type encoding parsing hack of an algorithm back into HTML5 from MIMESNIFF.

Modified: complete.html
===================================================================
--- complete.html	2010-04-13 22:57:01 UTC (rev 5041)
+++ complete.html	2010-04-14 03:06:54 UTC (rev 5042)
@@ -186,7 +186,7 @@
 
   <header class=head id=head><p><a class=logo href=http://www.whatwg.org/ rel=home><img alt=WHATWG src=/images/logo></a></p>
    <hgroup><h1>Web Applications 1.0</h1>
-    <h2 class="no-num no-toc">Draft Standard — 13 April 2010</h2>
+    <h2 class="no-num no-toc">Draft Standard — 14 April 2010</h2>
    </hgroup><p>You can take part in this work. <a href=http://www.whatwg.org/mailing-list>Join the working group's discussion list.</a></p>
    <p><strong>Web designers!</strong> We have a <a href=http://blog.whatwg.org/faq/>FAQ</a>, a <a href=http://forums.whatwg.org/>forum</a>, and a <a href=http://www.whatwg.org/mailing-list#help>help mailing list</a> for you!</p>
    <!--<p class="impl"><strong>Implementors!</strong> We have a <a href="http://www.whatwg.org/mailing-list#implementors">mailing list</a> for you too!</p>-->
@@ -6368,12 +6368,6 @@
   with the requirements of the Content-Type Processing Model
   specification. <a href=#refsMIMESNIFF>[MIMESNIFF]</a></p>
 
-  <p>The <dfn id=algorithm-for-extracting-an-encoding-from-a-content-type>algorithm for extracting an encoding from a
-  Content-Type</dfn>, given a string <var title="">s</var>, is given
-  in the Content-Type Processing Model specification. It either
-  returns an encoding or nothing. <a href=#refsMIMESNIFF>[MIMESNIFF]</a></p>
-  <p class=XXX>The above is out of date now that the relevant section has been removed from MIMESNIFF. Stay tuned; I'll bring it back here soon.</p>
-
   <p>The <dfn id=content-type-sniffing-0 title="Content-Type sniffing">sniffed type of a
   resource</dfn> must be found in a manner consistent with the
   requirements given in the Content-Type Processing Model
@@ -6394,6 +6388,50 @@
   occur. For more details, see the Content-Type Processing Model
   specification. <a href=#refsMIMESNIFF>[MIMESNIFF]</a></p>
 
+  <p>The <dfn id=algorithm-for-extracting-an-encoding-from-a-content-type>algorithm for extracting an encoding from a
+  Content-Type</dfn>, given a string <var title="">s</var>, is as
+  follows. It either returns an encoding or nothing.</p>
+
+  <ol><li><p>Find the first seven characters in <var title="">s</var>
+   that are an <a href=#ascii-case-insensitive>ASCII case-insensitive</a> match for the word
+   "<code title="">charset</code>".  If no such match is found, return
+   nothing.</li>
+
+   <li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
+   characters that immediately follow the word "<code title="">charset</code>" (there might not be any).</li>
+
+   <li><p>If the next character is not a U+003D EQUALS SIGN ('='),
+   return nothing and abort these steps.</li>
+
+   <li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
+   characters that immediately follow the equals sign (there might not
+   be any).</li>
+
+   <li>
+
+    <p>Process the next character as follows:</p>
+
+    <dl class=switch><dt>If it is a U+0022 QUOTATION MARK ('"') and there is a later U+0022 QUOTATION MARK ('"') in <var title="">s</var></dt>
+     <dt>If it is a U+0027 APOSTROPHE ("'") and there is a later U+0027 APOSTROPHE ("'") in <var title="">s</var></dt>
+     <dd>Return the encoding corresponding to the string between this character and the next earliest occurrence of this character.</dd>
+
+     <dt>If it is an unmatched U+0022 QUOTATION MARK ('"')</dt>
+     <dt>If it is an unmatched U+0027 APOSTROPHE ("'")</dt>
+     <dt>If there is no next character</dt>
+     <dd>Return nothing.</dd>
+
+     <dt>Otherwise</dt>
+     <dd>Return the encoding corresponding to the string from this
+     character to the first U+0009, U+000A, U+000C, U+000D, U+0020, or
+     U+003B character or the end of <var title="">s</var>, whichever
+     comes first.</dd>
+
+    </dl></li>
+
+  </ol><p class=note>This requirement is a <a href=#willful-violation>willful violation</a>
+  of the HTTP specification, motivated by the need for backwards
+  compatibility with legacy content. <a href=#refsHTTP>[HTTP]</a></p>
+
   </div>
 
 

Modified: index
===================================================================
--- index	2010-04-13 22:57:01 UTC (rev 5041)
+++ index	2010-04-14 03:06:54 UTC (rev 5042)
@@ -190,7 +190,7 @@
 
   <header class=head id=head><p><a class=logo href=http://www.whatwg.org/ rel=home><img alt=WHATWG src=/images/logo></a></p>
    <hgroup><h1>HTML5 (including next generation additions still in development)</h1>
-    <h2 class="no-num no-toc">Draft Standard — 13 April 2010</h2>
+    <h2 class="no-num no-toc">Draft Standard — 14 April 2010</h2>
    </hgroup><p>You can take part in this work. <a href=http://www.whatwg.org/mailing-list>Join the working group's discussion list.</a></p>
    <p><strong>Web designers!</strong> We have a <a href=http://blog.whatwg.org/faq/>FAQ</a>, a <a href=http://forums.whatwg.org/>forum</a>, and a <a href=http://www.whatwg.org/mailing-list#help>help mailing list</a> for you!</p>
    <!--<p class="impl"><strong>Implementors!</strong> We have a <a href="http://www.whatwg.org/mailing-list#implementors">mailing list</a> for you too!</p>-->
@@ -6266,12 +6266,6 @@
   with the requirements of the Content-Type Processing Model
   specification. <a href=#refsMIMESNIFF>[MIMESNIFF]</a></p>
 
-  <p>The <dfn id=algorithm-for-extracting-an-encoding-from-a-content-type>algorithm for extracting an encoding from a
-  Content-Type</dfn>, given a string <var title="">s</var>, is given
-  in the Content-Type Processing Model specification. It either
-  returns an encoding or nothing. <a href=#refsMIMESNIFF>[MIMESNIFF]</a></p>
-  <p class=XXX>The above is out of date now that the relevant section has been removed from MIMESNIFF. Stay tuned; I'll bring it back here soon.</p>
-
   <p>The <dfn id=content-type-sniffing-0 title="Content-Type sniffing">sniffed type of a
   resource</dfn> must be found in a manner consistent with the
   requirements given in the Content-Type Processing Model
@@ -6292,6 +6286,50 @@
   occur. For more details, see the Content-Type Processing Model
   specification. <a href=#refsMIMESNIFF>[MIMESNIFF]</a></p>
 
+  <p>The <dfn id=algorithm-for-extracting-an-encoding-from-a-content-type>algorithm for extracting an encoding from a
+  Content-Type</dfn>, given a string <var title="">s</var>, is as
+  follows. It either returns an encoding or nothing.</p>
+
+  <ol><li><p>Find the first seven characters in <var title="">s</var>
+   that are an <a href=#ascii-case-insensitive>ASCII case-insensitive</a> match for the word
+   "<code title="">charset</code>".  If no such match is found, return
+   nothing.</li>
+
+   <li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
+   characters that immediately follow the word "<code title="">charset</code>" (there might not be any).</li>
+
+   <li><p>If the next character is not a U+003D EQUALS SIGN ('='),
+   return nothing and abort these steps.</li>
+
+   <li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
+   characters that immediately follow the equals sign (there might not
+   be any).</li>
+
+   <li>
+
+    <p>Process the next character as follows:</p>
+
+    <dl class=switch><dt>If it is a U+0022 QUOTATION MARK ('"') and there is a later U+0022 QUOTATION MARK ('"') in <var title="">s</var></dt>
+     <dt>If it is a U+0027 APOSTROPHE ("'") and there is a later U+0027 APOSTROPHE ("'") in <var title="">s</var></dt>
+     <dd>Return the encoding corresponding to the string between this character and the next earliest occurrence of this character.</dd>
+
+     <dt>If it is an unmatched U+0022 QUOTATION MARK ('"')</dt>
+     <dt>If it is an unmatched U+0027 APOSTROPHE ("'")</dt>
+     <dt>If there is no next character</dt>
+     <dd>Return nothing.</dd>
+
+     <dt>Otherwise</dt>
+     <dd>Return the encoding corresponding to the string from this
+     character to the first U+0009, U+000A, U+000C, U+000D, U+0020, or
+     U+003B character or the end of <var title="">s</var>, whichever
+     comes first.</dd>
+
+    </dl></li>
+
+  </ol><p class=note>This requirement is a <a href=#willful-violation>willful violation</a>
+  of the HTTP specification, motivated by the need for backwards
+  compatibility with legacy content. <a href=#refsHTTP>[HTTP]</a></p>
+
   </div>
 
 

Modified: source
===================================================================
--- source	2010-04-13 22:57:01 UTC (rev 5041)
+++ source	2010-04-14 03:06:54 UTC (rev 5042)
@@ -5954,13 +5954,6 @@
   with the requirements of the Content-Type Processing Model
   specification. <a href="#refsMIMESNIFF">[MIMESNIFF]</a></p>
 
-  <p>The <dfn>algorithm for extracting an encoding from a
-  Content-Type</dfn>, given a string <var title="">s</var>, is given
-  in the Content-Type Processing Model specification. It either
-  returns an encoding or nothing. <a
-  href="#refsMIMESNIFF">[MIMESNIFF]</a></p>
-  <p class="XXX">The above is out of date now that the relevant section has been removed from MIMESNIFF. Stay tuned; I'll bring it back here soon.</p>
-
   <p>The <dfn title="Content-Type sniffing">sniffed type of a
   resource</dfn> must be found in a manner consistent with the
   requirements given in the Content-Type Processing Model
@@ -5981,6 +5974,60 @@
   occur. For more details, see the Content-Type Processing Model
   specification. <a href="#refsMIMESNIFF">[MIMESNIFF]</a></p>
 
+  <p>The <dfn>algorithm for extracting an encoding from a
+  Content-Type</dfn>, given a string <var title="">s</var>, is as
+  follows. It either returns an encoding or nothing.</p>
+
+  <ol>
+
+   <li><p>Find the first seven characters in <var title="">s</var>
+   that are an <span>ASCII case-insensitive</span> match for the word
+   "<code title="">charset</code>".  If no such match is found, return
+   nothing.</p></li>
+
+   <li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
+   characters that immediately follow the word "<code
+   title="">charset</code>" (there might not be any).</p></li>
+
+   <li><p>If the next character is not a U+003D EQUALS SIGN ('='),
+   return nothing and abort these steps.</p></li>
+
+   <li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
+   characters that immediately follow the equals sign (there might not
+   be any).</p></li>
+
+   <li>
+
+    <p>Process the next character as follows:</p>
+
+    <dl class="switch">
+
+     <dt>If it is a U+0022 QUOTATION MARK ('"') and there is a later U+0022 QUOTATION MARK ('"') in <var title="">s</var></dt>
+     <dt>If it is a U+0027 APOSTROPHE ("'") and there is a later U+0027 APOSTROPHE ("'") in <var title="">s</var></dt>
+     <dd>Return the encoding corresponding to the string between this character and the next earliest occurrence of this character.</dd>
+
+     <dt>If it is an unmatched U+0022 QUOTATION MARK ('"')</dt>
+     <dt>If it is an unmatched U+0027 APOSTROPHE ("'")</dt>
+     <dt>If there is no next character</dt>
+     <dd>Return nothing.</dd>
+
+     <dt>Otherwise</dt>
+     <dd>Return the encoding corresponding to the string from this
+     character to the first U+0009, U+000A, U+000C, U+000D, U+0020, or
+     U+003B character or the end of <var title="">s</var>, whichever
+     comes first.</dd>
+
+    </dl>
+
+   </li>
+
+  </ol>
+
+  <p class="note">This requirement is a <span>willful violation</span>
+  of the HTTP specification, motivated by the need for backwards
+  compatibility with legacy content. <a
+  href="#refsHTTP">[HTTP]</a></p>
+
   </div>
 
 

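For readers who want to try the restored algorithm, the steps added in this revision can be sketched in Python roughly as follows. This is an illustrative reading of the spec text above, not code from the commit. The function name extract_encoding_label and the helper names are hypothetical. The sketch returns the raw label string (or None for "nothing") and assumes that mapping a label to an actual encoding happens elsewhere.

```python
import re
from typing import Optional

# U+0009, U+000A, U+000C, U+000D, U+0020: the characters the steps skip.
_WS = "\t\n\x0c\r "

# re.ASCII keeps the case-insensitive match limited to ASCII letters.
_CHARSET = re.compile("charset", re.IGNORECASE | re.ASCII)


def extract_encoding_label(s: str) -> Optional[str]:
    """Hypothetical helper: return the charset label found in s, or None."""
    # Step 1: find the first ASCII case-insensitive match for "charset".
    m = _CHARSET.search(s)
    if m is None:
        return None
    i = m.end()

    # Step 2: skip whitespace after the word "charset".
    while i < len(s) and s[i] in _WS:
        i += 1

    # Step 3: the next character must be an equals sign.
    if i >= len(s) or s[i] != "=":
        return None
    i += 1

    # Step 4: skip whitespace after the equals sign.
    while i < len(s) and s[i] in _WS:
        i += 1

    # Step 5: process the next character.
    if i >= len(s):
        return None  # there is no next character
    ch = s[i]
    if ch in "\"'":
        # Quoted value: needs a later matching quote, otherwise nothing.
        end = s.find(ch, i + 1)
        return s[i + 1:end] if end != -1 else None

    # Unquoted value: runs to the first whitespace, ";", or end of s.
    j = i
    while j < len(s) and s[j] not in _WS and s[j] != ";":
        j += 1
    return s[i:j]
```

Note that, as in the spec text, only the first occurrence of "charset" is examined: if it is not followed by an equals sign, the sketch returns None rather than searching further.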


