[html5] r5042 - [e] (0) Move the Content-Type encoding parsing hack of an algorithm back into HT [...]
whatwg at whatwg.org
Tue Apr 13 20:06:55 PDT 2010
Author: ianh
Date: 2010-04-13 20:06:54 -0700 (Tue, 13 Apr 2010)
New Revision: 5042
Modified:
complete.html
index
source
Log:
[e] (0) Move the Content-Type encoding parsing hack of an algorithm back into HTML5 from MIMESNIFF.
Modified: complete.html
===================================================================
--- complete.html 2010-04-13 22:57:01 UTC (rev 5041)
+++ complete.html 2010-04-14 03:06:54 UTC (rev 5042)
@@ -186,7 +186,7 @@
<header class=head id=head><p><a class=logo href=http://www.whatwg.org/ rel=home><img alt=WHATWG src=/images/logo></a></p>
<hgroup><h1>Web Applications 1.0</h1>
- <h2 class="no-num no-toc">Draft Standard — 13 April 2010</h2>
+ <h2 class="no-num no-toc">Draft Standard — 14 April 2010</h2>
</hgroup><p>You can take part in this work. <a href=http://www.whatwg.org/mailing-list>Join the working group's discussion list.</a></p>
<p><strong>Web designers!</strong> We have a <a href=http://blog.whatwg.org/faq/>FAQ</a>, a <a href=http://forums.whatwg.org/>forum</a>, and a <a href=http://www.whatwg.org/mailing-list#help>help mailing list</a> for you!</p>
<!--<p class="impl"><strong>Implementors!</strong> We have a <a href="http://www.whatwg.org/mailing-list#implementors">mailing list</a> for you too!</p>-->
@@ -6368,12 +6368,6 @@
with the requirements of the Content-Type Processing Model
specification. <a href=#refsMIMESNIFF>[MIMESNIFF]</a></p>
- <p>The <dfn id=algorithm-for-extracting-an-encoding-from-a-content-type>algorithm for extracting an encoding from a
- Content-Type</dfn>, given a string <var title="">s</var>, is given
- in the Content-Type Processing Model specification. It either
- returns an encoding or nothing. <a href=#refsMIMESNIFF>[MIMESNIFF]</a></p>
- <p class=XXX>The above is out of date now that the relevant section has been removed from MIMESNIFF. Stay tuned; I'll bring it back here soon.</p>
-
<p>The <dfn id=content-type-sniffing-0 title="Content-Type sniffing">sniffed type of a
resource</dfn> must be found in a manner consistent with the
requirements given in the Content-Type Processing Model
@@ -6394,6 +6388,50 @@
occur. For more details, see the Content-Type Processing Model
specification. <a href=#refsMIMESNIFF>[MIMESNIFF]</a></p>
+ <p>The <dfn id=algorithm-for-extracting-an-encoding-from-a-content-type>algorithm for extracting an encoding from a
+ Content-Type</dfn>, given a string <var title="">s</var>, is as
+ follows. It either returns an encoding or nothing.</p>
+
+ <ol><li><p>Find the first seven characters in <var title="">s</var>
+ that are an <a href=#ascii-case-insensitive>ASCII case-insensitive</a> match for the word
+ "<code title="">charset</code>". If no such match is found, return
+ nothing.</li>
+
+ <li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
+ characters that immediately follow the word "<code title="">charset</code>" (there might not be any).</li>
+
+ <li><p>If the next character is not a U+003D EQUALS SIGN ('='),
+ return nothing and abort these steps.</li>
+
+ <li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
+ characters that immediately follow the equals sign (there might not
+ be any).</li>
+
+ <li>
+
+ <p>Process the next character as follows:</p>
+
+ <dl class=switch><dt>If it is a U+0022 QUOTATION MARK ('"') and there is a later U+0022 QUOTATION MARK ('"') in <var title="">s</var></dt>
+ <dt>If it is a U+0027 APOSTROPHE ("'") and there is a later U+0027 APOSTROPHE ("'") in <var title="">s</var></dt>
+ <dd>Return the encoding corresponding to the string between this character and the next earliest occurrence of this character.</dd>
+
+ <dt>If it is an unmatched U+0022 QUOTATION MARK ('"')</dt>
+ <dt>If it is an unmatched U+0027 APOSTROPHE ("'")</dt>
+ <dt>If there is no next character</dt>
+ <dd>Return nothing.</dd>
+
+ <dt>Otherwise</dt>
+ <dd>Return the encoding corresponding to the string from this
+ character to the first U+0009, U+000A, U+000C, U+000D, U+0020, or
+ U+003B character or the end of <var title="">s</var>, whichever
+ comes first.</dd>
+
+ </dl></li>
+
+ </ol><p class=note>This requirement is a <a href=#willful-violation>willful violation</a>
+ of the HTTP specification, motivated by the need for backwards
+ compatibility with legacy content. <a href=#refsHTTP>[HTTP]</a></p>
+
</div>
Modified: index
===================================================================
--- index 2010-04-13 22:57:01 UTC (rev 5041)
+++ index 2010-04-14 03:06:54 UTC (rev 5042)
@@ -190,7 +190,7 @@
<header class=head id=head><p><a class=logo href=http://www.whatwg.org/ rel=home><img alt=WHATWG src=/images/logo></a></p>
<hgroup><h1>HTML5 (including next generation additions still in development)</h1>
- <h2 class="no-num no-toc">Draft Standard — 13 April 2010</h2>
+ <h2 class="no-num no-toc">Draft Standard — 14 April 2010</h2>
</hgroup><p>You can take part in this work. <a href=http://www.whatwg.org/mailing-list>Join the working group's discussion list.</a></p>
<p><strong>Web designers!</strong> We have a <a href=http://blog.whatwg.org/faq/>FAQ</a>, a <a href=http://forums.whatwg.org/>forum</a>, and a <a href=http://www.whatwg.org/mailing-list#help>help mailing list</a> for you!</p>
<!--<p class="impl"><strong>Implementors!</strong> We have a <a href="http://www.whatwg.org/mailing-list#implementors">mailing list</a> for you too!</p>-->
@@ -6266,12 +6266,6 @@
with the requirements of the Content-Type Processing Model
specification. <a href=#refsMIMESNIFF>[MIMESNIFF]</a></p>
- <p>The <dfn id=algorithm-for-extracting-an-encoding-from-a-content-type>algorithm for extracting an encoding from a
- Content-Type</dfn>, given a string <var title="">s</var>, is given
- in the Content-Type Processing Model specification. It either
- returns an encoding or nothing. <a href=#refsMIMESNIFF>[MIMESNIFF]</a></p>
- <p class=XXX>The above is out of date now that the relevant section has been removed from MIMESNIFF. Stay tuned; I'll bring it back here soon.</p>
-
<p>The <dfn id=content-type-sniffing-0 title="Content-Type sniffing">sniffed type of a
resource</dfn> must be found in a manner consistent with the
requirements given in the Content-Type Processing Model
@@ -6292,6 +6286,50 @@
occur. For more details, see the Content-Type Processing Model
specification. <a href=#refsMIMESNIFF>[MIMESNIFF]</a></p>
+ <p>The <dfn id=algorithm-for-extracting-an-encoding-from-a-content-type>algorithm for extracting an encoding from a
+ Content-Type</dfn>, given a string <var title="">s</var>, is as
+ follows. It either returns an encoding or nothing.</p>
+
+ <ol><li><p>Find the first seven characters in <var title="">s</var>
+ that are an <a href=#ascii-case-insensitive>ASCII case-insensitive</a> match for the word
+ "<code title="">charset</code>". If no such match is found, return
+ nothing.</li>
+
+ <li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
+ characters that immediately follow the word "<code title="">charset</code>" (there might not be any).</li>
+
+ <li><p>If the next character is not a U+003D EQUALS SIGN ('='),
+ return nothing and abort these steps.</li>
+
+ <li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
+ characters that immediately follow the equals sign (there might not
+ be any).</li>
+
+ <li>
+
+ <p>Process the next character as follows:</p>
+
+ <dl class=switch><dt>If it is a U+0022 QUOTATION MARK ('"') and there is a later U+0022 QUOTATION MARK ('"') in <var title="">s</var></dt>
+ <dt>If it is a U+0027 APOSTROPHE ("'") and there is a later U+0027 APOSTROPHE ("'") in <var title="">s</var></dt>
+ <dd>Return the encoding corresponding to the string between this character and the next earliest occurrence of this character.</dd>
+
+ <dt>If it is an unmatched U+0022 QUOTATION MARK ('"')</dt>
+ <dt>If it is an unmatched U+0027 APOSTROPHE ("'")</dt>
+ <dt>If there is no next character</dt>
+ <dd>Return nothing.</dd>
+
+ <dt>Otherwise</dt>
+ <dd>Return the encoding corresponding to the string from this
+ character to the first U+0009, U+000A, U+000C, U+000D, U+0020, or
+ U+003B character or the end of <var title="">s</var>, whichever
+ comes first.</dd>
+
+ </dl></li>
+
+ </ol><p class=note>This requirement is a <a href=#willful-violation>willful violation</a>
+ of the HTTP specification, motivated by the need for backwards
+ compatibility with legacy content. <a href=#refsHTTP>[HTTP]</a></p>
+
</div>
Modified: source
===================================================================
--- source 2010-04-13 22:57:01 UTC (rev 5041)
+++ source 2010-04-14 03:06:54 UTC (rev 5042)
@@ -5954,13 +5954,6 @@
with the requirements of the Content-Type Processing Model
specification. <a href="#refsMIMESNIFF">[MIMESNIFF]</a></p>
- <p>The <dfn>algorithm for extracting an encoding from a
- Content-Type</dfn>, given a string <var title="">s</var>, is given
- in the Content-Type Processing Model specification. It either
- returns an encoding or nothing. <a
- href="#refsMIMESNIFF">[MIMESNIFF]</a></p>
- <p class="XXX">The above is out of date now that the relevant section has been removed from MIMESNIFF. Stay tuned; I'll bring it back here soon.</p>
-
<p>The <dfn title="Content-Type sniffing">sniffed type of a
resource</dfn> must be found in a manner consistent with the
requirements given in the Content-Type Processing Model
@@ -5981,6 +5974,60 @@
occur. For more details, see the Content-Type Processing Model
specification. <a href="#refsMIMESNIFF">[MIMESNIFF]</a></p>
+ <p>The <dfn>algorithm for extracting an encoding from a
+ Content-Type</dfn>, given a string <var title="">s</var>, is as
+ follows. It either returns an encoding or nothing.</p>
+
+ <ol>
+
+ <li><p>Find the first seven characters in <var title="">s</var>
+ that are an <span>ASCII case-insensitive</span> match for the word
+ "<code title="">charset</code>". If no such match is found, return
+ nothing.</p></li>
+
+ <li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
+ characters that immediately follow the word "<code
+ title="">charset</code>" (there might not be any).</p></li>
+
+ <li><p>If the next character is not a U+003D EQUALS SIGN ('='),
+ return nothing and abort these steps.</p></li>
+
+ <li><p>Skip any U+0009, U+000A, U+000C, U+000D, or U+0020
+ characters that immediately follow the equals sign (there might not
+ be any).</p></li>
+
+ <li>
+
+ <p>Process the next character as follows:</p>
+
+ <dl class="switch">
+
+ <dt>If it is a U+0022 QUOTATION MARK ('"') and there is a later U+0022 QUOTATION MARK ('"') in <var title="">s</var></dt>
+ <dt>If it is a U+0027 APOSTROPHE ("'") and there is a later U+0027 APOSTROPHE ("'") in <var title="">s</var></dt>
+ <dd>Return the encoding corresponding to the string between this character and the next earliest occurrence of this character.</dd>
+
+ <dt>If it is an unmatched U+0022 QUOTATION MARK ('"')</dt>
+ <dt>If it is an unmatched U+0027 APOSTROPHE ("'")</dt>
+ <dt>If there is no next character</dt>
+ <dd>Return nothing.</dd>
+
+ <dt>Otherwise</dt>
+ <dd>Return the encoding corresponding to the string from this
+ character to the first U+0009, U+000A, U+000C, U+000D, U+0020, or
+ U+003B character or the end of <var title="">s</var>, whichever
+ comes first.</dd>
+
+ </dl>
+
+ </li>
+
+ </ol>
+
+ <p class="note">This requirement is a <span>willful violation</span>
+ of the HTTP specification, motivated by the need for backwards
+ compatibility with legacy content. <a
+ href="#refsHTTP">[HTTP]</a></p>
+
</div>
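
The extraction steps added in this revision can be sketched in Python as follows. This is only an illustrative reading of the five steps, not code from the specification. The function name `extract_charset_label` and the helper `_ascii_lower` are hypothetical. The sketch also assumes the caller maps the returned label to an actual encoding, since that lookup is not defined in the text above, so it returns the raw label string, or `None` for "nothing".

```python
import string
from typing import Optional

# Characters skipped in steps 2 and 4: U+0009, U+000A, U+000C, U+000D, U+0020.
WHITESPACE = "\t\n\x0c\r "

# Fold only A-Z to a-z so the match is ASCII case-insensitive and
# character positions stay aligned with the original string.
_ASCII_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def _ascii_lower(s: str) -> str:
    """Hypothetical helper: lowercase ASCII letters only."""
    return s.translate(_ASCII_FOLD)


def extract_charset_label(s: str) -> Optional[str]:
    """Return the charset label found in s, or None if the steps return nothing."""
    # Step 1: find the first ASCII case-insensitive match for "charset".
    pos = _ascii_lower(s).find("charset")
    if pos < 0:
        return None
    pos += len("charset")

    # Step 2: skip whitespace immediately after the word.
    while pos < len(s) and s[pos] in WHITESPACE:
        pos += 1

    # Step 3: the next character must be an equals sign.
    if pos >= len(s) or s[pos] != "=":
        return None
    pos += 1

    # Step 4: skip whitespace immediately after the equals sign.
    while pos < len(s) and s[pos] in WHITESPACE:
        pos += 1

    # Step 5: process the next character.
    if pos >= len(s):
        return None  # there is no next character
    quote = s[pos]
    if quote == '"' or quote == "'":
        end = s.find(quote, pos + 1)
        if end < 0:
            return None  # unmatched quote
        return s[pos + 1:end]

    # Otherwise: read up to whitespace, a semicolon, or the end of s.
    end = pos
    while end < len(s) and s[end] not in WHITESPACE and s[end] != ";":
        end += 1
    return s[pos:end]
```

For example, `extract_charset_label('text/html; CHARSET = "ISO-8859-1"')` yields `ISO-8859-1`, while a value with an unmatched quote such as `charset='utf-8` yields `None`.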