[html5] r5600 - [giow] (2) Redefine how we interact with RFC 2388 (multipart/form-data) in submi [...]

whatwg at whatwg.org
Mon Oct 11 17:10:21 PDT 2010


Author: ianh
Date: 2010-10-11 17:10:17 -0700 (Mon, 11 Oct 2010)
New Revision: 5600

Modified:
   complete.html
   index
   source
Log:
[giow] (2) Redefine how we interact with RFC 2388 (multipart/form-data) in submission
Fixing http://www.w3.org/Bugs/Public/show_bug.cgi?id=10461

Modified: complete.html
===================================================================
--- complete.html	2010-10-11 22:32:31 UTC (rev 5599)
+++ complete.html	2010-10-12 00:10:17 UTC (rev 5600)
@@ -214,7 +214,7 @@
 
   <header class=head id=head><p><a class=logo href=http://www.whatwg.org/ rel=home><img alt=WHATWG height=101 src=/images/logo width=101></a></p>
    <hgroup><h1>Web Applications 1.0</h1>
-    <h2 class="no-num no-toc">Draft Standard — 11 October 2010</h2>
+    <h2 class="no-num no-toc">Draft Standard — 12 October 2010</h2>
    </hgroup><p>You can take part in this work. <a href=http://www.whatwg.org/mailing-list>Join the working group's discussion list.</a></p>
    <p><strong>Web designers!</strong> We have a <a href=http://blog.whatwg.org/faq/>FAQ</a>, a <a href=http://forums.whatwg.org/>forum</a>, and a <a href=http://www.whatwg.org/mailing-list#help>help mailing list</a> for you!</p>
    <!--<p class="impl"><strong>Implementors!</strong> We have a <a href="http://www.whatwg.org/mailing-list#implementors">mailing list</a> for you too!</p>-->
@@ -47018,7 +47018,7 @@
   <p>The <dfn id=application/x-www-form-urlencoded-encoding-algorithm><code title="">application/x-www-form-urlencoded</code> encoding
   algorithm</dfn> is as follows:</p>
 
-  <ol><li><p>Let <var title="">result</var> be the empty string.</li>
+  <ol><!-- the first few steps of this are the same as in the next section --><li><p>Let <var title="">result</var> be the empty string.</li>
 
    <li>
 
@@ -47051,7 +47051,8 @@
      with <var title="">charset</var>.</li>
 
      <li><p>If the entry's type is "<code title="">file</code>",
-     replace its value with the file's filename only.</li>
+     replace its value with the file's filename only.</li> <!--
+     this is not present in the next section -->
 
      <li><p>For each character in the entry's name and value that
      cannot be expressed using the selected character encoding,
@@ -47061,9 +47062,11 @@
      U+0039 DIGIT NINE (9) representing the Unicode code point of the
      character in base ten, and finally a U+003B SEMICOLON character
      (;).</li><!-- we should say it should be the shortest
-     possible string, no leading zeros. this whole step as asinine,
+     possible string, no leading zeros. this whole step is asinine,
      though, so... -->
 
+     <!-- this is where the similarities with the next section end -->
+
      <li>
 
       <p>For each character in the entry's name and value, apply the
@@ -47147,27 +47150,89 @@
 
   <h5 id=multipart-form-data><span class=secno>4.10.21.5 </span>Multipart form data</h5>
 
+  <!-- http://hixie.ch/tests/adhoc/html/forms/submission/multipart_form-data/ -->
+
   <p>The <dfn id=multipart/form-data-encoding-algorithm><code title="">multipart/form-data</code> encoding
-  algorithm</dfn> is to encode the <var title="">form data set</var>
-  using the rules described by RFC2388, <cite>Returning Values from
-  Forms: <code title="">multipart/form-data</code></cite>, and return
-  the resulting byte stream. <a href=#refsRFC2388>[RFC2388]</a></p>
+  algorithm</dfn> is as follows:</p>
 
-  <p>Each entry in the <var title="">form data set</var> is a
-  <i>field</i>, the name of the entry is the <i>field name</i> and the
-  value of the entry is the <i>field value</i>, unless the entry's
-  name is "<code title=attr-fe-name-charset><a href=#attr-fe-name-charset>_charset_</a></code>" and its type is "<code title="">hidden</code>", in which case the <i>field value</i> is the
-  character encoding used by the aforementioned algorithm to encode
-  the value of the field.</p>
+  <ol><!-- the first few steps of this are the same as in the previous section --><li><p>Let <var title="">result</var> be the empty string.</li>
 
-  <p>The order of parts must be the same as the order of fields in the
-  <var title="">form data set</var>. Multiple entries with the same
-  name must be treated as distinct fields.</p>
+   <li>
 
-  </div>
+    <p>If the <code><a href=#the-form-element>form</a></code> element has an <code title=attr-form-accept-charset><a href=#attr-form-accept-charset>accept-charset</a></code> attribute,
+    then, taking into account the characters found in the <var title="">form data set</var>'s names and values, and the character
+    encodings supported by the user agent, select a character encoding
+    from the list given in the <code><a href=#the-form-element>form</a></code>'s <code title=attr-form-accept-charset><a href=#attr-form-accept-charset>accept-charset</a></code> attribute
+    that is an <a href=#ascii-compatible-character-encoding>ASCII-compatible character encoding</a>. If
+    none of the encodings are supported, or if none are listed, then
+    let the selected character encoding be UTF-8.</p>
 
+    <p>Otherwise, if the <a href="#document's-character-encoding">document's character encoding</a> is
+    an <a href=#ascii-compatible-character-encoding>ASCII-compatible character encoding</a>, then that is
+    the selected character encoding.</p>
 
+    <p>Otherwise, let the selected character encoding be UTF-8.</p>
 
+   </li>
+
+   <li><p>Let <var title="">charset</var> be the <a href=#preferred-mime-name>preferred MIME
+   name</a> of the selected character encoding.</li>
+
+   <li>
+
+    <p>For each entry in the <var title="">form data set</var>,
+    perform these substeps:</p>
+
+    <ol><li><p>If the entry's name is "<code title=attr-fe-name-charset><a href=#attr-fe-name-charset>_charset_</a></code>"
+     and its type is "<code title="">hidden</code>", replace its value
+     with <var title="">charset</var>.</li>
+
+     <!-- the step that replaces a file with its name is missing in
+     this version of the algorithm -->
+
+     <li><p>For each character in the entry's name and value that
+     cannot be expressed using the selected character encoding,
+     replace the character by a string consisting of a U+0026
+     AMPERSAND character (&), a U+0023 NUMBER SIGN character (#),
+     one or more characters in the range U+0030 DIGIT ZERO (0) to
+     U+0039 DIGIT NINE (9) representing the Unicode code point of the
+     character in base ten, and finally a U+003B SEMICOLON character
+     (;).</li><!-- we should say it should be the shortest
+     possible string, no leading zeros. this whole step is asinine,
+     though, so... -->
+
+     <!-- this is where the similarities with the previous section end -->
+
+    </ol></li>
+
+   <li>
+
+    <p>Encode the (now mutated) <var title="">form data set</var>
+    using the rules described by RFC 2388, <cite>Returning Values from
+    Forms: <code title="">multipart/form-data</code></cite>, and
+    return the resulting byte stream. <a href=#refsRFC2388>[RFC2388]</a></p>
+
+    <p>Each entry in the <var title="">form data set</var> is a
+    <i>field</i>, the name of the entry is the <i>field name</i> and
+    the value of the entry is the <i>field value</i>.</p>
+
+    <p>The order of parts must be the same as the order of fields in
+    the <var title="">form data set</var>. Multiple entries with the
+    same name must be treated as distinct fields.</p>
+
+    <p>The parts of the generated <code title="">multipart/form-data</code> resource that correspond to
+    non-file fields must not have a <code><a href=#content-type>Content-Type</a></code> header
+    specified. Their names and values must be encoded using the
+    character encoding selected above (field names in particular do
+    not get converted to a 7-bit safe encoding as suggested in RFC
+    2388).</p>
+
+   </li>
+
+  </ol></div>
+
+
+
   <div class=impl>
 
   <h5 id=plain-text-form-data><span class=secno>4.10.21.6 </span>Plain text form data</h5>

Modified: index
===================================================================
--- index	2010-10-11 22:32:31 UTC (rev 5599)
+++ index	2010-10-12 00:10:17 UTC (rev 5600)
@@ -218,7 +218,7 @@
 
   <header class=head id=head><p><a class=logo href=http://www.whatwg.org/ rel=home><img alt=WHATWG height=101 src=/images/logo width=101></a></p>
    <hgroup><h1>HTML5 (including next generation additions still in development)</h1>
-    <h2 class="no-num no-toc">Draft Standard — 11 October 2010</h2>
+    <h2 class="no-num no-toc">Draft Standard — 12 October 2010</h2>
    </hgroup><p>You can take part in this work. <a href=http://www.whatwg.org/mailing-list>Join the working group's discussion list.</a></p>
    <p><strong>Web designers!</strong> We have a <a href=http://blog.whatwg.org/faq/>FAQ</a>, a <a href=http://forums.whatwg.org/>forum</a>, and a <a href=http://www.whatwg.org/mailing-list#help>help mailing list</a> for you!</p>
    <!--<p class="impl"><strong>Implementors!</strong> We have a <a href="http://www.whatwg.org/mailing-list#implementors">mailing list</a> for you too!</p>-->
@@ -46998,7 +46998,7 @@
   <p>The <dfn id=application/x-www-form-urlencoded-encoding-algorithm><code title="">application/x-www-form-urlencoded</code> encoding
   algorithm</dfn> is as follows:</p>
 
-  <ol><li><p>Let <var title="">result</var> be the empty string.</li>
+  <ol><!-- the first few steps of this are the same as in the next section --><li><p>Let <var title="">result</var> be the empty string.</li>
 
    <li>
 
@@ -47031,7 +47031,8 @@
      with <var title="">charset</var>.</li>
 
      <li><p>If the entry's type is "<code title="">file</code>",
-     replace its value with the file's filename only.</li>
+     replace its value with the file's filename only.</li> <!--
+     this is not present in the next section -->
 
      <li><p>For each character in the entry's name and value that
      cannot be expressed using the selected character encoding,
@@ -47041,9 +47042,11 @@
      U+0039 DIGIT NINE (9) representing the Unicode code point of the
      character in base ten, and finally a U+003B SEMICOLON character
      (;).</li><!-- we should say it should be the shortest
-     possible string, no leading zeros. this whole step as asinine,
+     possible string, no leading zeros. this whole step is asinine,
      though, so... -->
 
+     <!-- this is where the similarities with the next section end -->
+
      <li>
 
       <p>For each character in the entry's name and value, apply the
@@ -47127,27 +47130,89 @@
 
   <h5 id=multipart-form-data><span class=secno>4.10.21.5 </span>Multipart form data</h5>
 
+  <!-- http://hixie.ch/tests/adhoc/html/forms/submission/multipart_form-data/ -->
+
   <p>The <dfn id=multipart/form-data-encoding-algorithm><code title="">multipart/form-data</code> encoding
-  algorithm</dfn> is to encode the <var title="">form data set</var>
-  using the rules described by RFC2388, <cite>Returning Values from
-  Forms: <code title="">multipart/form-data</code></cite>, and return
-  the resulting byte stream. <a href=#refsRFC2388>[RFC2388]</a></p>
+  algorithm</dfn> is as follows:</p>
 
-  <p>Each entry in the <var title="">form data set</var> is a
-  <i>field</i>, the name of the entry is the <i>field name</i> and the
-  value of the entry is the <i>field value</i>, unless the entry's
-  name is "<code title=attr-fe-name-charset><a href=#attr-fe-name-charset>_charset_</a></code>" and its type is "<code title="">hidden</code>", in which case the <i>field value</i> is the
-  character encoding used by the aforementioned algorithm to encode
-  the value of the field.</p>
+  <ol><!-- the first few steps of this are the same as in the previous section --><li><p>Let <var title="">result</var> be the empty string.</li>
 
-  <p>The order of parts must be the same as the order of fields in the
-  <var title="">form data set</var>. Multiple entries with the same
-  name must be treated as distinct fields.</p>
+   <li>
 
-  </div>
+    <p>If the <code><a href=#the-form-element>form</a></code> element has an <code title=attr-form-accept-charset><a href=#attr-form-accept-charset>accept-charset</a></code> attribute,
+    then, taking into account the characters found in the <var title="">form data set</var>'s names and values, and the character
+    encodings supported by the user agent, select a character encoding
+    from the list given in the <code><a href=#the-form-element>form</a></code>'s <code title=attr-form-accept-charset><a href=#attr-form-accept-charset>accept-charset</a></code> attribute
+    that is an <a href=#ascii-compatible-character-encoding>ASCII-compatible character encoding</a>. If
+    none of the encodings are supported, or if none are listed, then
+    let the selected character encoding be UTF-8.</p>
 
+    <p>Otherwise, if the <a href="#document's-character-encoding">document's character encoding</a> is
+    an <a href=#ascii-compatible-character-encoding>ASCII-compatible character encoding</a>, then that is
+    the selected character encoding.</p>
 
+    <p>Otherwise, let the selected character encoding be UTF-8.</p>
 
+   </li>
+
+   <li><p>Let <var title="">charset</var> be the <a href=#preferred-mime-name>preferred MIME
+   name</a> of the selected character encoding.</li>
+
+   <li>
+
+    <p>For each entry in the <var title="">form data set</var>,
+    perform these substeps:</p>
+
+    <ol><li><p>If the entry's name is "<code title=attr-fe-name-charset><a href=#attr-fe-name-charset>_charset_</a></code>"
+     and its type is "<code title="">hidden</code>", replace its value
+     with <var title="">charset</var>.</li>
+
+     <!-- the step that replaces a file with its name is missing in
+     this version of the algorithm -->
+
+     <li><p>For each character in the entry's name and value that
+     cannot be expressed using the selected character encoding,
+     replace the character by a string consisting of a U+0026
+     AMPERSAND character (&), a U+0023 NUMBER SIGN character (#),
+     one or more characters in the range U+0030 DIGIT ZERO (0) to
+     U+0039 DIGIT NINE (9) representing the Unicode code point of the
+     character in base ten, and finally a U+003B SEMICOLON character
+     (;).</li><!-- we should say it should be the shortest
+     possible string, no leading zeros. this whole step is asinine,
+     though, so... -->
+
+     <!-- this is where the similarities with the previous section end -->
+
+    </ol></li>
+
+   <li>
+
+    <p>Encode the (now mutated) <var title="">form data set</var>
+    using the rules described by RFC 2388, <cite>Returning Values from
+    Forms: <code title="">multipart/form-data</code></cite>, and
+    return the resulting byte stream. <a href=#refsRFC2388>[RFC2388]</a></p>
+
+    <p>Each entry in the <var title="">form data set</var> is a
+    <i>field</i>, the name of the entry is the <i>field name</i> and
+    the value of the entry is the <i>field value</i>.</p>
+
+    <p>The order of parts must be the same as the order of fields in
+    the <var title="">form data set</var>. Multiple entries with the
+    same name must be treated as distinct fields.</p>
+
+    <p>The parts of the generated <code title="">multipart/form-data</code> resource that correspond to
+    non-file fields must not have a <code><a href=#content-type>Content-Type</a></code> header
+    specified. Their names and values must be encoded using the
+    character encoding selected above (field names in particular do
+    not get converted to a 7-bit safe encoding as suggested in RFC
+    2388).</p>
+
+   </li>
+
+  </ol></div>
+
+
+
   <div class=impl>
 
   <h5 id=plain-text-form-data><span class=secno>4.10.21.6 </span>Plain text form data</h5>

Modified: source
===================================================================
--- source	2010-10-11 22:32:31 UTC (rev 5599)
+++ source	2010-10-12 00:10:17 UTC (rev 5600)
@@ -52853,6 +52853,8 @@
 
   <ol>
 
+   <!-- the first few steps of this are the same as in the next section -->
+
    <li><p>Let <var title="">result</var> be the empty string.</p></li>
 
    <li>
@@ -52891,7 +52893,8 @@
      with <var title="">charset</var>.</p></li>
 
      <li><p>If the entry's type is "<code title="">file</code>",
-     replace its value with the file's filename only.</p></li>
+     replace its value with the file's filename only.</p></li> <!--
+     this is not present in the next section -->
 
      <li><p>For each character in the entry's name and value that
      cannot be expressed using the selected character encoding,
@@ -52901,9 +52904,11 @@
      U+0039 DIGIT NINE (9) representing the Unicode code point of the
      character in base ten, and finally a U+003B SEMICOLON character
      (;).</p></li><!-- we should say it should be the shortest
-     possible string, no leading zeros. this whole step as asinine,
+     possible string, no leading zeros. this whole step is asinine,
      though, so... -->
 
+     <!-- this is where the similarities with the next section end -->
+
      <li>
 
       <p>For each character in the entry's name and value, apply the
@@ -53007,24 +53012,100 @@
 
   <h5>Multipart form data</h5>
 
+  <!-- http://hixie.ch/tests/adhoc/html/forms/submission/multipart_form-data/ -->
+
   <p>The <dfn><code title="">multipart/form-data</code> encoding
-  algorithm</dfn> is to encode the <var title="">form data set</var>
-  using the rules described by RFC2388, <cite>Returning Values from
-  Forms: <code title="">multipart/form-data</code></cite>, and return
-  the resulting byte stream. <a href="#refsRFC2388">[RFC2388]</a></p>
+  algorithm</dfn> is as follows:</p>
 
-  <p>Each entry in the <var title="">form data set</var> is a
-  <i>field</i>, the name of the entry is the <i>field name</i> and the
-  value of the entry is the <i>field value</i>, unless the entry's
-  name is "<code title="attr-fe-name-charset">_charset_</code>" and its type is "<code
-  title="">hidden</code>", in which case the <i>field value</i> is the
-  character encoding used by the aforementioned algorithm to encode
-  the value of the field.</p>
+  <ol>
 
-  <p>The order of parts must be the same as the order of fields in the
-  <var title="">form data set</var>. Multiple entries with the same
-  name must be treated as distinct fields.</p>
+   <!-- the first few steps of this are the same as in the previous section -->
 
+   <li><p>Let <var title="">result</var> be the empty string.</p></li>
+
+   <li>
+
+    <p>If the <code>form</code> element has an <code
+    title="attr-form-accept-charset">accept-charset</code> attribute,
+    then, taking into account the characters found in the <var
+    title="">form data set</var>'s names and values, and the character
+    encodings supported by the user agent, select a character encoding
+    from the list given in the <code>form</code>'s <code
+    title="attr-form-accept-charset">accept-charset</code> attribute
+    that is an <span>ASCII-compatible character encoding</span>. If
+    none of the encodings are supported, or if none are listed, then
+    let the selected character encoding be UTF-8.</p>
+
+    <p>Otherwise, if the <span>document's character encoding</span> is
+    an <span>ASCII-compatible character encoding</span>, then that is
+    the selected character encoding.</p>
+
+    <p>Otherwise, let the selected character encoding be UTF-8.</p>
+
+   </li>
+
+   <li><p>Let <var title="">charset</var> be the <span>preferred MIME
+   name</span> of the selected character encoding.</p></li>
+
+   <li>
+
+    <p>For each entry in the <var title="">form data set</var>,
+    perform these substeps:</p>
+
+    <ol>
+
+     <li><p>If the entry's name is "<code title="attr-fe-name-charset">_charset_</code>"
+     and its type is "<code title="">hidden</code>", replace its value
+     with <var title="">charset</var>.</p></li>
+
+     <!-- the step that replaces a file with its name is missing in
+     this version of the algorithm -->
+
+     <li><p>For each character in the entry's name and value that
+     cannot be expressed using the selected character encoding,
+     replace the character by a string consisting of a U+0026
+     AMPERSAND character (&), a U+0023 NUMBER SIGN character (#),
+     one or more characters in the range U+0030 DIGIT ZERO (0) to
+     U+0039 DIGIT NINE (9) representing the Unicode code point of the
+     character in base ten, and finally a U+003B SEMICOLON character
+     (;).</p></li><!-- we should say it should be the shortest
+     possible string, no leading zeros. this whole step is asinine,
+     though, so... -->
+
+     <!-- this is where the similarities with the previous section end -->
+
+    </ol>
+
+   </li>
+
+   <li>
+
+    <p>Encode the (now mutated) <var title="">form data set</var>
+    using the rules described by RFC 2388, <cite>Returning Values from
+    Forms: <code title="">multipart/form-data</code></cite>, and
+    return the resulting byte stream. <a
+    href="#refsRFC2388">[RFC2388]</a></p>
+
+    <p>Each entry in the <var title="">form data set</var> is a
+    <i>field</i>, the name of the entry is the <i>field name</i> and
+    the value of the entry is the <i>field value</i>.</p>
+
+    <p>The order of parts must be the same as the order of fields in
+    the <var title="">form data set</var>. Multiple entries with the
+    same name must be treated as distinct fields.</p>
+
+    <p>The parts of the generated <code
+    title="">multipart/form-data</code> resource that correspond to
+    non-file fields must not have a <code>Content-Type</code> header
+    specified. Their names and values must be encoded using the
+    character encoding selected above (field names in particular do
+    not get converted to a 7-bit safe encoding as suggested in RFC
+    2388).</p>
+
+   </li>
+
+  </ol>
+
   </div>
 
 
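
The multipart steps added in this revision can be sketched roughly as below. This is only an illustration, not text from the specification. The function name encode_multipart, the (name, value, type) tuple layout, and the fixed boundary string are assumptions made for the example. Character encoding selection is left to the caller, the charset label is passed through as given rather than mapped to its preferred MIME name, and file fields and escaping of quotes in field names are not handled.

```python
def encode_multipart(entries, charset="utf-8", boundary="boundary123"):
    """Sketch of the per-entry substeps plus a minimal RFC 2388 style body.

    entries: ordered list of (name, value, type) tuples for non-file
    fields (hypothetical layout). Order and duplicate names are kept.
    """
    def enc(text):
        # Characters the selected encoding cannot express become decimal
        # numeric character references such as "&#8364;".
        return text.encode(charset, errors="xmlcharrefreplace")

    delimiter = b"--" + boundary.encode("ascii")
    body = bytearray()
    for name, value, type_ in entries:
        # A hidden "_charset_" field has its value replaced with the charset.
        if name == "_charset_" and type_ == "hidden":
            value = charset
        body += delimiter + b"\r\n"
        # The field name is sent in the selected encoding, not converted
        # to a 7-bit safe form.
        body += b'Content-Disposition: form-data; name="' + enc(name) + b'"\r\n'
        # Non-file parts carry no Content-Type header.
        body += b"\r\n" + enc(value) + b"\r\n"
    body += delimiter + b"--\r\n"
    return bytes(body)
```

For example, encoding a hidden "_charset_" entry and a text entry containing a euro sign with charset="iso-8859-1" yields one part whose value is the charset label and one part in which the euro sign appears as a decimal character reference.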



