[html5] r4970 - [giowt] (2) Be more compatible with what browsers do with multibyte characters i [...]

whatwg at whatwg.org
Sun Apr 4 21:36:53 PDT 2010


Author: ianh
Date: 2010-04-04 21:36:51 -0700 (Sun, 04 Apr 2010)
New Revision: 4970

Modified:
   complete.html
   index
   source
Log:
[giowt] (2) Be more compatible with what browsers do with multibyte characters in submissions.

Modified: complete.html
===================================================================
--- complete.html	2010-04-04 10:04:16 UTC (rev 4969)
+++ complete.html	2010-04-05 04:36:51 UTC (rev 4970)
@@ -181,7 +181,7 @@
 
   <header class=head id=head><p><a class=logo href=http://www.whatwg.org/ rel=home><img alt=WHATWG src=/images/logo></a></p>
    <hgroup><h1>Web Applications 1.0</h1>
-    <h2 class="no-num no-toc">Draft Standard — 4 April 2010</h2>
+    <h2 class="no-num no-toc">Draft Standard — 5 April 2010</h2>
    </hgroup><p>You can take part in this work. <a href=http://www.whatwg.org/mailing-list>Join the working group's discussion list.</a></p>
    <p><strong>Web designers!</strong> We have a <a href=http://blog.whatwg.org/faq/>FAQ</a>, a <a href=http://forums.whatwg.org/>forum</a>, and a <a href=http://www.whatwg.org/mailing-list#help>help mailing list</a> for you!</p>
    <!--<p class="impl"><strong>Implementors!</strong> We have a <a href="http://www.whatwg.org/mailing-list#implementors">mailing list</a> for you too!</p>-->
@@ -42165,25 +42165,57 @@
      <li>
 
       <p>For each character in the entry's name and value, apply the
-      following subsubsteps:</p>
+      appropriate subsubsteps from the following list:</p>
 
-      <ol><!-- * - . _ 0-9 a-z A-Z --><li><p>If the character isn't in the range U+0020, U+002A,
+      <dl class=switch><dt>The character is a U+0020 SPACE character</dt>
+
+       <dd>Replace the character with a single U+002B PLUS SIGN
+       character (+).</dd>
+
+
+       <!-- * - . _ 0-9 a-z A-Z -->
+
+       <dt>If the character isn't in the range U+0020, U+002A,
        U+002D, U+002E, U+0030 to U+0039, U+0041 to U+005A, U+005F,
-       U+0061 to U+007A then replace the character with a string
-       formed as follows: Start with the empty string, and then,
-       taking each byte of the character when expressed in the
-       selected character encoding in turn, append to the string a
-       U+0025 PERCENT SIGN character (%) followed by two characters in
-       the ranges U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9) and
-       U+0041 LATIN CAPITAL LETTER A to U+0046 LATIN CAPITAL LETTER F
-       representing the hexadecimal value of the byte (zero-padded if
-       necessary).</li>
+       U+0061 to U+007A</dt>
 
-       <li><p>If the character is a U+0020 SPACE character, replace it
-       with a single U+002B PLUS SIGN character (+).</li>
+       <dd>
 
-      </ol></li>
+        <p>Replace the character with a string formed as follows:</p>
 
+        <ol><li><p>Let <var title="">s</var> be an empty string.</li>
+
+         <li>
+
+          <p>For each byte <var title="">b</var> of the character when
+          expressed in the selected character encoding in turn, run
+          the appropriate subsubsubstep from the list below:</p>
+
+          <dl class=switch><dt>If the byte is in the range 0x20, 0x2A, 0x2D, 0x2E,
+           0x30 to 0x39, 0x41 to 0x5A, 0x5F, 0x61 to 0x7A</dt>
+
+           <dd><p>Append to <var title="">s</var> the Unicode
+           character with the codepoint equal to the byte.</dd>
+
+           <dt>Otherwise</dt>
+
+           <dd><p>Append to the string a U+0025 PERCENT SIGN character
+           (%) followed by two characters in the ranges U+0030 DIGIT
+           ZERO (0) to U+0039 DIGIT NINE (9) and U+0041 LATIN CAPITAL
+           LETTER A to U+0046 LATIN CAPITAL LETTER F representing the
+           hexadecimal value of the byte (zero-padded if
+           necessary).</dd>
+
+          </dl></li>
+
+        </ol></dd>
+
+       <dt>Otherwise</dt>
+
+       <dd><p>Leave the character as is.</dd>
+
+      </dl></li>
+
      <li><p>If the entry's name is "<code title="">isindex</code>",
      its type is "<code title="">text</code>", and this is the first
      entry in the <var title="">form data set</var>, then append the

Modified: index
===================================================================
--- index	2010-04-04 10:04:16 UTC (rev 4969)
+++ index	2010-04-05 04:36:51 UTC (rev 4970)
@@ -185,7 +185,7 @@
 
   <header class=head id=head><p><a class=logo href=http://www.whatwg.org/ rel=home><img alt=WHATWG src=/images/logo></a></p>
    <hgroup><h1>HTML5 (including next generation additions still in development)</h1>
-    <h2 class="no-num no-toc">Draft Standard — 4 April 2010</h2>
+    <h2 class="no-num no-toc">Draft Standard — 5 April 2010</h2>
    </hgroup><p>You can take part in this work. <a href=http://www.whatwg.org/mailing-list>Join the working group's discussion list.</a></p>
    <p><strong>Web designers!</strong> We have a <a href=http://blog.whatwg.org/faq/>FAQ</a>, a <a href=http://forums.whatwg.org/>forum</a>, and a <a href=http://www.whatwg.org/mailing-list#help>help mailing list</a> for you!</p>
    <!--<p class="impl"><strong>Implementors!</strong> We have a <a href="http://www.whatwg.org/mailing-list#implementors">mailing list</a> for you too!</p>-->
@@ -42066,25 +42066,57 @@
      <li>
 
       <p>For each character in the entry's name and value, apply the
-      following subsubsteps:</p>
+      appropriate subsubsteps from the following list:</p>
 
-      <ol><!-- * - . _ 0-9 a-z A-Z --><li><p>If the character isn't in the range U+0020, U+002A,
+      <dl class=switch><dt>The character is a U+0020 SPACE character</dt>
+
+       <dd>Replace the character with a single U+002B PLUS SIGN
+       character (+).</dd>
+
+
+       <!-- * - . _ 0-9 a-z A-Z -->
+
+       <dt>If the character isn't in the range U+0020, U+002A,
        U+002D, U+002E, U+0030 to U+0039, U+0041 to U+005A, U+005F,
-       U+0061 to U+007A then replace the character with a string
-       formed as follows: Start with the empty string, and then,
-       taking each byte of the character when expressed in the
-       selected character encoding in turn, append to the string a
-       U+0025 PERCENT SIGN character (%) followed by two characters in
-       the ranges U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9) and
-       U+0041 LATIN CAPITAL LETTER A to U+0046 LATIN CAPITAL LETTER F
-       representing the hexadecimal value of the byte (zero-padded if
-       necessary).</li>
+       U+0061 to U+007A</dt>
 
-       <li><p>If the character is a U+0020 SPACE character, replace it
-       with a single U+002B PLUS SIGN character (+).</li>
+       <dd>
 
-      </ol></li>
+        <p>Replace the character with a string formed as follows:</p>
 
+        <ol><li><p>Let <var title="">s</var> be an empty string.</li>
+
+         <li>
+
+          <p>For each byte <var title="">b</var> of the character when
+          expressed in the selected character encoding in turn, run
+          the appropriate subsubsubstep from the list below:</p>
+
+          <dl class=switch><dt>If the byte is in the range 0x20, 0x2A, 0x2D, 0x2E,
+           0x30 to 0x39, 0x41 to 0x5A, 0x5F, 0x61 to 0x7A</dt>
+
+           <dd><p>Append to <var title="">s</var> the Unicode
+           character with the codepoint equal to the byte.</dd>
+
+           <dt>Otherwise</dt>
+
+           <dd><p>Append to the string a U+0025 PERCENT SIGN character
+           (%) followed by two characters in the ranges U+0030 DIGIT
+           ZERO (0) to U+0039 DIGIT NINE (9) and U+0041 LATIN CAPITAL
+           LETTER A to U+0046 LATIN CAPITAL LETTER F representing the
+           hexadecimal value of the byte (zero-padded if
+           necessary).</dd>
+
+          </dl></li>
+
+        </ol></dd>
+
+       <dt>Otherwise</dt>
+
+       <dd><p>Leave the character as is.</dd>
+
+      </dl></li>
+
      <li><p>If the entry's name is "<code title="">isindex</code>",
      its type is "<code title="">text</code>", and this is the first
      entry in the <var title="">form data set</var>, then append the

Modified: source
===================================================================
--- source	2010-04-04 10:04:16 UTC (rev 4969)
+++ source	2010-04-05 04:36:51 UTC (rev 4970)
@@ -47067,29 +47067,67 @@
      <li>
 
       <p>For each character in the entry's name and value, apply the
-      following subsubsteps:</p>
+      appropriate subsubsteps from the following list:</p>
 
-      <ol>
+      <dl class="switch">
 
+       <dt>The character is a U+0020 SPACE character</dt>
+
+       <dd>Replace the character with a single U+002B PLUS SIGN
+       character (+).</dd>
+
+
        <!-- * - . _ 0-9 a-z A-Z -->
 
-       <li><p>If the character isn't in the range U+0020, U+002A,
+       <dt>If the character isn't in the range U+0020, U+002A,
        U+002D, U+002E, U+0030 to U+0039, U+0041 to U+005A, U+005F,
-       U+0061 to U+007A then replace the character with a string
-       formed as follows: Start with the empty string, and then,
-       taking each byte of the character when expressed in the
-       selected character encoding in turn, append to the string a
-       U+0025 PERCENT SIGN character (%) followed by two characters in
-       the ranges U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9) and
-       U+0041 LATIN CAPITAL LETTER A to U+0046 LATIN CAPITAL LETTER F
-       representing the hexadecimal value of the byte (zero-padded if
-       necessary).</p></li>
+       U+0061 to U+007A</dt>
 
-       <li><p>If the character is a U+0020 SPACE character, replace it
-       with a single U+002B PLUS SIGN character (+).</p></li>
+       <dd>
 
-      </ol>
+        <p>Replace the character with a string formed as follows:</p>
 
+        <ol>
+
+         <li><p>Let <var title="">s</var> be an empty string.</p></li>
+
+         <li>
+
+          <p>For each byte <var title="">b</var> of the character when
+          expressed in the selected character encoding in turn, run
+          the appropriate subsubsubstep from the list below:</p>
+
+          <dl class="switch">
+
+           <dt>If the byte is in the range 0x20, 0x2A, 0x2D, 0x2E,
+           0x30 to 0x39, 0x41 to 0x5A, 0x5F, 0x61 to 0x7A</dt>
+
+           <dd><p>Append to <var title="">s</var> the Unicode
+           character with the codepoint equal to the byte.</p></dd>
+
+           <dt>Otherwise</dt>
+
+           <dd><p>Append to the string a U+0025 PERCENT SIGN character
+           (%) followed by two characters in the ranges U+0030 DIGIT
+           ZERO (0) to U+0039 DIGIT NINE (9) and U+0041 LATIN CAPITAL
+           LETTER A to U+0046 LATIN CAPITAL LETTER F representing the
+           hexadecimal value of the byte (zero-padded if
+           necessary).</p></dd>
+
+          </dl>
+
+         </li>
+
+        </ol>
+
+       </dd>
+
+       <dt>Otherwise</dt>
+
+       <dd><p>Leave the character as is.</p></dd>
+
+      </dl>
+
      </li>
 
      <li><p>If the entry's name is "<code title="">isindex</code>",
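
For readers who want to check the revised rules against a concrete case, the per-character steps added in this revision can be sketched roughly as follows. This is a minimal illustration only, not text from the spec and not any browser's implementation. The names SAFE and encode_component are hypothetical, and the sketch assumes the selected character encoding is an ASCII-compatible codec that Python knows about and that every character is representable in it (handling of unencodable characters is outside the steps shown in this diff).

```python
# Hypothetical sketch of the per-character steps in r4970.
# SAFE mirrors the list in the diff: U+0020, * - . _ 0-9 A-Z a-z.
SAFE = frozenset(
    b" *-._0123456789"
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
)


def encode_component(text: str, encoding: str = "utf-8") -> str:
    """Apply the three-way switch from the diff to each character."""
    out = []
    for ch in text:
        if ch == " ":
            # First case: U+0020 SPACE becomes U+002B PLUS SIGN.
            out.append("+")
        elif ord(ch) not in SAFE:
            # Second case: build the replacement string s byte by byte.
            s = []
            for b in ch.encode(encoding):  # raises if ch is unencodable
                if b in SAFE:
                    # New in r4970: a safe byte is appended literally.
                    s.append(chr(b))
                else:
                    # Otherwise: % followed by two uppercase hex digits.
                    s.append("%{:02X}".format(b))
            out.append("".join(s))
        else:
            # Third case: leave the character as is.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(encode_component("a b"))
    print(encode_component("\u30a2", "shift_jis"))
```

The Shift_JIS call is the interesting one: U+30A2 encodes to the bytes 0x83 0x41, and under the new text the trail byte 0x41 falls in the safe list, so the result is "%83A". Under the rev 4969 wording every byte was percent-escaped, which would have given "%83%41".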



