[html5] r1272 - /

Thu Feb 28 13:29:01 PST 2008

Author: ianh
Date: 2008-02-28 13:28:58 -0800 (Thu, 28 Feb 2008)
New Revision: 1272

Modified:
   index
   source
Log:
[t] (1) Make UTF-16 turn to UTF-8 if the encoding is detected in an ASCII-compatible manner. Clarify some other things in the encoding detection algorithm.
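
Illustration (not part of the commit): the pre-scan only finds a charset declaration by reading bytes as ASCII, so a declaration found this way cannot really belong to a UTF-16 document. A minimal sketch of the new post-processing step, assuming a hypothetical SUPPORTED set, a hypothetical UTF16_LABELS set, and a hypothetical helper name resolve_sniffed_charset (none of these names come from the spec):

```python
from typing import Optional, Tuple

# Hypothetical set of encodings this user agent supports (an assumption).
SUPPORTED = {"utf-8", "windows-1252", "iso-8859-1", "shift_jis"}

# Labels treated as "a UTF-16 encoding" in this sketch (an assumption).
UTF16_LABELS = {"utf-16", "utf-16le", "utf-16be"}


def resolve_sniffed_charset(charset: str) -> Optional[Tuple[str, str]]:
    """Post-process a charset found by the ASCII-compatible pre-scan.

    Returns (encoding, confidence), or None when the caller should go
    back to looking for further attributes.
    """
    name = charset.strip().lower()
    # A UTF-16 label found by an ASCII-compatible scan is changed to UTF-8.
    if name in UTF16_LABELS:
        name = "utf-8"
    # Only a supported encoding ends the pre-scan, with tentative confidence.
    if name in SUPPORTED:
        return (name, "tentative")
    return None
```

With these assumptions, a declared "UTF-16" resolves to UTF-8 with tentative confidence, while an unsupported label returns None and the scan continues.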

Modified: index
===================================================================

--- index	2008-02-28 08:05:49 UTC (rev 1271)
+++ index	2008-02-28 21:28:58 UTC (rev 1272)
@@ -38095,263 +38095,252 @@
            overall "two step" algorithm.
 
          <li>
-          <p>Examine the attribute's name:</p>
+          <p>If the attribute's name is neither "<code
+           title="">charset</code>" nor "<code title="">content</code>", then
+           return to step 2 in these inner steps.
 
-          <dl class=switch>
-           <dt>If it is 'charset'
+         <li>
+          <p>If the attribute's name is "<code title="">charset</code>", let
+           <var title="">charset</var> be the attribute's value, interpreted
+           as a character encoding.
 
-           <dd>
-            <p>If the attribute's value is a supported character encoding,
-             then return the given encoding, with <a href="#confidence"
-             title=concept-encoding-confidence>confidence</a>
-             <i>tentative</i>, and abort all these steps. Otherwise, do
-             nothing with this attribute, and continue looking for other
-             attributes.
+         <li>
+          <p>Otherwise, the attribute's name is "<code
+           title="">content</code>": apply the <a
+           href="#algorithm3">algorithm for extracting an encoding from a
+           Content-Type</a>, giving the attribute's value as the string to
+           parse. If an encoding is returned, let <var title="">charset</var>
+           be that encoding. Otherwise, return to step 2 in these inner
+           steps.
+        </ol>
 
-           <dt>If it is 'content'
+        <p>If <var title="">charset</var> is a UTF-16 encoding, change it to
+         UTF-8.</p>
 
-           <dd>
-            <p>The attribute's value is now parsed.</p>
+        <p>If <var title="">charset</var> is a supported character encoding,
+         then return the given encoding, with <a href="#confidence"
+         title=concept-encoding-confidence>confidence</a> <i>tentative</i>,
+         and abort all these steps.</p>
+      </dl>
 
-            <ol>
-             <li>Apply the <a href="#algorithm3">algorithm for extracting an
-              encoding from a Content-Type</a>, giving the attribute's value
-              as the string to parse.
+     <li>
+      <p>Otherwise, return to step 2 in these inner steps.
+    </ol>
 
-             <li>If an encoding was returned, and it is the name of a
-              supported character encoding, then return that encoding, with
-              the <a href="#confidence"
-              title=concept-encoding-confidence>confidence</a>
-              <i>tentative</i>, and abort all these steps.
+    <dl>
+     <dt>A sequence of bytes starting with a 0x3C byte (ASCII '<'),
+      optionally a 0x2F byte (ASCII '/'), and finally a byte in the range
+      0x41-0x5A or 0x61-0x7A (an ASCII letter)
 
-             <li>Otherwise, skip this 'content' attribute and continue on
-              with any other attributes.
-            </ol>
+     <dd>
+      <ol>
+       <li>
+        <p>Advance the <var title="">position</var> pointer so that it points
+         at the next 0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0B (ASCII VT), 0x0C
+         (ASCII FF), 0x0D (ASCII CR), 0x20 (ASCII space), or 0x3E (ASCII '>')
+         byte.
 
-           <dd>
+       <li>
+        <p>Repeatedly <a href="#get-an"
+         title=concept-get-attributes-when-sniffing>get an attribute</a>
+         until no further attributes can be found, then jump to the second
+         step in the overall "two step" algorithm.
+      </ol>
 
-           <dt>Any other name
+     <dt>A sequence of bytes starting with: 0x3C 0x21 (ASCII '<!')
 
-           <dd>
-            <p>Do nothing with that attribute.
-          </dl>
+     <dt>A sequence of bytes starting with: 0x3C 0x2F (ASCII '</')
 
-         <li>
-          <p>Return to step 1 in these inner steps.
-        </ol>
+     <dt>A sequence of bytes starting with: 0x3C 0x3F (ASCII '<?')
 
-       <dt>A sequence of bytes starting with a 0x3C byte (ASCII '<'),
-        optionally a 0x2F byte (ASCII '/'), and finally a byte in the range
-        0x41-0x5A or 0x61-0x7A (an ASCII letter)
+     <dd>
+      <p>Advance the <var title="">position</var> pointer so that it points
+       at the first 0x3E byte (ASCII '>') that comes after the 0x3C byte that
+       was found.</p>
 
-       <dd>
-        <ol>
-         <li>
-          <p>Advance the <var title="">position</var> pointer so that it
-           points at the next 0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0B (ASCII
-           VT), 0x0C (ASCII FF), 0x0D (ASCII CR), 0x20 (ASCII space), or 0x3E
-           (ASCII '>') byte.
+     <dt>Any other byte
 
-         <li>
-          <p>Repeatedly <a href="#get-an"
-           title=concept-get-attributes-when-sniffing>get an attribute</a>
-           until no further attributes can be found, then jump to the second
-           step in the overall "two step" algorithm.
-        </ol>
+     <dd>
+      <p>Do nothing with that byte.</p>
+    </dl>
 
-       <dt>A sequence of bytes starting with: 0x3C 0x21 (ASCII '<!')
+   <li>Move <var title="">position</var> so it points at the next byte in the
+    input stream, and return to the first step of this "two step" algorithm.
+  </ol>
 
-       <dt>A sequence of bytes starting with: 0x3C 0x2F (ASCII '</')
+  <p>When the above "two step" algorithm says to <dfn id=get-an
+   title=concept-get-attributes-when-sniffing>get an attribute</dfn>, it
+   means doing this:
 
-       <dt>A sequence of bytes starting with: 0x3C 0x3F (ASCII '<?')
+  <ol>
+   <li>
+    <p>If the byte at <var title="">position</var> is one of 0x09 (ASCII
+     TAB), 0x0A (ASCII LF), 0x0B (ASCII VT), 0x0C (ASCII FF), 0x0D (ASCII
+     CR), 0x20 (ASCII space), or 0x2F (ASCII '/') then advance <var
+     title="">position</var> to the next byte and redo this substep.
 
-       <dd>
-        <p>Advance the <var title="">position</var> pointer so that it points
-         at the first 0x3E byte (ASCII '>') that comes after the 0x3C byte
-         that was found.</p>
+   <li>
+    <p>If the byte at <var title="">position</var> is 0x3E (ASCII '>'), then
+     abort the "get an attribute" algorithm. There isn't one.
 
-       <dt>Any other byte
+   <li>
+    <p>Otherwise, the byte at <var title="">position</var> is the start of
+     the attribute name. Let <var title="">attribute name</var> and <var
+     title="">attribute value</var> be the empty string.
 
-       <dd>
-        <p>Do nothing with that byte.</p>
-      </dl>
+   <li>
+    <p><em>Attribute name</em>: Process the byte at <var
+     title="">position</var> as follows:</p>
 
-     <li>Move <var title="">position</var> so it points at the next byte in
-      the input stream, and return to the first step of this "two step"
-      algorithm.
-    </ol>
+    <dl class=switch>
+     <dt>If it is 0x3D (ASCII '='), and the <var title="">attribute
+      name</var> is longer than the empty string
 
-    <p>When the above "two step" algorithm says to <dfn id=get-an
-     title=concept-get-attributes-when-sniffing>get an attribute</dfn>, it
-     means doing this:</p>
+     <dd>Advance <var title="">position</var> to the next byte and jump to
+      the step below labelled <em>value</em>.
 
-    <ol>
-     <li>
-      <p>If the byte at <var title="">position</var> is one of 0x09 (ASCII
-       TAB), 0x0A (ASCII LF), 0x0B (ASCII VT), 0x0C (ASCII FF), 0x0D (ASCII
-       CR), 0x20 (ASCII space), or 0x2F (ASCII '/') then advance <var
-       title="">position</var> to the next byte and redo this substep.
+     <dt>If it is 0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0B (ASCII VT), 0x0C
+      (ASCII FF), 0x0D (ASCII CR), or 0x20 (ASCII space)
 
-     <li>
-      <p>If the byte at <var title="">position</var> is 0x3E (ASCII '>'),
-       then stop looking for an attribute. There isn't one.
+     <dd>Jump to the step below labelled <em>spaces</em>.
 
-     <li>
-      <p>Otherwise, the byte at <var title="">position</var> is the start of
-       the attribute name. Let <var title="">attribute name</var> and <var
-       title="">attribute value</var> be the empty string.
+     <dt>If it is 0x2F (ASCII '/') or 0x3E (ASCII '>')
 
-     <li>
-      <p><em>Attribute name</em>: Process the byte at <var
-       title="">position</var> as follows:</p>
+     <dd>Abort the "get an attribute" algorithm. The attribute's name is the
+      value of <var title="">attribute name</var>, its value is the empty
+      string.
 
-      <dl class=switch>
-       <dt>If it is 0x3D (ASCII '='), and the <var title="">attribute
-        name</var> is longer than the empty string
+     <dt>If it is in the range 0x41 (ASCII 'A') to 0x5A (ASCII 'Z')
 
-       <dd>Advance <var title="">position</var> to the next byte and jump to
-        the step below labelled <em>value</em>.
+     <dd>Append the Unicode character with codepoint <span><var
+      title="">b</var>+0x20</span> to <var title="">attribute name</var>
+      (where <var title="">b</var> is the value of the byte at <var
+      title="">position</var>).
 
-       <dt>If it is 0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0B (ASCII VT), 0x0C
-        (ASCII FF), 0x0D (ASCII CR), or 0x20 (ASCII space)
+     <dt>Anything else
 
-       <dd>Jump to the step below labelled <em>spaces</em>.
+     <dd>Append the Unicode character with the same codepoint as the value of
+      the byte at <var title="">position</var>) to <var title="">attribute
+      name</var>. (It doesn't actually matter how bytes outside the ASCII
+      range are handled here, since only ASCII characters can contribute to
+      the detection of a character encoding.)
+    </dl>
 
-       <dt>If it is 0x2F (ASCII '/') or 0x3E (ASCII '>')
+   <li>
+    <p>Advance <var title="">position</var> to the next byte and return to
+     the previous step.
 
-       <dd>Stop looking for an attribute. The attribute's name is the value
-        of <var title="">attribute name</var>, its value is the empty string.
+   <li>
+    <p><em>Spaces.</em> If the byte at <var title="">position</var> is one of
+     0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0B (ASCII VT), 0x0C (ASCII FF),
+     0x0D (ASCII CR), or 0x20 (ASCII space) then advance <var
+     title="">position</var> to the next byte, then, repeat this step.
 
-       <dt>If it is in the range 0x41 (ASCII 'A') to 0x5A (ASCII 'Z')
+   <li>
+    <p>If the byte at <var title="">position</var> is <em>not</em> 0x3D
+     (ASCII '='), abort the "get an attribute" algorithm. Move <var
+     title="">position</var> back to the previous byte. The attribute's name
+     is the value of <var title="">attribute name</var>, its value is the
+     empty string.
 
-       <dd>Append the Unicode character with codepoint <span><var
-        title="">b</var>+0x20</span> to <var title="">attribute name</var>
-        (where <var title="">b</var> is the value of the byte at <var
-        title="">position</var>).
+   <li>
+    <p>Advance <var title="">position</var> past the 0x3D (ASCII '=') byte.
 
-       <dt>Anything else
+   <li>
+    <p><em>Value.</em> If the byte at <var title="">position</var> is one of
+     0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0B (ASCII VT), 0x0C (ASCII FF),
+     0x0D (ASCII CR), or 0x20 (ASCII space) then advance <var
+     title="">position</var> to the next byte, then, repeat this step.
 
-       <dd>Append the Unicode character with the same codepoint as the value
-        of the byte at <var title="">position</var>) to <var
-        title="">attribute name</var>. (It doesn't actually matter how bytes
-        outside the ASCII range are handled here, since only ASCII characters
-        can contribute to the detection of a character encoding.)
-      </dl>
+   <li>
+    <p>Process the byte at <var title="">position</var> as follows:</p>
 
-     <li>
-      <p>Advance <var title="">position</var> to the next byte and return to
-       the previous step.
+    <dl class=switch>
+     <dt>If it is 0x22 (ASCII '"') or 0x27 ("'")
 
-     <li>
-      <p><em>Spaces.</em> If the byte at <var title="">position</var> is one
-       of 0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0B (ASCII VT), 0x0C (ASCII
-       FF), 0x0D (ASCII CR), or 0x20 (ASCII space) then advance <var
-       title="">position</var> to the next byte, then, repeat this step.
+     <dd>
+      <ol>
+       <li>Let <var title="">b</var> be the value of the byte at <var
+        title="">position</var>.
 
-     <li>
-      <p>If the byte at <var title="">position</var> is <em>not</em> 0x3D
-       (ASCII '='), stop looking for an attribute. Move <var
-       title="">position</var> back to the previous byte. The attribute's
-       name is the value of <var title="">attribute name</var>, its value is
-       the empty string.
+       <li>Advance <var title="">position</var> to the next byte.
 
-     <li>
-      <p>Advance <var title="">position</var> past the 0x3D (ASCII '=') byte.
+       <li>If the value of the byte at <var title="">position</var> is the
+        value of <var title="">b</var>, then abort the "get an attribute"
+        algorithm. The attribute's name is the value of <var
+        title="">attribute name</var>, and its value is the value of <var
+        title="">attribute value</var>.
 
-     <li>
-      <p><em>Value.</em> If the byte at <var title="">position</var> is one
-       of 0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0B (ASCII VT), 0x0C (ASCII
-       FF), 0x0D (ASCII CR), or 0x20 (ASCII space) then advance <var
-       title="">position</var> to the next byte, then, repeat this step.
+       <li>Otherwise, if the value of the byte at <var
+        title="">position</var> is in the range 0x41 (ASCII 'A') to 0x5A
+        (ASCII 'Z'), then append a Unicode character to <var
+        title="">attribute value</var> whose codepoint is 0x20 more than the
+        value of the byte at <var title="">position</var>.
 
-     <li>
-      <p>Process the byte at <var title="">position</var> as follows:</p>
+       <li>Otherwise, append a Unicode character to <var title="">attribute
+        value</var> whose codepoint is the same as the value of the byte at
+        <var title="">position</var>.
 
-      <dl class=switch>
-       <dt>If it is 0x22 (ASCII '"') or 0x27 ("'")
+       <li>Return to the second step in these substeps.
+      </ol>
 
-       <dd>
-        <ol>
-         <li>Let <var title="">b</var> be the value of the byte at <var
-          title="">position</var>.
+     <dt>If it is 0x3E (ASCII '>')
 
-         <li>Advance <var title="">position</var> to the next byte.
+     <dd>Abort the "get an attribute" algorithm. The attribute's name is the
+      value of <var title="">attribute name</var>, its value is the empty
+      string.
 
-         <li>If the value of the byte at <var title="">position</var> is the
-          value of <var title="">b</var>, then stop looking for an attribute.
-          The attribute's name is the value of <var title="">attribute
-          name</var>, and its value is the value of <var title="">attribute
-          value</var>.
+     <dt>If it is in the range 0x41 (ASCII 'A') to 0x5A (ASCII 'Z')
 
-         <li>Otherwise, if the value of the byte at <var
-          title="">position</var> is in the range 0x41 (ASCII 'A') to 0x5A
-          (ASCII 'Z'), then append a Unicode character to <var
-          title="">attribute value</var> whose codepoint is 0x20 more than
-          the value of the byte at <var title="">position</var>.
+     <dd>Append the Unicode character with codepoint <span><var
+      title="">b</var>+0x20</span> to <var title="">attribute value</var>
+      (where <var title="">b</var> is the value of the byte at <var
+      title="">position</var>). Advance <var title="">position</var> to the
+      next byte.
 
-         <li>Otherwise, append a Unicode character to <var title="">attribute
-          value</var> whose codepoint is the same as the value of the byte at
-          <var title="">position</var>.
+     <dt>Anything else
 
-         <li>Return to the second step in these substeps.
-        </ol>
+     <dd>Append the Unicode character with the same codepoint as the value of
+      the byte at <var title="">position</var>) to <var title="">attribute
+      value</var>. Advance <var title="">position</var> to the next byte.
+    </dl>
 
-       <dt>If it is 0x3E (ASCII '>')
+   <li>
+    <p>Process the byte at <var title="">position</var> as follows:</p>
 
-       <dd>Stop looking for an attribute. The attribute's name is the value
-        of <var title="">attribute name</var>, its value is the empty string.
+    <dl class=switch>
+     <dt>If it is 0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0B (ASCII VT), 0x0C
+      (ASCII FF), 0x0D (ASCII CR), 0x20 (ASCII space), or 0x3E (ASCII '>')
 
-       <dt>If it is in the range 0x41 (ASCII 'A') to 0x5A (ASCII 'Z')
+     <dd>Abort the "get an attribute" algorithm. The attribute's name is the
+      value of <var title="">attribute name</var> and its value is the value
+      of <var title="">attribute value</var>.
 
-       <dd>Append the Unicode character with codepoint <span><var
-        title="">b</var>+0x20</span> to <var title="">attribute value</var>
-        (where <var title="">b</var> is the value of the byte at <var
-        title="">position</var>). Advance <var title="">position</var> to the
-        next byte.
+     <dt>If it is in the range 0x41 (ASCII 'A') to 0x5A (ASCII 'Z')
 
-       <dt>Anything else
+     <dd>Append the Unicode character with codepoint <span><var
+      title="">b</var>+0x20</span> to <var title="">attribute value</var>
+      (where <var title="">b</var> is the value of the byte at <var
+      title="">position</var>).
 
-       <dd>Append the Unicode character with the same codepoint as the value
-        of the byte at <var title="">position</var>) to <var
-        title="">attribute value</var>. Advance <var title="">position</var>
-        to the next byte.
-      </dl>
+     <dt>Anything else
 
-     <li>
-      <p>Process the byte at <var title="">position</var> as follows:</p>
+     <dd>Append the Unicode character with the same codepoint as the value of
+      the byte at <var title="">position</var>) to <var title="">attribute
+      value</var>.
+    </dl>
 
-      <dl class=switch>
-       <dt>If it is 0x09 (ASCII TAB), 0x0A (ASCII LF), 0x0B (ASCII VT), 0x0C
-        (ASCII FF), 0x0D (ASCII CR), 0x20 (ASCII space), or 0x3E (ASCII '>')
+   <li>
+    <p>Advance <var title="">position</var> to the next byte and return to
+     the previous step.
+  </ol>
 
-       <dd>Stop looking for an attribute. The attribute's name is the value
-        of <var title="">attribute name</var> and its value is the value of
-        <var title="">attribute value</var>.
+  <p>For the sake of interoperability, user agents should not use a pre-scan
+   algorithm that returns different results than the one described above.
+   (But, if you do, please at least let us know, so that we can improve this
+   algorithm and benefit everyone...)
 
-       <dt>If it is in the range 0x41 (ASCII 'A') to 0x5A (ASCII 'Z')
-
-       <dd>Append the Unicode character with codepoint <span><var
-        title="">b</var>+0x20</span> to <var title="">attribute value</var>
-        (where <var title="">b</var> is the value of the byte at <var
-        title="">position</var>).
-
-       <dt>Anything else
-
-       <dd>Append the Unicode character with the same codepoint as the value
-        of the byte at <var title="">position</var>) to <var
-        title="">attribute value</var>.
-      </dl>
-
-     <li>
-      <p>Advance <var title="">position</var> to the next byte and return to
-       the previous step.
-    </ol>
-
-    <p>For the sake of interoperability, user agents should not use a
-     pre-scan algorithm that returns different results than the one described
-     above. (But, if you do, please at least let us know, so that we can
-     improve this algorithm and benefit everyone...)</p>
-
+  <ul>
    <li>
     <p>If the user agent has information on the likely encoding for this
      page, e.g. based on the encoding of the page when it was last visited,
@@ -38381,7 +38370,7 @@
      title="">UTF-8</code> encoding is recommended instead. Since these
      encodings can in many cases be distinguished by inspection, a user agent
      may heuristically decide which to use as a default.
-  </ol>
+  </ul>
 
   <h5 id=character0><span class=secno>8.2.2.2. </span>Character encoding
    requirements</h5>

Modified: source
===================================================================
--- source	2008-02-28 08:05:49 UTC (rev 1271)
+++ source	2008-02-28 21:28:58 UTC (rev 1272)
@@ -35619,54 +35619,34 @@
          sniffed, then skip this inner set of steps, and jump to the
          second step in the overall "two step" algorithm.</p></li>
 
-         <li><p>Examine the attribute's name:</p>
+         <li><p>If the attribute's name is neither "<code
+         title="">charset</code>" nor "<code title="">content</code>",
+         then return to step 2 in these inner steps.</p></li>
 
-          <dl class="switch">
+         <li><p>If the attribute's name is "<code
+         title="">charset</code>", let <var title="">charset</var> be
+         the attribute's value, interpreted as a character
+         encoding.</p></li>
 
-           <dt>If it is 'charset'</dt>
+         <li><p>Otherwise, the attribute's name is "<code
+         title="">content</code>": apply the <span>algorithm for
+         extracting an encoding from a Content-Type</span>, giving the
+         attribute's value as the string to parse. If an encoding is
+         returned, let <var title="">charset</var> be that
+         encoding. Otherwise, return to step 2 in these inner
+         steps.</li>
 
-           <dd><p>If the attribute's value is a supported character
-           encoding, then return the given encoding, with <span
-           title="concept-encoding-confidence">confidence</span>
-           <i>tentative</i>, and abort all these steps. Otherwise, do
-           nothing with this attribute, and continue looking for other
-           attributes.</p></dd>
+         <p>If <var title="">charset</var> is a UTF-16 encoding,
+         change it to UTF-8.</p>
 
-           <dt>If it is 'content'</dt>
+         <p>If <var title="">charset</var> is a supported character
+         encoding, then return the given encoding, with <span
+         title="concept-encoding-confidence">confidence</span>
+         <i>tentative</i>, and abort all these steps.</p>
 
-           <dd>
+         <li><p>Otherwise, return to step 2 in these inner
+         steps.</p></li>
 
-            <p>The attribute's value is now parsed.</p>
-
-            <ol>
-
-             <li>Apply the <span>algorithm for extracting an encoding
-             from a Content-Type</span>, giving the attribute's value
-             as the string to parse.</li>
-
-             <li>If an encoding was returned, and it is the name of a
-             supported character encoding, then return that encoding,
-             with the <span
-             title="concept-encoding-confidence">confidence</span>
-             <i>tentative</i>, and abort all these steps.</li>
-
-             <li>Otherwise, skip this 'content' attribute and continue
-             on with any other attributes.</li>
-
-            </ol>
-
-           <dd>
-
-           <dt>Any other name</dt>
-
-           <dd><p>Do nothing with that attribute.</p></dd>
-
-          </dl>
-
-         </li>
-
-         <li><p>Return to step 1 in these inner steps.</p></li>
-
         </ol>
 
        </dd>
@@ -35732,7 +35712,7 @@
      this substep.</p></li>
 
      <li><p>If the byte at <var title="">position</var> is 0x3E (ASCII
-     '>'), then stop looking for an attribute. There isn't
+     '>'), then abort the "get an attribute" algorithm. There isn't
      one.</p></li>
 
      <li><p>Otherwise, the byte at <var title="">position</var> is the
@@ -35759,9 +35739,9 @@
 
        <dt>If it is 0x2F (ASCII '/') or 0x3E (ASCII '>')</dt>
 
-       <dd>Stop looking for an attribute. The attribute's name is the
-       value of <var title="">attribute name</var>, its value is the
-       empty string.</dd>
+       <dd>Abort the "get an attribute" algorithm. The attribute's
+       name is the value of <var title="">attribute name</var>, its
+       value is the empty string.</dd>
 
        <dt>If it is in the range 0x41 (ASCII 'A') to 0x5A (ASCII
        'Z')</dt>
@@ -35794,8 +35774,8 @@
      next byte, then, repeat this step.</p></li>
 
      <li><p>If the byte at <var title="">position</var> is
-     <em>not</em> 0x3D (ASCII '='), stop looking for an
-     attribute. Move <var title="">position</var> back to the previous
+     <em>not</em> 0x3D (ASCII '='), abort the "get an attribute"
+     algorithm. Move <var title="">position</var> back to the previous
      byte. The attribute's name is the value of <var
      title="">attribute name</var>, its value is the empty
      string.</p></li>
@@ -35827,10 +35807,10 @@
          byte.</li>
 
          <li>If the value of the byte at <var title="">position</var>
-         is the value of <var title="">b</var>, then stop looking for
-         an attribute. The attribute's name is the value of <var
-         title="">attribute name</var>, and its value is the value of
-         <var title="">attribute value</var>.</li>
+         is the value of <var title="">b</var>, then abort the "get an
+         attribute" algorithm. The attribute's name is the value of
+         <var title="">attribute name</var>, and its value is the
+         value of <var title="">attribute value</var>.</li>
 
          <li>Otherwise, if the value of the byte at <var
          title="">position</var> is in the range 0x41 (ASCII 'A') to
@@ -35851,9 +35831,9 @@
 
        <dt>If it is 0x3E (ASCII '>')</dt>
 
-       <dd>Stop looking for an attribute. The attribute's name is the
-       value of <var title="">attribute name</var>, its value is the
-       empty string.</dd>
+       <dd>Abort the "get an attribute" algorithm. The attribute's
+       name is the value of <var title="">attribute name</var>, its
+       value is the empty string.</dd>
 
 
        <dt>If it is in the range 0x41 (ASCII 'A') to 0x5A (ASCII
@@ -35885,9 +35865,9 @@
        VT), 0x0C (ASCII FF), 0x0D (ASCII CR), 0x20 (ASCII space), or
        0x3E (ASCII '>')</dt>
 
-       <dd>Stop looking for an attribute. The attribute's name is the
-       value of <var title="">attribute name</var> and its value is the
-       value of <var title="">attribute value</var>.</dd>
+       <dd>Abort the "get an attribute" algorithm. The attribute's
+       name is the value of <var title="">attribute name</var> and its
+       value is the value of <var title="">attribute value</var>.</dd>
 
        <dt>If it is in the range 0x41 (ASCII 'A') to 0x5A (ASCII
        'Z')</dt>