[html5] r1823 - [] (0) Make backslashes turn into forward slashes when parsing URLs. Sigh.

whatwg at whatwg.org
Fri Jun 27 16:03:53 PDT 2008


Author: ianh
Date: 2008-06-27 16:03:53 -0700 (Fri, 27 Jun 2008)
New Revision: 1823

Modified:
   index
   source
Log:
[] (0) Make backslashes turn into forward slashes when parsing URLs. Sigh.

Modified: index
===================================================================
--- index	2008-06-27 21:57:53 UTC (rev 1822)
+++ index	2008-06-27 23:03:53 UTC (rev 1823)
@@ -2722,107 +2722,120 @@
   <h4 id=parsing0><span class=secno>2.3.2 </span>Parsing URLs</h4>
 
   <p>To <dfn id=parse0>parse a URL</dfn> <var title="">url</var> into its
-   component parts, the user agent must first strip leading and trailing <a
-   href="#space" title="space character">space characters</a> from <var
-   title="">url</var>, and then must parse <var title="">url</var> in the
-   manner defined by RFC 3986, with the following exceptions:
+   component parts, the user agent must use the following steps:
 
-  <ul>
-   <li>Add all characters with codepoints less than or equal to U+0020 or
-    greater than or equal to U+007F to the <unreserved> production.
+  <ol>
+   <li>
+    <p>Strip leading and trailing <a href="#space" title="space
+     character">space characters</a> from <var title="">url</var>.
 
-   <li>Add the characters U+0022, U+003C, U+003E, U+005B .. U+005E, U+0060,
-    and U+007B .. U+007D to the <unreserved> production. <!--
-     0022 QUOTATION MARK
-     003C LESS-THAN SIGN
-     003E GREATER-THAN SIGN
-     005B LEFT SQUARE BRACKET
-     005C REVERSE SOLIDUS
-     005D RIGHT SQUARE BRACKET
-     005E CIRCUMFLEX ACCENT
-     0060 GRAVE ACCENT
-     007B LEFT CURLY BRACKET
-     007C VERTICAL LINE
-     007D RIGHT CURLY BRACKET
-    -->
-    
+   <li>
+    <p>Replace all U+005C REVERSE SOLIDUS (\) characters in <var
+     title="">url</var> with U+002F SOLIDUS (/) characters.
 
-   <li>Add a single U+0025 PERCENT SIGN character as a second alternative way
-    of matching the <pct-encoded> production, except when the
-    <pct-encoded> is used in the <reg-name> production.
+   <li>
+    <p>Parse <var title="">url</var> in the manner defined by RFC 3986, with
+     the following exceptions:</p>
 
-   <li>Add the U+0023 NUMBER SIGN character to the characters allowed in the
-    <fragment> production.</li>
-   <!-- some browsers also have other differences, e.g. Mozilla
-   seems to treat ";" as if it was not in sub-delims, if the scheem
-   is "ftp". -->
-  </ul>
+    <ul>
+     <li>Add all characters with codepoints less than or equal to U+0020 or
+      greater than or equal to U+007F to the <unreserved> production.
 
-  <p>If <var title="">url</var> doesn't match the <URI-reference>
-   production, even after the above changes are made to the ABNF definitions,
-   then parsing the URL fails with an error. <a
-   href="#refsRFC3986">[RFC3986]</a>
+     <li>Add the characters U+0022, U+003C, U+003E, U+005B .. U+005E, U+0060,
+      and U+007B .. U+007D to the <unreserved> production. <!--
+       0022 QUOTATION MARK
+       003C LESS-THAN SIGN
+       003E GREATER-THAN SIGN
+       005B LEFT SQUARE BRACKET
+       005C REVERSE SOLIDUS
+       005D RIGHT SQUARE BRACKET
+       005E CIRCUMFLEX ACCENT
+       0060 GRAVE ACCENT
+       007B LEFT CURLY BRACKET
+       007C VERTICAL LINE
+       007D RIGHT CURLY BRACKET
+      -->
+      
 
-  <p>If parsing <var title="">url</var> was successful, then the components
-   of the URL are substrings of <var title="">url</var> defined as follows:
+     <li>Add a single U+0025 PERCENT SIGN character as a second alternative
+      way of matching the <pct-encoded> production, except when the
+      <pct-encoded> is used in the <reg-name> production.
 
-  <dl>
-   <dt><dfn id=ltschemegt title=url-scheme><scheme></dfn>
+     <li>Add the U+0023 NUMBER SIGN character to the characters allowed in
+      the <fragment> production.</li>
+     <!-- some browsers also have other differences, e.g. Mozilla
+     seems to treat ";" as if it was not in sub-delims, if the scheem
+     is "ftp". -->
+    </ul>
 
-   <dd>
-    <p>The substring matched by the <scheme> production, if any.
+   <li>
+    <p>If <var title="">url</var> doesn't match the <URI-reference>
+     production, even after the above changes are made to the ABNF
+     definitions, then parsing the URL fails with an error. <a
+     href="#refsRFC3986">[RFC3986]</a></p>
 
-   <dt><dfn id=lthostgt title=url-host><host></dfn>
+    <p>Otherwise, parsing <var title="">url</var> was successful; the
+     components of the URL are substrings of <var title="">url</var> defined
+     as follows:</p>
 
-   <dd>
-    <p>The substring matched by the <host> production, if any.
+    <dl>
+     <dt><dfn id=ltschemegt title=url-scheme><scheme></dfn>
 
-   <dt><dfn id=ltportgt title=url-port><port></dfn>
+     <dd>
+      <p>The substring matched by the <scheme> production, if any.
 
-   <dd>
-    <p>The substring matched by the <port> production, if any.
+     <dt><dfn id=lthostgt title=url-host><host></dfn>
 
-   <dt><dfn id=lthostportgt title=url-hostport><hostport></dfn>
+     <dd>
+      <p>The substring matched by the <host> production, if any.
 
-   <dd>
-    <p>If there is a <scheme> component and a <port> component
-     and the port given by the <port> component is different than the
-     default port defined for the protocol given by the <scheme>
-     component, then <hostport> is the substring that starts with the
-     substring matched by the <host> production and ends with the
-     substring matched by the <port> production, and includes the colon
-     in between the two. Otherwise, it is the same as the <host>
-     component.</p>
+     <dt><dfn id=ltportgt title=url-port><port></dfn>
 
-   <dt><dfn id=ltpathgt title=url-path><path></dfn>
+     <dd>
+      <p>The substring matched by the <port> production, if any.
 
-   <dd>
-    <p>The substring matched by one of the following productions, if one of
-     them was matched:</p>
+     <dt><dfn id=lthostportgt title=url-hostport><hostport></dfn>
 
-    <ul class=brief>
-     <li><path-abempty>
+     <dd>
+      <p>If there is a <scheme> component and a <port> component
+       and the port given by the <port> component is different than the
+       default port defined for the protocol given by the <scheme>
+       component, then <hostport> is the substring that starts with the
+       substring matched by the <host> production and ends with the
+       substring matched by the <port> production, and includes the
+       colon in between the two. Otherwise, it is the same as the
+       <host> component.</p>
 
-     <li><path-absolute>
+     <dt><dfn id=ltpathgt title=url-path><path></dfn>
 
-     <li><path-noscheme>
+     <dd>
+      <p>The substring matched by one of the following productions, if one of
+       them was matched:</p>
 
-     <li><path-rootless>
+      <ul class=brief>
+       <li><path-abempty>
 
-     <li><path-empty>
-    </ul>
+       <li><path-absolute>
 
-   <dt><dfn id=ltquerygt title=url-query><query></dfn>
+       <li><path-noscheme>
 
-   <dd>
-    <p>The substring matched by the <query> production, if any.
+       <li><path-rootless>
 
-   <dt><dfn id=ltfragmentgt title=url-fragment><fragment></dfn>
+       <li><path-empty>
+      </ul>
 
-   <dd>
-    <p>The substring matched by the <fragment> production, if any.
-  </dl>
+     <dt><dfn id=ltquerygt title=url-query><query></dfn>
 
+     <dd>
+      <p>The substring matched by the <query> production, if any.
+
+     <dt><dfn id=ltfragmentgt title=url-fragment><fragment></dfn>
+
+     <dd>
+      <p>The substring matched by the <fragment> production, if any.
+    </dl>
+  </ol>
+
   <h4 id=resolving><span class=secno>2.3.3 </span>Resolving URLs</h4>
 
   <p>Relative URLs are resolved relative to a base URL. The <dfn

Modified: source
===================================================================
--- source	2008-06-27 21:57:53 UTC (rev 1822)
+++ source	2008-06-27 23:03:53 UTC (rev 1823)
@@ -967,118 +967,136 @@
   <h4>Parsing URLs</h4>
 
   <p>To <dfn>parse a URL</dfn> <var title="">url</var> into its
-  component parts, the user agent must first strip leading and
-  trailing <span title="space character">space characters</span> from
-  <var title="">url</var>, and then must parse <var title="">url</var>
-  in the manner defined by RFC 3986, with the following
-  exceptions:</p>
+  component parts, the user agent must use the following steps:</p>
 
-  <ul>
+  <ol>
 
-   <li>Add all characters with codepoints less than or equal to
-   U+0020 or greater than or equal to U+007F to the
-   <unreserved> production.</li>
+   <li><p>Strip leading and trailing <span title="space
+   character">space characters</span> from <var
+   title="">url</var>.</p></li>
 
-   <li>Add the characters U+0022, U+003C, U+003E, U+005B .. U+005E,
-   U+0060, and U+007B .. U+007D to the <unreserved>
-   production.
-    <!--
-     0022 QUOTATION MARK
-     003C LESS-THAN SIGN
-     003E GREATER-THAN SIGN
-     005B LEFT SQUARE BRACKET
-     005C REVERSE SOLIDUS
-     005D RIGHT SQUARE BRACKET
-     005E CIRCUMFLEX ACCENT
-     0060 GRAVE ACCENT
-     007B LEFT CURLY BRACKET
-     007C VERTICAL LINE
-     007D RIGHT CURLY BRACKET
-    -->
+   <li><p>Replace all U+005C REVERSE SOLIDUS (\) characters in <var
+   title="">url</var> with U+002F SOLIDUS (/) characters.</p></li>
+
+   <li>
+
+    <p>Parse <var title="">url</var> in the manner defined by RFC
+    3986, with the following exceptions:</p>
+
+    <ul>
+
+     <li>Add all characters with codepoints less than or equal to
+     U+0020 or greater than or equal to U+007F to the
+     <unreserved> production.</li>
+
+     <li>Add the characters U+0022, U+003C, U+003E, U+005B .. U+005E,
+     U+0060, and U+007B .. U+007D to the <unreserved>
+     production.
+      <!--
+       0022 QUOTATION MARK
+       003C LESS-THAN SIGN
+       003E GREATER-THAN SIGN
+       005B LEFT SQUARE BRACKET
+       005C REVERSE SOLIDUS
+       005D RIGHT SQUARE BRACKET
+       005E CIRCUMFLEX ACCENT
+       0060 GRAVE ACCENT
+       007B LEFT CURLY BRACKET
+       007C VERTICAL LINE
+       007D RIGHT CURLY BRACKET
+      -->
+     </li>
+
+     <li>Add a single U+0025 PERCENT SIGN character as a second
+     alternative way of matching the <pct-encoded> production,
+     except when the <pct-encoded> is used in the
+     <reg-name> production.</li>
+
+     <li>Add the U+0023 NUMBER SIGN character to the characters
+     allowed in the <fragment> production.</li>
+
+     <!-- some browsers also have other differences, e.g. Mozilla
+     seems to treat ";" as if it was not in sub-delims, if the scheem
+     is "ftp". -->
+
+    </ul>
+
    </li>
 
-   <li>Add a single U+0025 PERCENT SIGN character as a second
-   alternative way of matching the <pct-encoded> production,
-   except when the <pct-encoded> is used in the
-   <reg-name> production.</li>
+   <li>
 
-   <li>Add the U+0023 NUMBER SIGN character to the characters
-   allowed in the <fragment> production.</li>
+    <p>If <var title="">url</var> doesn't match the
+    <URI-reference> production, even after the above changes are
+    made to the ABNF definitions, then parsing the URL fails with an
+    error. <a href="#refsRFC3986">[RFC3986]</a></p>
 
-   <!-- some browsers also have other differences, e.g. Mozilla
-   seems to treat ";" as if it was not in sub-delims, if the scheem
-   is "ftp". -->
+    <p>Otherwise, parsing <var title="">url</var> was successful; the
+    components of the URL are substrings of <var title="">url</var>
+    defined as follows:</p>
 
-  </ul>
+    <dl>
 
-  <p>If <var title="">url</var> doesn't match the
-  <URI-reference> production, even after the above changes are
-  made to the ABNF definitions, then parsing the URL fails with an
-  error. <a href="#refsRFC3986">[RFC3986]</a></p>
+     <dt><dfn title="url-scheme"><scheme></dfn></dt>
 
-  <p>If parsing <var title="">url</var> was successful, then the
-  components of the URL are substrings of <var title="">url</var>
-  defined as follows:</p>
+     <dd><p>The substring matched by the <scheme> production, if any.</p></dd>
 
-  <dl>
 
-   <dt><dfn title="url-scheme"><scheme></dfn></dt>
+     <dt><dfn title="url-host"><host></dfn></dt>
 
-   <dd><p>The substring matched by the <scheme> production, if any.</p></dd>
+     <dd><p>The substring matched by the <host> production, if any.</p></dd>
 
 
-   <dt><dfn title="url-host"><host></dfn></dt>
+     <dt><dfn title="url-port"><port></dfn></dt>
 
-   <dd><p>The substring matched by the <host> production, if any.</p></dd>
+     <dd><p>The substring matched by the <port> production, if any.</p></dd>
 
 
-   <dt><dfn title="url-port"><port></dfn></dt>
+     <dt><dfn title="url-hostport"><hostport></dfn></dt>
 
-   <dd><p>The substring matched by the <port> production, if any.</p></dd>
+     <dd><p>If there is a <scheme> component and a <port>
+     component and the port given by the <port> component is
+     different than the default port defined for the protocol given by
+     the <scheme> component, then <hostport> is the
+     substring that starts with the substring matched by the
+     <host> production and ends with the substring matched by the
+     <port> production, and includes the colon in between the
+     two. Otherwise, it is the same as the <host> component.</p>
 
 
-   <dt><dfn title="url-hostport"><hostport></dfn></dt>
+     <dt><dfn title="url-path"><path></dfn></dt>
 
-   <dd><p>If there is a <scheme> component and a <port>
-   component and the port given by the <port> component is
-   different than the default port defined for the protocol given by
-   the <scheme> component, then <hostport> is the
-   substring that starts with the substring matched by the
-   <host> production and ends with the substring matched by the
-   <port> production, and includes the colon in between the
-   two. Otherwise, it is the same as the <host> component.</p>
+     <dd>
 
+      <p>The substring matched by one of the following productions, if
+      one of them was matched:</p>
 
-   <dt><dfn title="url-path"><path></dfn></dt>
+      <ul class="brief">
+       <li><path-abempty></li>
+       <li><path-absolute></li>
+       <li><path-noscheme></li>
+       <li><path-rootless></li>
+       <li><path-empty></li>
+      </ul>
 
-   <dd>
+     </dd>
 
-    <p>The substring matched by one of the following productions, if
-    one of them was matched:</p>
 
-    <ul class="brief">
-     <li><path-abempty></li>
-     <li><path-absolute></li>
-     <li><path-noscheme></li>
-     <li><path-rootless></li>
-     <li><path-empty></li>
-    </ul>
+     <dt><dfn title="url-query"><query></dfn></dt>
 
-   </dd>
+     <dd><p>The substring matched by the <query> production, if any.</p></dd>
 
 
-   <dt><dfn title="url-query"><query></dfn></dt>
+     <dt><dfn title="url-fragment"><fragment></dfn></dt>
 
-   <dd><p>The substring matched by the <query> production, if any.</p></dd>
+     <dd><p>The substring matched by the <fragment> production, if any.</p></dd>
 
+    </dl>
 
-   <dt><dfn title="url-fragment"><fragment></dfn></dt>
+   </li>
 
-   <dd><p>The substring matched by the <fragment> production, if any.</p></dd>
+  </ol>
 
-  </dl>
 
-
   <h4>Resolving URLs</h4>
 
   <p>Relative URLs are resolved relative to a base URL. The <dfn>base



