[html5] r1823 - [] (0) Make backslashes turn into forward slashes when parsing URLs. Sigh.
whatwg at whatwg.org
Fri Jun 27 16:03:53 PDT 2008
Author: ianh
Date: 2008-06-27 16:03:53 -0700 (Fri, 27 Jun 2008)
New Revision: 1823
Modified:
index
source
Log:
[] (0) Make backslashes turn into forward slashes when parsing URLs. Sigh.
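For readers skimming the diff below, the two preprocessing steps that this revision places ahead of the RFC 3986 parse (strip space characters, then turn backslashes into forward slashes) might be sketched roughly as follows. This is only an illustration, not code from the spec or from any browser. The function name preprocess_url and the constant SPACE_CHARACTERS are hypothetical, and the exact set of "space characters" is an assumption, since the diff links to that definition without listing it.

```python
# Assumed membership of the spec's "space characters" set; the diff does not
# enumerate it, so treat this constant as a placeholder.
SPACE_CHARACTERS = " \t\n\f\r"


def preprocess_url(url: str) -> str:
    """Apply steps 1 and 2 of the revised algorithm (hypothetical helper).

    Step 1: strip leading and trailing space characters.
    Step 2: replace every U+005C REVERSE SOLIDUS with U+002F SOLIDUS.
    The RFC 3986 parse with the listed exceptions (step 3) is not shown.
    """
    stripped = url.strip(SPACE_CHARACTERS)
    return stripped.replace("\\", "/")
```

Note that the replacement applies to the whole string, so backslashes in the query or fragment are converted as well, which is what the new step 2 says as written.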
Modified: index
===================================================================
--- index 2008-06-27 21:57:53 UTC (rev 1822)
+++ index 2008-06-27 23:03:53 UTC (rev 1823)
@@ -2722,107 +2722,120 @@
<h4 id=parsing0><span class=secno>2.3.2 </span>Parsing URLs</h4>
<p>To <dfn id=parse0>parse a URL</dfn> <var title="">url</var> into its
- component parts, the user agent must first strip leading and trailing <a
- href="#space" title="space character">space characters</a> from <var
- title="">url</var>, and then must parse <var title="">url</var> in the
- manner defined by RFC 3986, with the following exceptions:
+ component parts, the user agent must use the following steps:
- <ul>
- <li>Add all characters with codepoints less than or equal to U+0020 or
- greater than or equal to U+007F to the <unreserved> production.
+ <ol>
+ <li>
+ <p>Strip leading and trailing <a href="#space" title="space
+ character">space characters</a> from <var title="">url</var>.
- <li>Add the characters U+0022, U+003C, U+003E, U+005B .. U+005E, U+0060,
- and U+007B .. U+007D to the <unreserved> production. <!--
- 0022 QUOTATION MARK
- 003C LESS-THAN SIGN
- 003E GREATER-THAN SIGN
- 005B LEFT SQUARE BRACKET
- 005C REVERSE SOLIDUS
- 005D RIGHT SQUARE BRACKET
- 005E CIRCUMFLEX ACCENT
- 0060 GRAVE ACCENT
- 007B LEFT CURLY BRACKET
- 007C VERTICAL LINE
- 007D RIGHT CURLY BRACKET
- -->
-
+ <li>
+ <p>Replace all U+005C REVERSE SOLIDUS (\) characters in <var
+ title="">url</var> with U+002F SOLIDUS (/) characters.
- <li>Add a single U+0025 PERCENT SIGN character as a second alternative way
- of matching the <pct-encoded> production, except when the
- <pct-encoded> is used in the <reg-name> production.
+ <li>
+ <p>Parse <var title="">url</var> in the manner defined by RFC 3986, with
+ the following exceptions:</p>
- <li>Add the U+0023 NUMBER SIGN character to the characters allowed in the
- <fragment> production.</li>
- <!-- some browsers also have other differences, e.g. Mozilla
- seems to treat ";" as if it was not in sub-delims, if the scheem
- is "ftp". -->
- </ul>
+ <ul>
+ <li>Add all characters with codepoints less than or equal to U+0020 or
+ greater than or equal to U+007F to the <unreserved> production.
- <p>If <var title="">url</var> doesn't match the <URI-reference>
- production, even after the above changes are made to the ABNF definitions,
- then parsing the URL fails with an error. <a
- href="#refsRFC3986">[RFC3986]</a>
+ <li>Add the characters U+0022, U+003C, U+003E, U+005B .. U+005E, U+0060,
+ and U+007B .. U+007D to the <unreserved> production. <!--
+ 0022 QUOTATION MARK
+ 003C LESS-THAN SIGN
+ 003E GREATER-THAN SIGN
+ 005B LEFT SQUARE BRACKET
+ 005C REVERSE SOLIDUS
+ 005D RIGHT SQUARE BRACKET
+ 005E CIRCUMFLEX ACCENT
+ 0060 GRAVE ACCENT
+ 007B LEFT CURLY BRACKET
+ 007C VERTICAL LINE
+ 007D RIGHT CURLY BRACKET
+ -->
+
- <p>If parsing <var title="">url</var> was successful, then the components
- of the URL are substrings of <var title="">url</var> defined as follows:
+ <li>Add a single U+0025 PERCENT SIGN character as a second alternative
+ way of matching the <pct-encoded> production, except when the
+ <pct-encoded> is used in the <reg-name> production.
- <dl>
- <dt><dfn id=ltschemegt title=url-scheme><scheme></dfn>
+ <li>Add the U+0023 NUMBER SIGN character to the characters allowed in
+ the <fragment> production.</li>
+ <!-- some browsers also have other differences, e.g. Mozilla
+ seems to treat ";" as if it was not in sub-delims, if the scheem
+ is "ftp". -->
+ </ul>
- <dd>
- <p>The substring matched by the <scheme> production, if any.
+ <li>
+ <p>If <var title="">url</var> doesn't match the <URI-reference>
+ production, even after the above changes are made to the ABNF
+ definitions, then parsing the URL fails with an error. <a
+ href="#refsRFC3986">[RFC3986]</a></p>
- <dt><dfn id=lthostgt title=url-host><host></dfn>
+ <p>Otherwise, parsing <var title="">url</var> was successful; the
+ components of the URL are substrings of <var title="">url</var> defined
+ as follows:</p>
- <dd>
- <p>The substring matched by the <host> production, if any.
+ <dl>
+ <dt><dfn id=ltschemegt title=url-scheme><scheme></dfn>
- <dt><dfn id=ltportgt title=url-port><port></dfn>
+ <dd>
+ <p>The substring matched by the <scheme> production, if any.
- <dd>
- <p>The substring matched by the <port> production, if any.
+ <dt><dfn id=lthostgt title=url-host><host></dfn>
- <dt><dfn id=lthostportgt title=url-hostport><hostport></dfn>
+ <dd>
+ <p>The substring matched by the <host> production, if any.
- <dd>
- <p>If there is a <scheme> component and a <port> component
- and the port given by the <port> component is different than the
- default port defined for the protocol given by the <scheme>
- component, then <hostport> is the substring that starts with the
- substring matched by the <host> production and ends with the
- substring matched by the <port> production, and includes the colon
- in between the two. Otherwise, it is the same as the <host>
- component.</p>
+ <dt><dfn id=ltportgt title=url-port><port></dfn>
- <dt><dfn id=ltpathgt title=url-path><path></dfn>
+ <dd>
+ <p>The substring matched by the <port> production, if any.
- <dd>
- <p>The substring matched by one of the following productions, if one of
- them was matched:</p>
+ <dt><dfn id=lthostportgt title=url-hostport><hostport></dfn>
- <ul class=brief>
- <li><path-abempty>
+ <dd>
+ <p>If there is a <scheme> component and a <port> component
+ and the port given by the <port> component is different than the
+ default port defined for the protocol given by the <scheme>
+ component, then <hostport> is the substring that starts with the
+ substring matched by the <host> production and ends with the
+ substring matched by the <port> production, and includes the
+ colon in between the two. Otherwise, it is the same as the
+ <host> component.</p>
- <li><path-absolute>
+ <dt><dfn id=ltpathgt title=url-path><path></dfn>
- <li><path-noscheme>
+ <dd>
+ <p>The substring matched by one of the following productions, if one of
+ them was matched:</p>
- <li><path-rootless>
+ <ul class=brief>
+ <li><path-abempty>
- <li><path-empty>
- </ul>
+ <li><path-absolute>
- <dt><dfn id=ltquerygt title=url-query><query></dfn>
+ <li><path-noscheme>
- <dd>
- <p>The substring matched by the <query> production, if any.
+ <li><path-rootless>
- <dt><dfn id=ltfragmentgt title=url-fragment><fragment></dfn>
+ <li><path-empty>
+ </ul>
- <dd>
- <p>The substring matched by the <fragment> production, if any.
- </dl>
+ <dt><dfn id=ltquerygt title=url-query><query></dfn>
+ <dd>
+ <p>The substring matched by the <query> production, if any.
+
+ <dt><dfn id=ltfragmentgt title=url-fragment><fragment></dfn>
+
+ <dd>
+ <p>The substring matched by the <fragment> production, if any.
+ </dl>
+ </ol>
+
<h4 id=resolving><span class=secno>2.3.3 </span>Resolving URLs</h4>
<p>Relative URLs are resolved relative to a base URL. The <dfn
Modified: source
===================================================================
--- source 2008-06-27 21:57:53 UTC (rev 1822)
+++ source 2008-06-27 23:03:53 UTC (rev 1823)
@@ -967,118 +967,136 @@
<h4>Parsing URLs</h4>
<p>To <dfn>parse a URL</dfn> <var title="">url</var> into its
- component parts, the user agent must first strip leading and
- trailing <span title="space character">space characters</span> from
- <var title="">url</var>, and then must parse <var title="">url</var>
- in the manner defined by RFC 3986, with the following
- exceptions:</p>
+ component parts, the user agent must use the following steps:</p>
- <ul>
+ <ol>
- <li>Add all characters with codepoints less than or equal to
- U+0020 or greater than or equal to U+007F to the
- <unreserved> production.</li>
+ <li><p>Strip leading and trailing <span title="space
+ character">space characters</span> from <var
+ title="">url</var>.</p></li>
- <li>Add the characters U+0022, U+003C, U+003E, U+005B .. U+005E,
- U+0060, and U+007B .. U+007D to the <unreserved>
- production.
- <!--
- 0022 QUOTATION MARK
- 003C LESS-THAN SIGN
- 003E GREATER-THAN SIGN
- 005B LEFT SQUARE BRACKET
- 005C REVERSE SOLIDUS
- 005D RIGHT SQUARE BRACKET
- 005E CIRCUMFLEX ACCENT
- 0060 GRAVE ACCENT
- 007B LEFT CURLY BRACKET
- 007C VERTICAL LINE
- 007D RIGHT CURLY BRACKET
- -->
+ <li><p>Replace all U+005C REVERSE SOLIDUS (\) characters in <var
+ title="">url</var> with U+002F SOLIDUS (/) characters.</p></li>
+
+ <li>
+
+ <p>Parse <var title="">url</var> in the manner defined by RFC
+ 3986, with the following exceptions:</p>
+
+ <ul>
+
+ <li>Add all characters with codepoints less than or equal to
+ U+0020 or greater than or equal to U+007F to the
+ <unreserved> production.</li>
+
+ <li>Add the characters U+0022, U+003C, U+003E, U+005B .. U+005E,
+ U+0060, and U+007B .. U+007D to the <unreserved>
+ production.
+ <!--
+ 0022 QUOTATION MARK
+ 003C LESS-THAN SIGN
+ 003E GREATER-THAN SIGN
+ 005B LEFT SQUARE BRACKET
+ 005C REVERSE SOLIDUS
+ 005D RIGHT SQUARE BRACKET
+ 005E CIRCUMFLEX ACCENT
+ 0060 GRAVE ACCENT
+ 007B LEFT CURLY BRACKET
+ 007C VERTICAL LINE
+ 007D RIGHT CURLY BRACKET
+ -->
+ </li>
+
+ <li>Add a single U+0025 PERCENT SIGN character as a second
+ alternative way of matching the <pct-encoded> production,
+ except when the <pct-encoded> is used in the
+ <reg-name> production.</li>
+
+ <li>Add the U+0023 NUMBER SIGN character to the characters
+ allowed in the <fragment> production.</li>
+
+ <!-- some browsers also have other differences, e.g. Mozilla
+ seems to treat ";" as if it was not in sub-delims, if the scheem
+ is "ftp". -->
+
+ </ul>
+
</li>
- <li>Add a single U+0025 PERCENT SIGN character as a second
- alternative way of matching the <pct-encoded> production,
- except when the <pct-encoded> is used in the
- <reg-name> production.</li>
+ <li>
- <li>Add the U+0023 NUMBER SIGN character to the characters
- allowed in the <fragment> production.</li>
+ <p>If <var title="">url</var> doesn't match the
+ <URI-reference> production, even after the above changes are
+ made to the ABNF definitions, then parsing the URL fails with an
+ error. <a href="#refsRFC3986">[RFC3986]</a></p>
- <!-- some browsers also have other differences, e.g. Mozilla
- seems to treat ";" as if it was not in sub-delims, if the scheem
- is "ftp". -->
+ <p>Otherwise, parsing <var title="">url</var> was successful; the
+ components of the URL are substrings of <var title="">url</var>
+ defined as follows:</p>
- </ul>
+ <dl>
- <p>If <var title="">url</var> doesn't match the
- <URI-reference> production, even after the above changes are
- made to the ABNF definitions, then parsing the URL fails with an
- error. <a href="#refsRFC3986">[RFC3986]</a></p>
+ <dt><dfn title="url-scheme"><scheme></dfn></dt>
- <p>If parsing <var title="">url</var> was successful, then the
- components of the URL are substrings of <var title="">url</var>
- defined as follows:</p>
+ <dd><p>The substring matched by the <scheme> production, if any.</p></dd>
- <dl>
- <dt><dfn title="url-scheme"><scheme></dfn></dt>
+ <dt><dfn title="url-host"><host></dfn></dt>
- <dd><p>The substring matched by the <scheme> production, if any.</p></dd>
+ <dd><p>The substring matched by the <host> production, if any.</p></dd>
- <dt><dfn title="url-host"><host></dfn></dt>
+ <dt><dfn title="url-port"><port></dfn></dt>
- <dd><p>The substring matched by the <host> production, if any.</p></dd>
+ <dd><p>The substring matched by the <port> production, if any.</p></dd>
- <dt><dfn title="url-port"><port></dfn></dt>
+ <dt><dfn title="url-hostport"><hostport></dfn></dt>
- <dd><p>The substring matched by the <port> production, if any.</p></dd>
+ <dd><p>If there is a <scheme> component and a <port>
+ component and the port given by the <port> component is
+ different than the default port defined for the protocol given by
+ the <scheme> component, then <hostport> is the
+ substring that starts with the substring matched by the
+ <host> production and ends with the substring matched by the
+ <port> production, and includes the colon in between the
+ two. Otherwise, it is the same as the <host> component.</p>
- <dt><dfn title="url-hostport"><hostport></dfn></dt>
+ <dt><dfn title="url-path"><path></dfn></dt>
- <dd><p>If there is a <scheme> component and a <port>
- component and the port given by the <port> component is
- different than the default port defined for the protocol given by
- the <scheme> component, then <hostport> is the
- substring that starts with the substring matched by the
- <host> production and ends with the substring matched by the
- <port> production, and includes the colon in between the
- two. Otherwise, it is the same as the <host> component.</p>
+ <dd>
+ <p>The substring matched by one of the following productions, if
+ one of them was matched:</p>
- <dt><dfn title="url-path"><path></dfn></dt>
+ <ul class="brief">
+ <li><path-abempty></li>
+ <li><path-absolute></li>
+ <li><path-noscheme></li>
+ <li><path-rootless></li>
+ <li><path-empty></li>
+ </ul>
- <dd>
+ </dd>
- <p>The substring matched by one of the following productions, if
- one of them was matched:</p>
- <ul class="brief">
- <li><path-abempty></li>
- <li><path-absolute></li>
- <li><path-noscheme></li>
- <li><path-rootless></li>
- <li><path-empty></li>
- </ul>
+ <dt><dfn title="url-query"><query></dfn></dt>
- </dd>
+ <dd><p>The substring matched by the <query> production, if any.</p></dd>
- <dt><dfn title="url-query"><query></dfn></dt>
+ <dt><dfn title="url-fragment"><fragment></dfn></dt>
- <dd><p>The substring matched by the <query> production, if any.</p></dd>
+ <dd><p>The substring matched by the <fragment> production, if any.</p></dd>
+ </dl>
- <dt><dfn title="url-fragment"><fragment></dfn></dt>
+ </li>
- <dd><p>The substring matched by the <fragment> production, if any.</p></dd>
+ </ol>
- </dl>
-
<h4>Resolving URLs</h4>
<p>Relative URLs are resolved relative to a base URL. The <dfn>base
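The <hostport> definition carried over unchanged by this revision is the only component that is not a plain "substring matched by production X", so a small sketch may help. It assumes the scheme, host and port components have already been extracted by some RFC 3986 parser. The function name hostport, the DEFAULT_PORTS table and its contents are hypothetical and deliberately incomplete. Two behaviours are assumptions not settled by the text: an empty port is treated as absent, and a scheme with no known default port always keeps its explicit port.

```python
from typing import Optional

# Illustrative subset of default ports; not taken from the spec text.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def hostport(scheme: Optional[str], host: str, port: Optional[str]) -> str:
    """Return host:port when the port differs from the scheme's default.

    Otherwise return the host alone, mirroring the <hostport> definition.
    """
    if scheme and port:
        default = DEFAULT_PORTS.get(scheme.lower())
        # Compare numerically so that "080" and "80" count as the same port.
        if default is None or int(port) != default:
            return f"{host}:{port}"
    return host
```

For example, hostport("http", "example.com", "8080") keeps the port, while hostport("http", "example.com", "80") yields just the host.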