[html5] r3245 - [e] (0) Strip the URLs section out now that DanC is editing the Web Addresses draft.

whatwg at whatwg.org
Sat Jun 13 17:21:05 PDT 2009


Author: ianh
Date: 2009-06-13 17:21:04 -0700 (Sat, 13 Jun 2009)
New Revision: 3245

Modified:
   index
   source
Log:
[e] (0) Strip the URLs section out now that DanC is editing the Web Addresses draft.

Modified: index
===================================================================
--- index	2009-06-13 23:51:32 UTC (rev 3244)
+++ index	2009-06-14 00:21:04 UTC (rev 3245)
@@ -39,7 +39,7 @@
   <div class=head>
    <p><a class=logo href=http://www.whatwg.org/ rel=home><img alt=WHATWG src=/images/logo></a></p>
    <h1>HTML 5</h1>
-   <h2 class="no-num no-toc" id=draft-standard-—-date:-01-jan-1901>Draft Standard — 13 June 2009</h2>
+   <h2 class="no-num no-toc" id=draft-standard-—-date:-01-jan-1901>Draft Standard — 14 June 2009</h2>
    <p>You can take part in this work. <a href=http://www.whatwg.org/mailing-list>Join the working group's discussion list.</a></p>
    <p><strong>Web designers!</strong> We have a <a href=http://blog.whatwg.org/faq/>FAQ</a>, a <a href=http://forums.whatwg.org/>forum</a>, and a <a href=http://www.whatwg.org/mailing-list#help>help mailing list</a> for you!</p>
    <dl><dt>Multiple-page version:</dt>
@@ -261,10 +261,8 @@
    <li><a href=#urls><span class=secno>2.5 </span>URLs</a>
     <ol>
      <li><a href=#terminology-0><span class=secno>2.5.1 </span>Terminology</a></li>
-     <li><a href=#parsing-urls><span class=secno>2.5.2 </span>Parsing URLs</a></li>
-     <li><a href=#resolving-urls><span class=secno>2.5.3 </span>Resolving URLs</a></li>
-     <li><a href=#dynamic-changes-to-base-urls><span class=secno>2.5.4 </span>Dynamic changes to base URLs</a></li>
-     <li><a href=#interfaces-for-url-manipulation><span class=secno>2.5.5 </span>Interfaces for URL manipulation</a></ol></li>
+     <li><a href=#dynamic-changes-to-base-urls><span class=secno>2.5.2 </span>Dynamic changes to base URLs</a></li>
+     <li><a href=#interfaces-for-url-manipulation><span class=secno>2.5.3 </span>Interfaces for URL manipulation</a></ol></li>
    <li><a href=#fetching-resources><span class=secno>2.6 </span>Fetching resources</a>
     <ol>
      <li><a href=#concept-http-equivalent><span class=secno>2.6.1 </span>Protocol concepts</a></li>
@@ -4381,387 +4379,59 @@
 
   <h3 id=urls><span class=secno>2.5 </span>URLs</h3>
 
-  <p>This specification defines the term <a href=#url>URL</a>, and defines
-  various algorithms for dealing with URLs, because for historical
-  reasons the rules defined by the URI and IRI specifications are not
-  a complete description of what HTML user agents need to implement to
-  be compatible with Web content.</p>
-
-
   <h4 id=terminology-0><span class=secno>2.5.1 </span>Terminology</h4>
 
   <p>A <dfn id=url>URL</dfn> is a string used to identify a resource.</p>
 
-  <p>A <a href=#url>URL</a> is a <dfn id=valid-url>valid URL</dfn> if at least one of
-  the following conditions holds:</p>
+  <p>A <a href=#url>URL</a> is a <dfn id=valid-url>valid URL</dfn> if it is a
+  <span>valid Web address</span> as defined by the Web addresses
+  specification. <a href=#refsWEBADDRESSES>[WEBADDRESSES]</a></p>
 
-  <ul><li><p>The <a href=#url>URL</a> is a valid URI reference <a href=#refsRFC3986>[RFC3986]</a>.</li>
+  <p>A <a href=#url>URL</a> is an <dfn id=absolute-url>absolute URL</dfn> if it is an
+  <span>absolute Web address</span> as defined by the Web addresses
+  specification. <a href=#refsWEBADDRESSES>[WEBADDRESSES]</a></p>
 
-   <li><p>The <a href=#url>URL</a> is a valid IRI reference and it has no
-   query component. <a href=#refsRFC3987>[RFC3987]</a></li>
+  <div class=impl>
 
-   <li><p>The <a href=#url>URL</a> is a valid IRI reference and its query
-   component contains no unescaped non-ASCII characters. <a href=#refsRFC3987>[RFC3987]</a></li>
+  <p>To <dfn id=parse-a-url>parse a URL</dfn> <var title="">url</var> into its
+  component parts, the user agent must use the <span>parse a Web
+  address</span> algorithm defined by the Web addresses
+  specification. <a href=#refsWEBADDRESSES>[WEBADDRESSES]</a></p>
 
-   <li><p>The <a href=#url>URL</a> is a valid IRI reference and the <a href="#document's-character-encoding" title="document's character encoding">character encoding</a> of
-   the URL's <code>Document</code> is UTF-8 or UTF-16. <a href=#refsRFC3987>[RFC3987]</a></li>
+  <p>Parsing a URL results in the following components, again as
+  defined by the Web addresses specification:</p>
 
-  </ul><div class=impl>
+  <ul class=brief><li><dfn id=url-scheme title=url-scheme>&lt;scheme&gt;</dfn></li>
+   <li><dfn id=url-host title=url-host>&lt;host&gt;</dfn></li>
+   <li><dfn id=url-port title=url-port>&lt;port&gt;</dfn></li>
+   <li><dfn id=url-hostport title=url-hostport>&lt;hostport&gt;</dfn></li>
+   <li><dfn id=url-path title=url-path>&lt;path&gt;</dfn></li>
+   <li><dfn id=url-query title=url-query>&lt;query&gt;</dfn></li>
+   <li><dfn id=url-fragment title=url-fragment>&lt;fragment&gt;</dfn></li>
+   <li><dfn id=url-host-specific title=url-host-specific>&lt;host-specific&gt;</dfn></li>
+  </ul><p>To <dfn id=resolve-a-url>resolve a URL</dfn> to an <a href=#absolute-url>absolute URL</a>
+  relative to either another <a href=#absolute-url>absolute URL</a> or an element,
+  the user agent must use the <span>resolve a Web address</span>
+  algorithm defined by the Web addresses specification. <a href=#refsWEBADDRESSES>[WEBADDRESSES]</a></p>
 
-  <p>A <a href=#url>URL</a> has an associated <dfn id=url-character-encoding>URL character
-  encoding</dfn>, determined as follows:</p>
+  <p>The <dfn id=document-base-url>document base URL</dfn> of a <code>Document</code>
+  object is the <span>document base Web address</span> as defined by
+  the Web addresses specification. <a href=#refsWEBADDRESSES>[WEBADDRESSES]</a></p>
 
-  <dl class=switch><dt>If the URL came from a script (e.g. as an argument to a
-   method)</dt>
+  </div>
 
-   <dd>The URL character encoding is the <a href="#script's-url-character-encoding">script's URL character
-   encoding</a>.</dd>
-
-   <dt>If the URL came from a DOM node (e.g. from an element)</dt>
-
-   <dd>The node has a <code>Document</code>, and the URL character
-   encoding is the <a href="#document's-character-encoding">document's character encoding</a>.</dd>
-
-   <dt>If the URL had a character encoding defined when the URL was
-   created or defined</dt>
-
-   <dd>The URL character encoding is as defined.</dd>
-
-  </dl><p class=note>The term "URL" in this specification is used in a
+  <p class=note>The term "URL" in this specification is used in a
   manner distinct from the precise technical meaning it is given in
   RFC 3986. Readers familiar with that RFC will find it easier to read
   <em>this</em> specification if they pretend the term "URL" as used
   herein is really called something else altogether. This is a
   <a href=#willful-violation>willful violation</a> of RFC 3986. <a href=#refsRFC3986>[RFC3986]</a></p>
 
-  </div>
 
-
   <div class=impl>
 
-  <h4 id=parsing-urls><span class=secno>2.5.2 </span>Parsing URLs</h4>
+  <h4 id=dynamic-changes-to-base-urls><span class=secno>2.5.2 </span>Dynamic changes to base URLs</h4>
 
-  <p>To <dfn id=parse-a-url>parse a URL</dfn> <var title="">url</var> into its
-  component parts, the user agent must use the following steps:</p>
-
-  <ol><li><p>Strip leading and trailing <a href=#space-character title="space
-   character">space characters</a> from <var title="">url</var>.</li>
-
-   <li>
-
-    <p>Parse <var title="">url</var> in the manner defined by RFC
-    3986, with the following exceptions:</p>
-
-    <ul><li>Add all characters with code points less than or equal to
-     U+0020 or greater than or equal to U+007F to the
-     &lt;unreserved&gt; production.</li>
-
-     <li>Add the characters U+0022, U+003C, U+003E, U+005B .. U+005E,
-     U+0060, and U+007B .. U+007D to the &lt;unreserved&gt;
-     production.
-      <!--
-       0022 QUOTATION MARK
-       003C LESS-THAN SIGN
-       003E GREATER-THAN SIGN
-       005B LEFT SQUARE BRACKET
-       005C REVERSE SOLIDUS
-       005D RIGHT SQUARE BRACKET
-       005E CIRCUMFLEX ACCENT
-       0060 GRAVE ACCENT
-       007B LEFT CURLY BRACKET
-       007C VERTICAL LINE
-       007D RIGHT CURLY BRACKET
-      -->
-     </li>
-
-     <li>Add a single U+0025 PERCENT SIGN character as a second
-     alternative way of matching the &lt;pct-encoded&gt; production,
-     except when the &lt;pct-encoded&gt; is used in the
-     &lt;reg-name&gt; production.</li>
-
-     <li>Add the U+0023 NUMBER SIGN character to the characters
-     allowed in the &lt;fragment&gt; production.</li>
-
-     <!-- some browsers also have other differences, e.g. Mozilla
-     seems to treat ";" as if it was not in sub-delims, if the scheem
-     is "ftp". -->
-
-    </ul></li>
-
-   <li>
-
-    <p>If <var title="">url</var> doesn't match the
-    &lt;URI-reference&gt; production, even after the above changes are
-    made to the ABNF definitions, then parsing the URL fails with an
-    error. <a href=#refsRFC3986>[RFC3986]</a></p>
-
-    <p>Otherwise, parsing <var title="">url</var> was successful; the
-    components of the URL are substrings of <var title="">url</var>
-    defined as follows:</p>
-
-    <dl><dt><dfn id=url-scheme title=url-scheme>&lt;scheme&gt;</dfn></dt>
-
-     <dd><p>The substring matched by the &lt;scheme&gt; production, if any.</dd>
-
-
-     <dt><dfn id=url-host title=url-host>&lt;host&gt;</dfn></dt>
-
-     <dd><p>The substring matched by the &lt;host&gt; production, if any.</dd>
-
-
-     <dt><dfn id=url-port title=url-port>&lt;port&gt;</dfn></dt>
-
-     <dd><p>The substring matched by the &lt;port&gt; production, if any.</dd>
-
-
-     <dt><dfn id=url-hostport title=url-hostport>&lt;hostport&gt;</dfn></dt>
-
-     <dd><p>If there is a &lt;scheme&gt; component and a &lt;port&gt;
-     component and the port given by the &lt;port&gt; component is
-     different than the default port defined for the protocol given by
-     the &lt;scheme&gt; component, then &lt;hostport&gt; is the
-     substring that starts with the substring matched by the
-     &lt;host&gt; production and ends with the substring matched by the
-     &lt;port&gt; production, and includes the colon in between the
-     two. Otherwise, it is the same as the &lt;host&gt; component.</p>
-
-
-     <dt><dfn id=url-path title=url-path>&lt;path&gt;</dfn></dt>
-
-     <dd>
-
-      <p>The substring matched by one of the following productions, if
-      one of them was matched:</p>
-
-      <ul class=brief><li>&lt;path-abempty&gt;</li>
-       <li>&lt;path-absolute&gt;</li>
-       <li>&lt;path-noscheme&gt;</li>
-       <li>&lt;path-rootless&gt;</li>
-       <li>&lt;path-empty&gt;</li>
-      </ul></dd>
-
-
-     <dt><dfn id=url-query title=url-query>&lt;query&gt;</dfn></dt>
-
-     <dd><p>The substring matched by the &lt;query&gt; production, if any.</dd>
-
-
-     <dt><dfn id=url-fragment title=url-fragment>&lt;fragment&gt;</dfn></dt>
-
-     <dd><p>The substring matched by the &lt;fragment&gt; production, if any.</dd>
-
-
-     <dt><dfn id=url-host-specific title=url-host-specific>&lt;host-specific&gt;</dfn></dt>
-
-     <dd><p>The substring that <em>follows</em> the substring matched
-     by the &lt;authority&gt; production, or the whole string if the
-     &lt;authority&gt; production wasn't matched.</dd>
-
-    </dl></li>
-
-  </ol><p class=note>These parsing rules are a <a href=#willful-violation>willful
-  violation</a> of RFC 3986 and RFC 3987 (which do not define error
-  handling), motivated by a desire to handle legacy content. <a href=#refsRFC3986>[RFC3986]</a> <a href=#refsRFC3987>[RFC3987]</a></p>
-
-  </div>
-
-
-
-  <div class=impl>
-
-  <h4 id=resolving-urls><span class=secno>2.5.3 </span>Resolving URLs</h4>
-
-  <p>To <dfn id=resolve-a-url>resolve a URL</dfn> to an <a href=#absolute-url>absolute URL</a>
-  relative to either another <a href=#absolute-url>absolute URL</a> or an element,
-  the user agent must use the following steps. Resolving a URL can
-  result in an error, in which case the URL is not resolvable.</p>
-
-  <ol><li><p>Let <var title="">url</var> be the <a href=#url>URL</a> being
-   resolved.</li>
-
-   <li><p>Let <var title="">encoding</var> be the <a href=#url-character-encoding>URL character
-   encoding</a>.</li>
-
-   <li><p>If <var title="">encoding</var> is a UTF-16 encoding, then
-   change the value of <var title="">encoding</var> to UTF-8.</li>
-
-   <li>
-
-    <p>If the algorithm was invoked with an <a href=#absolute-url>absolute URL</a>
-    to use as the base URL, let <var title="">base</var> be that
-    <a href=#absolute-url>absolute URL</a>.</p>
-
-    <p>Otherwise, let <var title="">base</var> be the <i>base URI of
-    the element</i>, as defined by the XML Base specification, with
-    <i>the base URI of the document entity</i> being defined as the
-    <a href=#document-base-url>document base URL</a> of the <code>Document</code> that
-    owns the element. <a href=#refsXMLBASE>[XMLBASE]</a></p>
-
-    <p>For the purposes of the XML Base specification, user agents
-    must act as if all <code>Document</code> objects represented XML
-    documents.</p>
-
-    <p class=note>It is possible for <code title=attr-xml-base><a href=#the-xml:base-attribute-(xml-only)>xml:base</a></code> attributes to be present
-    even in HTML fragments, as such attributes can be added
-    dynamically using script. (Such scripts would not be conforming,
-    however, as <code title=attr-xml-base><a href=#the-xml:base-attribute-(xml-only)>xml:base</a></code> attributes
-    are not allowed in <a href=#html-documents>HTML documents</a>.)</p>
-
-    <p>The <dfn id=document-base-url>document base URL</dfn> of a <code>Document</code> is
-    the <a href=#absolute-url>absolute URL</a> obtained by running these
-    substeps:</p>
-
-    <ol><li><p>Let <var title="">fallback base url</var> be <a href="#the-document's-address">the
-     document's address</a>.</li>
-
-     <li>
-
-      <!-- http://www.hixie.ch/tests/adhoc/html/navigation/javascript-url/ -->
-
-      <!-- XXX this should be tested in the case of a browsing context
-      that was navigated to about:blank after having been elsewhere,
-      as opposed to the about:blank used at the time of the browsing
-      context's creation. -->
-
-      <p>If <var title="">fallback base url</var> is
-      <code><a href=#about:blank>about:blank</a></code>, and the <code>Document</code>'s
-      <a href=#browsing-context>browsing context</a> has a <a href=#creator-browsing-context>creator browsing
-      context</a>, then let <var title="">fallback base url</var>
-      be the <a href=#document-base-url>document base URL</a> of the <a href=#creator-document>creator
-      <code>Document</code></a> instead.</p>
-
-     </li>
-
-     <li><p>If there is no <code><a href=#the-base-element>base</a></code> element that is both a
-     child of <a href=#the-head-element-0>the <code>head</code> element</a> and has an
-     <code title=attr-base-href><a href=#attr-base-href>href</a></code> attribute, then the
-     <a href=#document-base-url>document base URL</a> is <var title="">fallback base
-     url</var>.</li>
-
-     <li><p>Otherwise, let <var title="">url</var> be the value of the
-     <code title=attr-base-href><a href=#attr-base-href>href</a></code> attribute of the first
-     such element.</li>
-
-     <li><p><a href=#resolve-a-url title="resolve a URL">Resolve</a> <var title="">url</var> relative to <var title="">fallback base
-     url</var> (thus, the <code><a href=#the-base-element>base</a></code> <code title=attr-base-href><a href=#attr-base-href>href</a></code> attribute isn't affected by
-     <code title=attr-xml-base><a href=#the-xml:base-attribute-(xml-only)>xml:base</a></code> attributes).</li>
-
-     <li><p>The <a href=#document-base-url>document base URL</a> is the result of the
-     previous step if it was successful; otherwise it is <var title="">fallback base url</var>.</li>
-
-    </ol></li>
-
-   <li><p><a href=#parse-a-url title="parse a URL">Parse</a> <var title="">url</var> into its component parts.</li>
-
-   <li>
-
-    <p>If parsing <var title="">url</var> resulted in a <a href=#url-host title=url-host>&lt;host&gt;</a> component, then replace the
-    matching substring of <var title="">url</var> with the string that
-    results from expanding any sequences of percent-encoded octets in
-    that component that are valid UTF-8 sequences into Unicode
-    characters as defined by UTF-8.</p>
-
-    <p>If any percent-encoded octets in that component are not valid
-    UTF-8 sequences, then return an error and abort these steps.</p>
-
-    <p>Apply the IDNA ToASCII algorithm to the matching substring,
-    with both the AllowUnassigned and UseSTD3ASCIIRules flags
-    set. Replace the matching substring with the result of the ToASCII
-    algorithm.</p>
-
-    <p>If ToASCII fails to convert one of the components of the
-    string, e.g. because it is too long or because it contains invalid
-    characters, then return an error and abort these steps. <a href=#refsRFC3490>[RFC3490]</a></p>
-
-   </li>
-
-   <li>
-
-    <p>If parsing <var title="">url</var> resulted in a <a href=#url-path title=url-path>&lt;path&gt;</a> component, then replace the
-    matching substring of <var title="">url</var> with the string that
-    results from applying the following steps to each character other
-    than U+0025 PERCENT SIGN (%) that doesn't match the original
-    &lt;path&gt; production defined in RFC 3986:</p>
-
-    <ol><li>Encode the character into a sequence of octets as defined by
-     UTF-8.</li>
-
-     <li>Replace the character with the percent-encoded form of those
-     octets. <a href=#refsRFC3986>[RFC3986]</a></li>
-
-    </ol><div class=example>
-
-     <p>For instance if <var title="">url</var> was "<code title="">//example.com/a^b☺c%FFd%z/?e</code>", then the
-     <a href=#url-path title=url-path>&lt;path&gt;</a> component's substring
-     would be "<code title="">/a^b☺c%FFd%z/</code>" and the two
-     characters that would have to be escaped would be "<code title="">^</code>" and "<code title="">☺</code>". The
-     result after this step was applied would therefore be that <var title="">url</var> now had the value "<code title="">//example.com/a%5Eb%E2%98%BAc%FFd%z/?e</code>".</p>
-
-    </div>
-
-   </li>
-
-   <li>
-
-    <p>If parsing <var title="">url</var> resulted in a <a href=#url-query title=url-query>&lt;query&gt;</a> component, then replace the
-    matching substring of <var title="">url</var> with the string that
-    results from applying the following steps to each character other
-    than U+0025 PERCENT SIGN (%) that doesn't match the original
-    &lt;query&gt; production defined in RFC 3986:</p>
-
-    <ol><li>If the character in question cannot be expressed in the
-     encoding <var title="">encoding</var>, then replace it with a
-     single 0x3F octet (an ASCII question mark) and skip the remaining
-     substeps for this character.</li>
-
-     <li>Encode the character into a sequence of octets as defined by
-     the encoding <var title="">encoding</var>.</li>
-
-     <li>Replace the character with the percent-encoded form of those
-     octets. <a href=#refsRFC3986>[RFC3986]</a></li>
-
-    </ol></li>
-
-   <li><p>Apply the algorithm described in RFC 3986 section 5.2
-   Relative Resolution, using <var title="">url</var> as the
-   potentially relative URI reference (<var title="">R</var>), and
-   <var title="">base</var> as the base URI (<var title="">Base</var>). <a href=#refsRFC3986>[RFC3986]</a></li>
-
-   <li>
-
-    <p>Apply any relevant conformance criteria of RFC 3986 and RFC
-    3987, returning an error and aborting these steps if
-    appropriate. <a href=#refsRFC3986>[RFC3986]</a> <a href=#refsRFC3987>[RFC3987]</a></p>
-
-    <p class=example>For instance, if an absolute URI that would be
-    returned by the above algorithm violates the restrictions specific
-    to its scheme, e.g. a <code title="">data:</code> URI using the
-    "<code title="">//</code>" server-based naming authority syntax,
-    then user agents are to treat this as an error instead.<!-- RFC
-    3986, 3.1 Scheme --></p>
-
-   </li>
-
-   <li><p>Let <var title="">result</var> be the target URI (<var title="">T</var>) returned by the Relative Resolution
-   algorithm.</li>
-
-   <li><p>If <var title="">result</var> uses a scheme with a
-   server-based naming authority, replace all U+005C REVERSE SOLIDUS
-   (\) characters in <var title="">result</var> with U+002F SOLIDUS
-   (/) characters.</li>
-
-   <li><p>Return <var title="">result</var>.</li>
-
-  </ol><p>A <a href=#url>URL</a> is an <dfn id=absolute-url>absolute URL</dfn> if <a href=#resolve-a-url title="resolve a URL">resolving</a> it results in the same
-  URL without an error.</p>
-
-  </div>
-
-
-  <div class=impl>
-
-  <h4 id=dynamic-changes-to-base-urls><span class=secno>2.5.4 </span>Dynamic changes to base URLs</h4>
-
   <p>When an <code title=attr-xml-base><a href=#the-xml:base-attribute-(xml-only)>xml:base</a></code> attribute
   changes, the attribute's element, and all descendant elements, are
   <a href=#affected-by-a-base-url-change>affected by a base URL change</a>.</p>
@@ -4832,7 +4502,7 @@
 
 
 
-  <h4 id=interfaces-for-url-manipulation><span class=secno>2.5.5 </span>Interfaces for URL manipulation</h4>
+  <h4 id=interfaces-for-url-manipulation><span class=secno>2.5.3 </span>Interfaces for URL manipulation</h4>
 
   <p>An interface that has a complement of <dfn id=url-decomposition-attributes>URL decomposition
   attributes</dfn> will have seven attributes with the following

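The path-escaping step in the "Resolving URLs" text removed above can be
sketched roughly as follows. This is an illustration only, written in
Python, and it is not the algorithm of the Web addresses draft. The
function name escape_path and the constant names are hypothetical, and
the set of characters treated as matching the RFC 3986 path productions
(unreserved, sub-delims, ":", "@" and "/") is an assumption drawn from
that RFC's ABNF. As in the removed text, "%" is always left untouched,
even when it does not begin a valid percent-encoded octet.

```python
import string

# Assumption: characters that match the RFC 3986 path productions,
# i.e. unreserved / sub-delims / ":" / "@", plus the "/" separator.
_UNRESERVED = string.ascii_letters + string.digits + "-._~"
_SUB_DELIMS = "!$&'()*+,;="
_PATH_SAFE = frozenset(_UNRESERVED + _SUB_DELIMS + ":@/")


def escape_path(path: str) -> str:
    """Percent-encode, as UTF-8, every character of `path` that is not
    "%" and does not match the RFC 3986 path productions."""
    out = []
    for ch in path:
        if ch == "%" or ch in _PATH_SAFE:
            # Already acceptable, or a percent sign that the removed
            # step explicitly leaves alone.
            out.append(ch)
        else:
            # Encode the character as UTF-8 and percent-encode each octet.
            out.append("".join(f"%{octet:02X}" for octet in ch.encode("utf-8")))
    return "".join(out)


if __name__ == "__main__":
    # The path from the example in the removed text.
    print(escape_path("/a^b\u263ac%FFd%z/"))  # prints /a%5Eb%E2%98%BAc%FFd%z/
```

Only "^" and U+263A are rewritten in that input, which agrees with the
example given in the removed text. Query escaping differs in that it
uses the URL character encoding rather than always UTF-8, so this sketch
does not cover it.
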
Modified: source
===================================================================
--- source	2009-06-13 23:51:32 UTC (rev 3244)
+++ source	2009-06-14 00:21:04 UTC (rev 3245)
@@ -3933,64 +3933,52 @@
 
   <h3>URLs</h3>
 
-  <p>This specification defines the term <span>URL</span>, and defines
-  various algorithms for dealing with URLs, because for historical
-  reasons the rules defined by the URI and IRI specifications are not
-  a complete description of what HTML user agents need to implement to
-  be compatible with Web content.</p>
-
-
   <h4>Terminology</h4>
 
   <p>A <dfn>URL</dfn> is a string used to identify a resource.</p>
 
-  <p>A <span>URL</span> is a <dfn>valid URL</dfn> if at least one of
-  the following conditions holds:</p>
+  <p>A <span>URL</span> is a <dfn>valid URL</dfn> if it is a
+  <span>valid Web address</span> as defined by the Web addresses
+  specification. <a href="#refsWEBADDRESSES">[WEBADDRESSES]</a></p>
 
-  <ul>
+  <p>A <span>URL</span> is an <dfn>absolute URL</dfn> if it is an
+  <span>absolute Web address</span> as defined by the Web addresses
+  specification. <a href="#refsWEBADDRESSES">[WEBADDRESSES]</a></p>
 
-   <li><p>The <span>URL</span> is a valid URI reference <a
-   href="#refsRFC3986">[RFC3986]</a>.</p></li>
-
-   <li><p>The <span>URL</span> is a valid IRI reference and it has no
-   query component. <a href="#refsRFC3987">[RFC3987]</a></p></li>
-
-   <li><p>The <span>URL</span> is a valid IRI reference and its query
-   component contains no unescaped non-ASCII characters. <a
-   href="#refsRFC3987">[RFC3987]</a></p></li>
-
-   <li><p>The <span>URL</span> is a valid IRI reference and the <span
-   title="document's character encoding">character encoding</span> of
-   the URL's <code>Document</code> is UTF-8 or UTF-16. <a
-   href="#refsRFC3987">[RFC3987]</a></p></li>
-
-  </ul>
-
   <div class="impl">
 
-  <p>A <span>URL</span> has an associated <dfn>URL character
-  encoding</dfn>, determined as follows:</p>
+  <p>To <dfn>parse a URL</dfn> <var title="">url</var> into its
+  component parts, the user agent must use the <span>parse a Web
+  address</span> algorithm defined by the Web addresses
+  specification. <a href="#refsWEBADDRESSES">[WEBADDRESSES]</a></p>
 
-  <dl class="switch">
+  <p>Parsing a URL results in the following components, again as
+  defined by the Web addresses specification:</p>
 
-   <dt>If the URL came from a script (e.g. as an argument to a
-   method)</dt>
+  <ul class="brief">
+   <li><dfn title="url-scheme">&lt;scheme&gt;</dfn></li>
+   <li><dfn title="url-host">&lt;host&gt;</dfn></li>
+   <li><dfn title="url-port">&lt;port&gt;</dfn></li>
+   <li><dfn title="url-hostport">&lt;hostport&gt;</dfn></li>
+   <li><dfn title="url-path">&lt;path&gt;</dfn></li>
+   <li><dfn title="url-query">&lt;query&gt;</dfn></li>
+   <li><dfn title="url-fragment">&lt;fragment&gt;</dfn></li>
+   <li><dfn title="url-host-specific">&lt;host-specific&gt;</dfn></li>
+  </ul> 
 
-   <dd>The URL character encoding is the <span>script's URL character
-   encoding</span>.</dd>
+  <p>To <dfn>resolve a URL</dfn> to an <span>absolute URL</span>
+  relative to either another <span>absolute URL</span> or an element,
+  the user agent must use the <span>resolve a Web address</span>
+  algorithm defined by the Web addresses specification. <a
+  href="#refsWEBADDRESSES">[WEBADDRESSES]</a></p>
 
-   <dt>If the URL came from a DOM node (e.g. from an element)</dt>
+  <p>The <dfn>document base URL</dfn> of a <code>Document</code>
+  object is the <span>document base Web address</span> as defined by
+  the Web addresses specification. <a
+  href="#refsWEBADDRESSES">[WEBADDRESSES]</a></p>
 
-   <dd>The node has a <code>Document</code>, and the URL character
-   encoding is the <span>document's character encoding</span>.</dd>
+  </div>
 
-   <dt>If the URL had a character encoding defined when the URL was
-   created or defined</dt>
-
-   <dd>The URL character encoding is as defined.</dd>
-
-  </dl>
-
   <p class="note">The term "URL" in this specification is used in a
   manner distinct from the precise technical meaning it is given in
   RFC 3986. Readers familiar with that RFC will find it easier to read
@@ -3999,383 +3987,9 @@
   <span>willful violation</span> of RFC 3986. <a
   href="#refsRFC3986">[RFC3986]</a></p>
 
-  </div>
 
-
   <div class="impl">
 
-  <h4>Parsing URLs</h4>
-
-  <p>To <dfn>parse a URL</dfn> <var title="">url</var> into its
-  component parts, the user agent must use the following steps:</p>
-
-  <ol>
-
-   <li><p>Strip leading and trailing <span title="space
-   character">space characters</span> from <var
-   title="">url</var>.</p></li>
-
-   <li>
-
-    <p>Parse <var title="">url</var> in the manner defined by RFC
-    3986, with the following exceptions:</p>
-
-    <ul>
-
-     <li>Add all characters with code points less than or equal to
-     U+0020 or greater than or equal to U+007F to the
-     &lt;unreserved&gt; production.</li>
-
-     <li>Add the characters U+0022, U+003C, U+003E, U+005B .. U+005E,
-     U+0060, and U+007B .. U+007D to the &lt;unreserved&gt;
-     production.
-      <!--
-       0022 QUOTATION MARK
-       003C LESS-THAN SIGN
-       003E GREATER-THAN SIGN
-       005B LEFT SQUARE BRACKET
-       005C REVERSE SOLIDUS
-       005D RIGHT SQUARE BRACKET
-       005E CIRCUMFLEX ACCENT
-       0060 GRAVE ACCENT
-       007B LEFT CURLY BRACKET
-       007C VERTICAL LINE
-       007D RIGHT CURLY BRACKET
-      -->
-     </li>
-
-     <li>Add a single U+0025 PERCENT SIGN character as a second
-     alternative way of matching the &lt;pct-encoded&gt; production,
-     except when the &lt;pct-encoded&gt; is used in the
-     &lt;reg-name&gt; production.</li>
-
-     <li>Add the U+0023 NUMBER SIGN character to the characters
-     allowed in the &lt;fragment&gt; production.</li>
-
-     <!-- some browsers also have other differences, e.g. Mozilla
-     seems to treat ";" as if it was not in sub-delims, if the scheem
-     is "ftp". -->
-
-    </ul>
-
-   </li>
-
-   <li>
-
-    <p>If <var title="">url</var> doesn't match the
-    &lt;URI-reference&gt; production, even after the above changes are
-    made to the ABNF definitions, then parsing the URL fails with an
-    error. <a href="#refsRFC3986">[RFC3986]</a></p>
-
-    <p>Otherwise, parsing <var title="">url</var> was successful; the
-    components of the URL are substrings of <var title="">url</var>
-    defined as follows:</p>
-
-    <dl>
-
-     <dt><dfn title="url-scheme">&lt;scheme&gt;</dfn></dt>
-
-     <dd><p>The substring matched by the &lt;scheme&gt; production, if any.</p></dd>
-
-
-     <dt><dfn title="url-host">&lt;host&gt;</dfn></dt>
-
-     <dd><p>The substring matched by the &lt;host&gt; production, if any.</p></dd>
-
-
-     <dt><dfn title="url-port">&lt;port&gt;</dfn></dt>
-
-     <dd><p>The substring matched by the &lt;port&gt; production, if any.</p></dd>
-
-
-     <dt><dfn title="url-hostport">&lt;hostport&gt;</dfn></dt>
-
-     <dd><p>If there is a &lt;scheme&gt; component and a &lt;port&gt;
-     component and the port given by the &lt;port&gt; component is
-     different than the default port defined for the protocol given by
-     the &lt;scheme&gt; component, then &lt;hostport&gt; is the
-     substring that starts with the substring matched by the
-     &lt;host&gt; production and ends with the substring matched by the
-     &lt;port&gt; production, and includes the colon in between the
-     two. Otherwise, it is the same as the &lt;host&gt; component.</p>
-
-
-     <dt><dfn title="url-path">&lt;path&gt;</dfn></dt>
-
-     <dd>
-
-      <p>The substring matched by one of the following productions, if
-      one of them was matched:</p>
-
-      <ul class="brief">
-       <li>&lt;path-abempty&gt;</li>
-       <li>&lt;path-absolute&gt;</li>
-       <li>&lt;path-noscheme&gt;</li>
-       <li>&lt;path-rootless&gt;</li>
-       <li>&lt;path-empty&gt;</li>
-      </ul>
-
-     </dd>
-
-
-     <dt><dfn title="url-query">&lt;query&gt;</dfn></dt>
-
-     <dd><p>The substring matched by the &lt;query&gt; production, if any.</p></dd>
-
-
-     <dt><dfn title="url-fragment">&lt;fragment&gt;</dfn></dt>
-
-     <dd><p>The substring matched by the &lt;fragment&gt; production, if any.</p></dd>
-
-
-     <dt><dfn title="url-host-specific">&lt;host-specific&gt;</dfn></dt>
-
-     <dd><p>The substring that <em>follows</em> the substring matched
-     by the &lt;authority&gt; production, or the whole string if the
-     &lt;authority&gt; production wasn't matched.</p></dd>
-
-    </dl>
-
-   </li>
-
-  </ol>
-
-  <p class="note">These parsing rules are a <span>willful
-  violation</span> of RFC 3986 and RFC 3987 (which do not define error
-  handling), motivated by a desire to handle legacy content. <a
-  href="#refsRFC3986">[RFC3986]</a> <a
-  href="#refsRFC3987">[RFC3987]</a></p>
-
-  </div>
-
-
-
-  <div class="impl">
-
-  <h4>Resolving URLs</h4>
-
-  <p>To <dfn>resolve a URL</dfn> to an <span>absolute URL</span>
-  relative to either another <span>absolute URL</span> or an element,
-  the user agent must use the following steps. Resolving a URL can
-  result in an error, in which case the URL is not resolvable.</p>
-
-  <ol>
-
-   <li><p>Let <var title="">url</var> be the <span>URL</span> being
-   resolved.</p></li>
-
-   <li><p>Let <var title="">encoding</var> be the <span>URL character
-   encoding</span>.</p></li>
-
-   <li><p>If <var title="">encoding</var> is a UTF-16 encoding, then
-   change the value of <var title="">encoding</var> to UTF-8.</p></li>
-
-   <li>
-
-    <p>If the algorithm was invoked with an <span>absolute URL</span>
-    to use as the base URL, let <var title="">base</var> be that
-    <span>absolute URL</span>.</p>
-
-    <p>Otherwise, let <var title="">base</var> be the <i>base URI of
-    the element</i>, as defined by the XML Base specification, with
-    <i>the base URI of the document entity</i> being defined as the
-    <span>document base URL</span> of the <code>Document</code> that
-    owns the element. <a href="#refsXMLBASE">[XMLBASE]</a></p>
-
-    <p>For the purposes of the XML Base specification, user agents
-    must act as if all <code>Document</code> objects represented XML
-    documents.</p>
-
-    <p class="note">It is possible for <code
-    title="attr-xml-base">xml:base</code> attributes to be present
-    even in HTML fragments, as such attributes can be added
-    dynamically using script. (Such scripts would not be conforming,
-    however, as <code title="attr-xml-base">xml:base</code> attributes
-    are not allowed in <span>HTML documents</span>.)</p>
-
-    <p>The <dfn>document base URL</dfn> of a <code>Document</code> is
-    the <span>absolute URL</span> obtained by running these
-    substeps:</p>
-
-    <ol>
-
-     <li><p>Let <var title="">fallback base url</var> be <span>the
-     document's address</span>.</p></li>
-
-     <li>
-
-      <!-- http://www.hixie.ch/tests/adhoc/html/navigation/javascript-url/ -->
-
-      <!-- XXX this should be tested in the case of a browsing context
-      that was navigated to about:blank after having been elsewhere,
-      as opposed to the about:blank used at the time of the browsing
-      context's creation. -->
-
-      <p>If <var title="">fallback base url</var> is
-      <code>about:blank</code>, and the <code>Document</code>'s
-      <span>browsing context</span> has a <span>creator browsing
-      context</span>, then let <var title="">fallback base url</var>
-      be the <span>document base URL</span> of the <span>creator
-      <code>Document</code></span> instead.</p>
-
-     </li>
-
-     <li><p>If there is no <code>base</code> element that is both a
-     child of <span>the <code>head</code> element</span> and has an
-     <code title="attr-base-href">href</code> attribute, then the
-     <span>document base URL</span> is <var title="">fallback base
-     url</var>.</p></li>
-
-     <li><p>Otherwise, let <var title="">url</var> be the value of the
-     <code title="attr-base-href">href</code> attribute of the first
-     such element.</p></li>
-
-     <li><p><span title="resolve a URL">Resolve</span> <var
-     title="">url</var> relative to <var title="">fallback base
-     url</var> (thus, the <code>base</code> <code
-     title="attr-base-href">href</code> attribute isn't affected by
-     <code title="attr-xml-base">xml:base</code> attributes).</p></li>
-
-     <li><p>The <span>document base URL</span> is the result of the
-     previous step if it was successful; otherwise it is <var
-     title="">fallback base url</var>.</p></li>
-
-    </ol>
-
-   </li>
-
-   <li><p><span title="parse a URL">Parse</span> <var
-   title="">url</var> into its component parts.</p></li>
-
-   <li>
-
-    <p>If parsing <var title="">url</var> resulted in a <span
-    title="url-host"><host></span> component, then replace the
-    matching substring of <var title="">url</var> with the string that
-    results from expanding any sequences of percent-encoded octets in
-    that component that are valid UTF-8 sequences into Unicode
-    characters as defined by UTF-8.</p>
-
-    <p>If any percent-encoded octets in that component are not valid
-    UTF-8 sequences, then return an error and abort these steps.</p>
-
-    <p>Apply the IDNA ToASCII algorithm to the matching substring,
-    with both the AllowUnassigned and UseSTD3ASCIIRules flags
-    set. Replace the matching substring with the result of the ToASCII
-    algorithm.</p>
-
-    <p>If ToASCII fails to convert one of the components of the
-    string, e.g. because it is too long or because it contains invalid
-    characters, then return an error and abort these steps. <a
-    href="#refsRFC3490">[RFC3490]</a></p>
-
-   </li>
-
-   <li>
-
-    <p>If parsing <var title="">url</var> resulted in a <span
-    title="url-path"><path></span> component, then replace the
-    matching substring of <var title="">url</var> with the string that
-    results from applying the following steps to each character other
-    than U+0025 PERCENT SIGN (%) that doesn't match the original
-    <path> production defined in RFC 3986:</p>
-
-    <ol>
-
-     <li>Encode the character into a sequence of octets as defined by
-     UTF-8.</li>
-
-     <li>Replace the character with the percent-encoded form of those
-     octets. <a href="#refsRFC3986">[RFC3986]</a></li>
-
-    </ol>
-
-    <div class="example">
-
-     <p>For instance if <var title="">url</var> was "<code
-     title="">//example.com/a^b&#x263a;c%FFd%z/?e</code>", then the
-     <span title="url-path"><path></span> component's substring
-     would be "<code title="">/a^b&#x263a;c%FFd%z/</code>" and the two
-     characters that would have to be escaped would be "<code
-     title="">^</code>" and "<code title="">&#x263a;</code>". The
-     result after this step was applied would therefore be that <var
-     title="">url</var> now had the value "<code
-     title="">//example.com/a%5Eb%E2%98%BAc%FFd%z/?e</code>".</p>
-
-    </div>
-
-   </li>
-
-   <li>
-
-    <p>If parsing <var title="">url</var> resulted in a <span
-    title="url-query"><query></span> component, then replace the
-    matching substring of <var title="">url</var> with the string that
-    results from applying the following steps to each character other
-    than U+0025 PERCENT SIGN (%) that doesn't match the original
-    <query> production defined in RFC 3986:</p>
-
-    <ol>
-
-     <li>If the character in question cannot be expressed in the
-     encoding <var title="">encoding</var>, then replace it with a
-     single 0x3F octet (an ASCII question mark) and skip the remaining
-     substeps for this character.</li>
-
-     <li>Encode the character into a sequence of octets as defined by
-     the encoding <var title="">encoding</var>.</li>
-
-     <li>Replace the character with the percent-encoded form of those
-     octets. <a href="#refsRFC3986">[RFC3986]</a></li>
-
-    </ol>
-
-   </li>
-
-   <li><p>Apply the algorithm described in RFC 3986 section 5.2
-   Relative Resolution, using <var title="">url</var> as the
-   potentially relative URI reference (<var title="">R</var>), and
-   <var title="">base</var> as the base URI (<var
-   title="">Base</var>). <a href="#refsRFC3986">[RFC3986]</a></p></li>
-
-   <li>
-
-    <p>Apply any relevant conformance criteria of RFC 3986 and RFC
-    3987, returning an error and aborting these steps if
-    appropriate. <a href="#refsRFC3986">[RFC3986]</a> <a
-    href="#refsRFC3987">[RFC3987]</a></p>
-
-    <p class="example">For instance, if an absolute URI that would be
-    returned by the above algorithm violates the restrictions specific
-    to its scheme, e.g. a <code title="">data:</code> URI using the
-    "<code title="">//</code>" server-based naming authority syntax,
-    then user agents are to treat this as an error instead.<!-- RFC
-    3986, 3.1 Scheme --></p>
-
-   </li>
-
-   <li><p>Let <var title="">result</var> be the target URI (<var
-   title="">T</var>) returned by the Relative Resolution
-   algorithm.</p></li>
-
-   <li><p>If <var title="">result</var> uses a scheme with a
-   server-based naming authority, replace all U+005C REVERSE SOLIDUS
-   (\) characters in <var title="">result</var> with U+002F SOLIDUS
-   (/) characters.</p></li>
-
-   <li><p>Return <var title="">result</var>.</p></li>
-
-  </ol>
-
-  <p>A <span>URL</span> is an <dfn>absolute URL</dfn> if <span
-  title="resolve a URL">resolving</span> it results in the same
-  URL without an error.</p>
-
-  </div>
-
-
-  <div class="impl">
-
   <h4>Dynamic changes to base URLs</h4>
 
   <p>When an <code title="attr-xml-base">xml:base</code> attribute



