[html5] r1795 - [] (0) Define how to resolve a relative URL. Work in progress; not yet integrate [...]

whatwg at whatwg.org whatwg at whatwg.org
Tue Jun 24 14:21:47 PDT 2008


Author: ianh
Date: 2008-06-24 14:21:46 -0700 (Tue, 24 Jun 2008)
New Revision: 1795

Modified:
   index
   source
Log:
[] (0) Define how to resolve a relative URL. Work in progress; not yet integrated with the rest of the spec.
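
As a rough illustration of the path-escaping step that this revision adds to the "resolve a URL" algorithm, here is a minimal Python sketch. The function name escape_path is hypothetical and does not come from the spec. The sketch assumes the path substring has already been isolated by the parsing step. It also assumes that a "%" followed by two hex digits counts as matching the original RFC 3986 <path> production, while a lone "%" does not and is therefore escaped. That second assumption is one reading of the text, not something the revision states.

```python
import string

# Characters the original RFC 3986 grammar allows in a path:
# pchar (unreserved / sub-delims / ":" / "@") plus the "/" separator.
# "%" is handled separately because it is only valid as "%" HEXDIG HEXDIG.
UNRESERVED = string.ascii_letters + string.digits + "-._~"
SUB_DELIMS = "!$&'()*+,;="
PATH_SAFE = set(UNRESERVED + SUB_DELIMS + ":@/")
HEX = set(string.hexdigits)


def escape_path(path: str) -> str:
    """Percent-encode, via UTF-8, every character of an already
    extracted path that the original RFC 3986 grammar would reject."""
    out = []
    for i, ch in enumerate(path):
        if ch in PATH_SAFE:
            out.append(ch)
        elif (ch == "%" and i + 2 < len(path)
              and path[i + 1] in HEX and path[i + 2] in HEX):
            # Existing percent-encoded octet: left untouched.
            out.append(ch)
        else:
            # Encode the character as UTF-8, then percent-encode each octet.
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
    return "".join(out)


# The worked example from the revision: only "^" and U+263A are escaped.
print(escape_path("/a^b\u263ac%FFd/"))
```

The query step in the same algorithm has the same shape, except that the octets come from the document's character encoding rather than UTF-8.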

Modified: index
===================================================================
--- index	2008-06-24 01:11:35 UTC (rev 1794)
+++ index	2008-06-24 21:21:46 UTC (rev 1795)
@@ -216,16 +216,13 @@
        <li><a href="#terminology0"><span class=secno>2.3.1
         </span>Terminology</a>
 
-       <li><a href="#parsing0"><span class=secno>2.3.2 </span>Parsing
+       <li><a href="#resolving"><span class=secno>2.3.2 </span>Resolving
         URLs</a>
 
-       <li><a href="#resolving"><span class=secno>2.3.3 </span>Resolving
-        URLs</a>
-
-       <li><a href="#open-issues"><span class=secno>2.3.4 </span>Open
+       <li><a href="#open-issues"><span class=secno>2.3.3 </span>Open
         issues</a>
 
-       <li><a href="#interfaces"><span class=secno>2.3.5 </span>Interfaces
+       <li><a href="#interfaces"><span class=secno>2.3.4 </span>Interfaces
         for URI manipulation</a>
       </ul>
 
@@ -1176,7 +1173,7 @@
          <li><a href="#writing"><span class=secno>5.7.3.1. </span>Writing
           cache manifests</a>
 
-         <li><a href="#parsing1"><span class=secno>5.7.3.2. </span>Parsing
+         <li><a href="#parsing0"><span class=secno>5.7.3.2. </span>Parsing
           cache manifests</a>
         </ul>
 
@@ -1555,7 +1552,7 @@
        <li><a href="#connecting"><span class=secno>7.2.2 </span>Connecting to
         an event stream</a>
 
-       <li><a href="#parsing2"><span class=secno>7.2.3 </span>Parsing an
+       <li><a href="#parsing1"><span class=secno>7.2.3 </span>Parsing an
         event stream</a>
 
        <li><a href="#event-stream-interpretation"><span class=secno>7.2.4
@@ -1814,7 +1811,7 @@
      <li><a href="#serializing"><span class=secno>9.4 </span>Serializing HTML
       fragments</a>
 
-     <li><a href="#parsing3"><span class=secno>9.5 </span>Parsing HTML
+     <li><a href="#parsing2"><span class=secno>9.5 </span>Parsing HTML
       fragments</a>
 
      <li><a href="#named"><span class=secno>9.6 </span>Named character
@@ -2407,21 +2404,6 @@
      uses an XML serialization with namespaces. <a href="#refsXML">[XML]</a>
      <a href="#refsXMLNAMES">[XMLNAMES]</a></p>
 
-   <dt>XML Base
-
-   <dd> <!-- XXXURL remove entire entry, define it all in URLs section -->
-    <p id=xmlBase>User agents must follow the rules given by XML Base to
-     resolve relative URIs in HTML and XHTML fragments. That is the mechanism
-     used in this specification for resolving relative URIs in DOM trees. <a
-     href="#refsXMLBASE">[XMLBASE]</a></p>
-
-    <p class=note>It is possible for <code title=attr-xml-base><a
-     href="#xmlbase">xml:base</a></code> attributes to be present even in
-     HTML fragments, as such attributes can be added dynamically using
-     script. (Such scripts would not be conforming, however, as <code
-     title=attr-xml-base><a href="#xmlbase">xml:base</a></code> attributes as
-     not allowed in <a href="#html-">HTML documents</a>.)</p>
-
    <dt>DOM
 
    <dd>
@@ -2686,6 +2668,12 @@
   <h3 id=urls><span class=secno>2.3 </span>URLs</h3>
   <!-- XXXURL -->
 
+  <p>This specification defines the term <a href="#url">URL</a>, and defines
+   various algorithms for dealing with URLs, because for historical reasons
+   the rules defined by the URI and IRI specifications are not a complete
+   description of what HTML user agents need to implement to be compatible
+   with Web content.
+
   <p class=big-issue>The text in this section is not yet integrated with the
    rest of the specification.
 
@@ -2723,40 +2711,243 @@
      href="#refsRFC3987">[RFC3987]</a>
   </ul>
 
-  <p>A <a href="#url">URL</a> is a <dfn id=valid0>valid absolute URL</dfn> if
-   it is a <a href="#valid">valid URL</a> and has an absolute form. <a
-   href="#refsRFC3986">[RFC3986]</a> <a href="#refsRFC3987">[RFC3987]</a>
+  <h4 id=resolving><span class=secno>2.3.2 </span>Resolving URLs</h4>
 
-  <h4 id=parsing0><span class=secno>2.3.2 </span>Parsing URLs</h4>
+  <p>Relative URLs are resolved relative to a base URL. The <dfn
+   id=base->base URL</dfn> of a <a href="#url">URL</a> is the <a
+   href="#absolute">absolute URL</a> obtained as follows:
 
-  <p class=big-issue>...
+  <dl class=switch>
+   <dt>If the URL to be resolved was passed to an API
 
-  <h4 id=resolving><span class=secno>2.3.3 </span>Resolving URLs</h4>
+   <dd>
+    <p>The base URL is the <a href="#document0">document base URL</a> of the
+     script's <a href="#script4">script document context</a>.
 
-  <div class=big-issue>
-   <p>First parse it (we need to define that. For some schemes it's not per
-    spec -- e.g. apparently for ftp: we should split from hosts on ';'). Then
-    handle each bit as follows:</p>
+   <dt>If the URL to be resolved is from the value of a content attribute
 
-   <p>scheme: no further processing (treat %-escaped characters literally,
-    treat unicode characters as unicode characters).</p>
+   <dd>
+    <p>The base URL is the <i>base URI of the element</i> that the attribute
+     is on, as defined by the XML Base specification, with <i>the base URI of
+     the document entity</i> being defined as the <a
+     href="#document0">document base URL</a> of the <code>Document</code>
+     that owns the element.</p>
 
-   <p>host: expand %-encoded bytes to Unicode as UTF-8, treat unicode
-    characters as per IDN.</p>
+    <p>For the purposes of the XML Base specification, user agents must act
+     as if all <code>Document</code> objects represented XML documents.</p>
 
-   <p>path: don't expand %-encoded bytes. Re-encode unicode to UTF-8 and
-    percent-encode.</p>
+    <p class=note>It is possible for <code title=attr-xml-base><a
+     href="#xmlbase">xml:base</a></code> attributes to be present even in
+     HTML fragments, as such attributes can be added dynamically using
+     script. (Such scripts would not be conforming, however, as <code
+     title=attr-xml-base><a href="#xmlbase">xml:base</a></code> attributes
+     are not allowed in <a href="#html-">HTML documents</a>.)</p>
+  </dl>
 
-   <p>query: don't expand %-encoded bytes. Re-encode unicode to the page's
-    encoding. Do not percent-encode.</p>
-  </div>
+  <p>The <dfn id=document0>document base URL</dfn> of a <code>Document</code>
+   is the <a href="#absolute">absolute URL</a> obtained by running these
+   steps:
 
-  <p class=big-issue>define what it means to resolve a relative URL when the
-   base URL doesn't have a path hierarchy (e.g. data:, javascript:,
-   about:blank URLs)
+  <ol>
+   <li>
+    <p>If there is no <code><a href="#base">base</a></code> element that is
+     both a child of <a href="#the-head0">the <code>head</code> element</a>
+     and has an <code title=att-base-href>href</code> attribute, then the <a
+     href="#document0">document base URL</a> is <span>the document's
+     address</span><!-- XXXDOCURL -->.
 
-  <h4 id=open-issues><span class=secno>2.3.4 </span>Open issues</h4>
+   <li>
+    <p>Otherwise, let <var title="">url</var> be the value of the <code
+     title=att-base-href>href</code> attribute of the first such element.
 
+   <li>
+    <p><a href="#resolve" title="resolve a URL">Resolve</a> the <var
+     title="">url</var> URL, using <span>the document's
+     address</span><!-- XXXDOCURL --> as the <a href="#document0">document
+     base URL</a>.
+
+   <li>
+    <p>The <a href="#document0">document base URL</a> is the result of the
+     previous step if it was successful; otherwise it is <span>the document's
+     address</span><!-- XXXDOCURL -->.
+  </ol>
+
+  <p>To <dfn id=resolve>resolve a URL</dfn> to an <a
+   href="#absolute">absolute URL</a> the user agent must use the following
+   steps. Resolving a URL can result in an error, in which case the URL is
+   not resolvable.
+
+  <ol>
+   <li>
+    <p>Let <var title="">url</var> be the <a href="#url">URL</a> being
+     resolved.
+
+   <li>
+    <p>Let <var title="">document</var> be the <code>Document</code>
+     associated with <var title="">url</var>.
+
+   <li>
+    <p>Let <var title="">encoding</var> be the <a href="#character1"
+     title="document's character encoding">character encoding</a> of <var
+     title="">document</var>.
+
+   <li>
+    <p>Let <var title="">base</var> be the <a href="#base-">base URL</a> for
+     <var title="">url</var>. (This is an <a href="#absolute">absolute
+     URL</a>.)
+
+   <li>
+    <p>Strip leading and trailing <a href="#space" title="space
+     character">space characters</a> from <var title="">url</var>.
+
+   <li>
+    <p>Parse <var title="">url</var> in the manner defined by RFC 3986, with
+     the following exceptions:</p>
+
+    <ul>
+     <li>Add all characters with codepoints less than or equal to U+0020 or
+      greater than or equal to U+007F to the <unreserved> production.
+
+     <li>Add the characters U+0022, U+003C, U+003E, U+005B .. U+005E, U+0060,
+      and U+007B .. U+007D to the <unreserved> production. <!--
+       0022 QUOTATION MARK
+       003C LESS-THAN SIGN
+       003E GREATER-THAN SIGN
+       005B LEFT SQUARE BRACKET
+       005C REVERSE SOLIDUS
+       005D RIGHT SQUARE BRACKET
+       005E CIRCUMFLEX ACCENT
+       0060 GRAVE ACCENT
+       007B LEFT CURLY BRACKET
+       007C VERTICAL LINE
+       007D RIGHT CURLY BRACKET
+      -->
+      
+
+     <li>Add a single U+0025 PERCENT SIGN character as a second alternative
+      way of matching the <pct-encoded> production, except when the
+      <pct-encoded> is used in the <reg-name> production.
+
+     <li>Add the U+0023 NUMBER SIGN character to the characters allowed in
+      the <fragment> production.</li>
+     <!-- some browsers also have other differences, e.g. Mozilla
+     seems to treat ";" as if it was not in sub-delims, if the scheme
+     is "ftp". -->
+    </ul>
+
+    <p>If <var title="">url</var> doesn't match the <URI-reference>
+     production, even after the above changes are made to the ABNF
+     definitions, then return an error and abort these steps.</p>
+
+    <p>If parsing <var title="">url</var> was successful, then make a note of
+     which substrings of <var title="">url</var> matched each of the
+     following productions that was matched:</p>
+
+    <ul class=brief>
+     <li><host>
+
+     <li><path-abempty>
+
+     <li><path-absolute>
+
+     <li><path-noscheme>
+
+     <li><path-rootless>
+
+     <li><path-empty>
+
+     <li><query>
+    </ul>
+
+    <p>When subsequent steps refer to the <path> production, they are
+     referring to whichever of those productions whose names start with
+     "path-" was matched. (Only one at a time can be matched.)</p>
+
+   <li>
+    <p>If parsing <var title="">url</var> resulted in the <host>
+     production being matched, then replace the matching substring of <var
+     title="">url</var> with the string that results from expanding any
+     sequences of percent-encoded octets in that component that are valid
+     UTF-8 sequences into Unicode characters as defined by UTF-8.</p>
+
+    <p>If any percent-encoded octets in that component are not valid UTF-8
+     sequences, then return an error and abort these steps.</p>
+
+    <p>Apply the IDNA ToASCII algorithm to the matching substring, with both
+     the AllowUnassigned and UseSTD3ASCIIRules flags set. Replace the
+     matching substring with the result of the ToASCII algorithm.</p>
+
+    <p>If ToASCII fails to convert one of the components of the string, e.g.
+     because it is too long or because it contains invalid characters, then
+     return an error and abort these steps. <a
+     href="#refsRFC3490">[RFC3490]</a></p>
+
+   <li>
+    <p>If parsing <var title="">url</var> resulted in the <path>
+     production being matched, then replace the matching substring of <var
+     title="">url</var> with the string that results from applying the
+     following steps to each character that doesn't match the original
+     <path> production defined in RFC 3986:</p>
+
+    <ol>
+     <li>Encode the character into a sequence of octets as defined by UTF-8.
+
+     <li>Replace the character with the percent-encoded form of those octets.
+    </ol>
+
+    <div class=example>
+     <p>For instance if <var title="">url</var> was "<code
+      title="">//example.com/a^b&#x263a;c%FFd/?e</code>", then
+      the substring matching the <path> production would be "<code
+      title="">/a^b&#x263a;c%FFd/</code>" and the two characters that would
+      have to be escaped would be "<code title="">^</code>" and "<code
+      title="">&#x263a;</code>". The result after this step was applied would
+      therefore be that <var title="">url</var> now had the value "<code
+      title="">//example.com/a%5Eb%E2%98%BAc%FFd/?e</code>".</p>
+    </div>
+
+   <li>
+    <p>If parsing <var title="">url</var> resulted in the <query>
+     production being matched, then replace the matching substring of <var
+     title="">url</var> with the string that results from applying the
+     following steps to each character that doesn't match the original
+     <query> production defined in RFC 3986:</p>
+
+    <ol>
+     <li>Encode the character into a sequence of octets as defined by the
+      encoding <var title="">encoding</var>.
+
+     <li>Replace the character with the percent-encoded form of those octets.
+    </ol>
+
+   <li>
+    <p>Apply the algorithm described in RFC 3986 section 5.2 Relative
+     Resolution, using <var title="">url</var> as the potentially relative
+     URI reference (<var title="">R</var>), and <var title="">base</var> as
+     the base URI (<var title="">Base</var>).
+
+   <li>
+    <p>Apply any relevant conformance criteria of RFC 3986 and RFC 3987,
+     returning an error and aborting these steps if appropriate.</p>
+
+    <p class=example>For instance, if an absolute URI that would be returned
+     by the above algorithm violates the restrictions specific to its scheme,
+     e.g. a <code title="">data:</code> URI using the "<code
+     title="">//</code>" naming authority syntax, then user agents are to
+     treat this as an error instead.<!-- RFC 3986, 3.1
+    Scheme --></p>
+
+   <li>
+    <p>Return the target URI (<var title="">T</var>) returned by the Relative
+     Resolution algorithm.
+  </ol>
+
+  <p>A <a href="#url">URL</a> is an <dfn id=absolute>absolute URL</dfn> if <a
+   href="#resolve" title="resolve a URL">resolving</a> it results in the same
+   URL without an error.
+
+  <h4 id=open-issues><span class=secno>2.3.3 </span>Open issues</h4>
+
   <div class=big-issue>
    <p>This section will do the following:</p>
 
@@ -2766,16 +2957,6 @@
     <li>get rid of references to <a href="#refsRFC3986">[RFC3986]</a> <a
      href="#refsRFC3987">[RFC3987]</a> outside this section
 
-    <li>define how to resolve relative URLs in markup attributes (using
-     XMLBase as defined elsewhere right now)
-
-    <li>define how to resolve relative URLs in APIs, using the <dfn
-     id=scripts>script's base URI</dfn> maybe
-
-    <li>define "an <dfn id=elements3>element's base URI</dfn>" and make the
-     various places that talk about a base URI in the context of an element
-     use that definition
-
     <li>make the language used to refer to resolving a base URI consistent
      throughout, maybe make it hyperlink to a definition each time
 
@@ -2826,7 +3007,7 @@
   </div>
   <!-- XXXURL change to URL -->
 
-  <h4 id=interfaces><span class=secno>2.3.5 </span>Interfaces for URI
+  <h4 id=interfaces><span class=secno>2.3.4 </span>Interfaces for URI
    manipulation</h4>
   <!-- XXXURL change to URL -->
 
@@ -3119,7 +3300,7 @@
 
   <h5 id=unsigned><span class=secno>2.4.3.1. </span>Unsigned integers</h5>
 
-  <p>A string is a <dfn id=valid1>valid non-negative integer</dfn> if it
+  <p>A string is a <dfn id=valid0>valid non-negative integer</dfn> if it
    consists of one of more characters in the range U+0030 DIGIT ZERO (0) to
    U+0039 DIGIT NINE (9).
 
@@ -3180,7 +3361,7 @@
 
   <h5 id=signed><span class=secno>2.4.3.2. </span>Signed integers</h5>
 
-  <p>A string is a <dfn id=valid2>valid integer</dfn> if it consists of one
+  <p>A string is a <dfn id=valid1>valid integer</dfn> if it consists of one
    of more characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE
    (9), optionally prefixed with a U+002D HYPHEN-MINUS ("-") character.
 
@@ -3258,7 +3439,7 @@
 
   <h5 id=real-numbers><span class=secno>2.4.3.3. </span>Real numbers</h5>
 
-  <p>A string is a <dfn id=valid3>valid floating point number</dfn> if it
+  <p>A string is a <dfn id=valid2>valid floating point number</dfn> if it
    consists of one of more characters in the range U+0030 DIGIT ZERO (0) to
    U+0039 DIGIT NINE (9), optionally with a single U+002E FULL STOP (".")
    character somewhere (either before these numbers, in between two numbers,
@@ -3384,7 +3565,7 @@
    <code><a href="#progress">progress</a></code> and <code><a
    href="#meter">meter</a></code> elements.
 
-  <p>A <dfn id=valid4>valid denominator punctuation character</dfn> is one of
+  <p>A <dfn id=valid3>valid denominator punctuation character</dfn> is one of
    the characters from the table below. There is <dfn id=a-value
    title="values associated with denominator punctuation characters">a value
    associated with each denominator punctuation character</dfn>, as shown in
@@ -3462,7 +3643,7 @@
     href="#refsUNICODE">[UNICODE]</a>
 
    <li>If there are still further characters in the string, and the next
-    character in the string is a <a href="#valid4">valid denominator
+    character in the string is a <a href="#valid3">valid denominator
     punctuation character</a>, set <var title="">denominator</var> to that
     character.
 
@@ -3485,7 +3666,7 @@
     sub-algorithm in step 9.
 
    <li>If there are still further characters in the string, and the next
-    character in the string is a <a href="#valid4">valid denominator
+    character in the string is a <a href="#valid3">valid denominator
     punctuation character</a>, return nothing and abort these steps.
 
    <li>If the string contains any other characters in the range U+0030 DIGIT
@@ -3520,7 +3701,7 @@
    <li>Parse <var title="">string</var> according to the <a
     href="#rules1">rules for parsing floating point number values</a>, to
     obtain <var title="">number</var>. This step cannot fail (<var
-    title="">string</var> is guaranteed to be a <a href="#valid3">valid
+    title="">string</var> is guaranteed to be a <a href="#valid2">valid
     floating point number</a>).
 
    <li>Return <var title="">number</var>.
@@ -3529,15 +3710,15 @@
   <h5 id=percentages-and-dimensions><span class=secno>2.4.3.5.
    </span>Percentages and dimensions</h5>
 
-  <p class=big-issue><dfn id=valid5>valid positive non-zero integers</dfn>
+  <p class=big-issue><dfn id=valid4>valid positive non-zero integers</dfn>
    <dfn id=rules2>rules for parsing dimension values</dfn> (only used by
    height/width on img, embed, object — lengths in css pixels or
    percentages)
 
   <h5 id=lists><span class=secno>2.4.3.6. </span>Lists of integers</h5>
 
-  <p>A <dfn id=valid6>valid list of integers</dfn> is a number of <a
-   href="#valid2" title="valid integer">valid integers</a> separated by
+  <p>A <dfn id=valid5>valid list of integers</dfn> is a number of <a
+   href="#valid1" title="valid integer">valid integers</a> separated by
    U+002C COMMA characters, with no other characters (e.g. no <a
    href="#space" title="space character">space characters</a>). In addition,
    there might be restrictions on the number of integers that can be given,
@@ -3828,7 +4009,7 @@
 
   <h5 id=specific><span class=secno>2.4.4.1. </span>Specific moments in time</h5>
 
-  <p>A string is a <dfn id=valid7>valid datetime</dfn> if it has four digits
+  <p>A string is a <dfn id=valid6>valid datetime</dfn> if it has four digits
    (representing the year), a literal hyphen, two digits (representing the
    month), a literal hyphen, two digits (representing the day), optionally
    some spaces, either a literal T or a space, optionally some more spaces,
@@ -3861,7 +4042,7 @@
    U+002B PLUS SIGN, and the minus U+002D (same as the hyphen).
 
   <div class=example>
-   <p>The following are some examples of dates written as <a href="#valid7"
+   <p>The following are some examples of dates written as <a href="#valid6"
     title="valid datetime">valid datetimes</a>.</p>
 
    <dl>
@@ -3914,7 +4095,7 @@
    user agent must apply the following algorithm to the string. This will
    either return a time in UTC, with associated timezone information for
    round tripping or display purposes, or nothing, indicating the value is
-   not a <a href="#valid7">valid datetime</a>. If at any point the algorithm
+   not a <a href="#valid6">valid datetime</a>. If at any point the algorithm
    says that it "fails", this means that it returns nothing.
 
   <ol>
@@ -4526,7 +4707,7 @@
 
   <h4 id=time-offsets><span class=secno>2.4.5 </span>Time offsets</h4>
 
-  <p class=big-issue><dfn id=valid8>valid time offset</dfn>, <dfn
+  <p class=big-issue><dfn id=valid7>valid time offset</dfn>, <dfn
    id=rules4>rules for parsing time offsets</dfn>, <dfn id=time-offset>time
    offset serialization rules</dfn>; in the format "5d4h3m2s1ms" or "3m 9.2s"
    or "00:00:00.00" or similar.
@@ -4723,7 +4904,7 @@
 
   <h4 id=syntax-references><span class=secno>2.4.8 </span>References</h4>
 
-  <p>A <dfn id=valid9>valid hash-name reference</dfn> to an element of type
+  <p>A <dfn id=valid8>valid hash-name reference</dfn> to an element of type
    <var title="">type</var> is a string consisting of a U+0023 NUMBER SIGN
    (<code title="">#</code>) character followed by a string which exactly
    matches the value of the <code title="">name</code> attribute of an
@@ -4819,7 +5000,7 @@
    the attribute is absent, then the default value must be returned instead,
    or 0 if there is no default value. On setting, the given value must be
    converted to the shortest possible string representing the number as a <a
-   href="#valid2">valid integer</a> in base ten and then that string must be
+   href="#valid1">valid integer</a> in base ten and then that string must be
    used as the new content attribute value.
 
   <p>If a reflecting DOM attribute is an <em>unsigned</em> integer type
@@ -4830,7 +5011,7 @@
    hand, it fails, or if the attribute is absent, the default value must be
    returned instead, or 0 if there is no default value. On setting, the given
    value must be converted to the shortest possible string representing the
-   number as a <a href="#valid1">valid non-negative integer</a> in base ten
+   number as a <a href="#valid0">valid non-negative integer</a> in base ten
    and then that string must be used as the new content attribute value.
 
   <p>If a reflecting DOM attribute is an unsigned integer type
@@ -4845,7 +5026,7 @@
    value. On setting, if the value is zero, the user agent must fire an
    <code>INDEX_SIZE_ERR</code> exception. Otherwise, the given value must be
    converted to the shortest possible string representing the number as a <a
-   href="#valid1">valid non-negative integer</a> in base ten and then that
+   href="#valid0">valid non-negative integer</a> in base ten and then that
    string must be used as the new content attribute value.
 
   <p>If a reflecting DOM attribute is a floating point number type
@@ -4870,7 +5051,7 @@
    other hand, it fails, or if the attribute is absent, the default value
    must be returned instead, or 0.0 if there is no default value. On setting,
    the given value must be converted to the shortest possible string
-   representing the number as a <a href="#valid3">valid floating point
+   representing the number as a <a href="#valid2">valid floating point
    number</a> in base ten and then that string must be used as the new
    content attribute value.
 
@@ -6222,7 +6403,7 @@
   readonly attribute <a href="#htmlcollection0">HTMLCollection</a> <a href="#links0" title=dom-document-links>links</a>;
   readonly attribute <a href="#htmlcollection0">HTMLCollection</a> <a href="#forms0" title=dom-document-forms>forms</a>;
   readonly attribute <a href="#htmlcollection0">HTMLCollection</a> <a href="#anchors" title=dom-document-anchors>anchors</a>;
-  readonly attribute <a href="#htmlcollection0">HTMLCollection</a> <a href="#scripts0" title=dom-document-scripts>scripts</a>;
+  readonly attribute <a href="#htmlcollection0">HTMLCollection</a> <a href="#scripts" title=dom-document-scripts>scripts</a>;
   NodeList <a href="#getelementsbyname" title=dom-document-getElementsByName>getElementsByName</a>(in DOMString elementName);
   NodeList <a href="#getelementsbyclassname" title=dom-document-getElementsByClassName>getElementsByClassName</a>(in DOMString classNames);
 
@@ -6600,7 +6781,7 @@
   <!-- XXX note that such elements are
   non-conforming -->
 
-  <p>The <dfn id=scripts0
+  <p>The <dfn id=scripts
    title=dom-document-scripts><code>scripts</code></dfn> attribute must
    return an <code><a href="#htmlcollection0">HTMLCollection</a></code>
    rooted at the <code>Document</code> node, whose filter matches only
@@ -8545,27 +8726,6 @@
    (except the <code><a href="#html">html</a></code> element and its <code
    title=attr-html-manifest><a href="#manifest">manifest</a></code>
    attribute).</p>
-  <!-- XXXURL move to URLs section -->
-
-  <p>User agents must use the value of the <code
-   title=att-base-href>href</code> attribute of the first <code><a
-   href="#base">base</a></code> element that is both a child of <a
-   href="#the-head0">the <code>head</code> element</a> and has an <code
-   title=att-base-href>href</code> attribute, if there is such an element, as
-   the document entity's base URI for the purposes of section 5.1.1 of RFC
-   3986 ("Establishing a Base URI": "Base URI Embedded in Content"). This
-   base URI from RFC 3986 is referred to by the algorithm given in XML Base,
-   which <a href="#xmlBase">is a normative part of this specification</a>. <a
-   href="#refsRFC3986">[RFC3986]</a></p>
-  <!-- XXXURL move to URLs section -->
-
-  <p>If the base URI given by this attribute is a relative URI, it must be
-   resolved relative to the higher-level base URIs (i.e. the base URI from
-   the encapsulating entity or the URI used to retrieve the entity) to obtain
-   an absolute base URI. All <code title=attr-xml-base><a
-   href="#xmlbase">xml:base</a></code> attributes must be ignored when
-   resolving relative URIs in this <code title=attr-base-href><a
-   href="#href">href</a></code> attribute.</p>
   <!-- XXXURL leave this here, but make it clearer -->
 
   <p class=note>If there are multiple <code><a href="#base">base</a></code>
@@ -8573,7 +8733,7 @@
    the first are ignored.
 
   <p>The <dfn id=target title=attr-base-target><code>target</code></dfn>
-   attribute, if specified, must contain a <a href="#valid11">valid browsing
+   attribute, if specified, must contain a <a href="#valid10">valid browsing
    context name or keyword</a>. User agents use this name when <a
    href="#following0">following hyperlinks</a>.
 
@@ -9409,8 +9569,8 @@
 
      <li>
       <p>Resolve the <var title="">url</var> value to an absolute URI using
-       <a href="#elements3" title="element's base URI">the base URI</a> of
-       the <code><a href="#meta0">meta</a></code> element.
+       <span title="element's base URI">the base URI</span> of the <code><a
+       href="#meta0">meta</a></code> element.
 
      <li>
       <p>Perform one or more of the following steps:</p>
@@ -9446,9 +9606,9 @@
      attribute must have a value consisting either of:
 
     <ul>
-     <li> just a <a href="#valid1">valid non-negative integer</a>, or
+     <li> just a <a href="#valid0">valid non-negative integer</a>, or
 
-     <li> a <a href="#valid1">valid non-negative integer</a>, followed by a
+     <li> a <a href="#valid0">valid non-negative integer</a>, followed by a
       U+003B SEMICOLON (<code title="">;</code>), followed by one or more <a
       href="#space" title="space character">space characters</a>, followed by
       either a U+0055 LATIN CAPITAL LETTER U or a U+0075 LATIN SMALL LETTER
@@ -11306,7 +11466,7 @@
    attribute is omitted, the list is an ascending list (1, 2, 3, ...).
 
   <p>The <dfn id=start0 title=attr-ol-start><code>start</code></dfn>
-   attribute, if present, must be a <a href="#valid2">valid integer</a>
+   attribute, if present, must be a <a href="#valid1">valid integer</a>
    giving the ordinal value of the first list item.
 
   <p>If the <code title=attr-ol-start><a href="#start0">start</a></code>
@@ -11491,7 +11651,7 @@
    element.
 
   <p>The <dfn id=value title=attr-li-value><code>value</code></dfn>
-   attribute, if present, must be a <a href="#valid2">valid integer</a>
+   attribute, if present, must be a <a href="#valid1">valid integer</a>
    giving the ordinal value of the list item.
 
   <p>If the <code title=attr-li-value><a href="#value">value</a></code>
@@ -12948,7 +13108,7 @@
   <p><strong>Author requirements</strong>: The <code
    title=attr-progress-max><a href="#max">max</a></code> and <code
    title=attr-progress-value><a href="#value1">value</a></code> attributes,
-   when present, must have values that are <a href="#valid3" title="valid
+   when present, must have values that are <a href="#valid2" title="valid
    floating point number">valid floating point numbers</a>. The <code
    title=attr-progress-max><a href="#max">max</a></code> attribute, if
    present, must have a value greater than zero. The <code
@@ -13163,7 +13323,7 @@
    title=attr-meter-max><a href="#max1">max</a></code>, and <code
    title=attr-meter-optimum><a href="#optimum">optimum</a></code> attributes
    are all optional. When present, they must have values that are <a
-   href="#valid3" title="valid floating point number">valid floating point
+   href="#valid2" title="valid floating point number">valid floating point
    numbers</a>, and their values must satisfy the following inequalities:
 
   <ul class=brief>
@@ -14426,14 +14586,14 @@
 
   <p>If present, the <code title=attr-mod-datetime><a
    href="#datetime1">datetime</a></code> attribute must be a <a
-   href="#valid7">valid datetime</a> value.
+   href="#valid6">valid datetime</a> value.
 
   <p>User agents must parse the <code title=attr-mod-datetime><a
    href="#datetime1">datetime</a></code> attribute according to the <a
    href="#datetime-parser">parse a string as a datetime value</a> algorithm.
    If that doesn't return a time, then the modification has no associated
    timestamp (the value is non-conforming; it is not a <a
-   href="#valid7">valid datetime</a>). Otherwise, the modification is marked
+   href="#valid6">valid datetime</a>). Otherwise, the modification is marked
    as having been made at the given datetime. User agents should use the
    associated timezone information to determine which timezone to present the
    given datetime in.
@@ -15446,7 +15606,7 @@
    will remain at the initial <code>about:blank</code><!-- XXX xref --> page.
 
   <p>The <dfn id=name1 title=attr-iframe-name><code>name</code></dfn>
-   attribute, if present, must be a <a href="#valid10">valid browsing context
+   attribute, if present, must be a <a href="#valid9">valid browsing context
    name</a>. When the browsing context is created, if the attribute is
    present, the <a href="#browsing2">browsing context name</a> must be set to
    the value of this attribute; otherwise, the <a href="#browsing2">browsing
@@ -16005,7 +16165,7 @@
    href="#type6">type</a></code> attributes must be present.
 
   <p>The <dfn id=name3 title=attr-object-name><code>name</code></dfn>
-   attribute, if present, must be a <a href="#valid10">valid browsing context
+   attribute, if present, must be a <a href="#valid9">valid browsing context
    name</a>.
 
   <p>When the element is created, and subsequently whenever the <code
@@ -16950,7 +17110,7 @@
    allows the author to specify the pixel ratio of anamorphic <a
    href="#media10" title="media resource">media resources</a> that do not
    self-describe their pixel ratio. The attribute value, if specified, must
-   be a <a href="#valid3">valid floating point number</a> giving the ratio of
+   be a <a href="#valid2">valid floating point number</a> giving the ratio of
    the correct rendered width of each pixel to the actual width of each pixel
    in the image (i.e., the multiple by which the video's intrinsic width is
    to be multiplied to obtain the rendered width that gives the correct
@@ -19336,7 +19496,7 @@
    to control the size of the coordinate space: <dfn id=width0
    title=attr-canvas-width><code>width</code></dfn> and <dfn id=height0
    title=attr-canvas-height><code>height</code></dfn>. These attributes, when
-   specified, must have values that are <a href="#valid1" title="valid
+   specified, must have values that are <a href="#valid0" title="valid
    non-negative integer">valid non-negative integers</a>. The <a
    href="#rules">rules for parsing non-negative integers</a> must be used to
    obtain their numeric values. If an attribute is missing, or if parsing its
@@ -22100,7 +22260,7 @@
    href="#rectangle" title=attr-area-shape-rect>rectangle</a> state.
 
   <p>The <dfn id=coords title=attr-area-coords><code>coords</code></dfn>
-   attribute must, if specified, contain a <a href="#valid6">valid list of
+   attribute must, if specified, contain a <a href="#valid5">valid list of
    integers</a>. This attribute gives the coordinates for the shape described
    by the <code title=attr-area-shape><a href="#shape">shape</a></code>
    attribute. The processing for this attribute is described as part of the
@@ -22230,7 +22390,7 @@
    <code><a href="#img">img</a></code> or <code><a
    href="#object">object</a></code> element. The <code
    title=attr-area-usemap>usemap</code> attribute, if specified, must be a <a
-   href="#valid9">valid hash-name reference</a> to a <code><a
+   href="#valid8">valid hash-name reference</a> to a <code><a
    href="#map">map</a></code> element.
 
   <p>If an <code><a href="#img">img</a></code> element or an <code><a
@@ -22552,7 +22712,7 @@
    give the dimensions of the visual content of the element (the width and
    height respectively, relative to the nominal direction of the output
    medium), in CSS pixels. The attributes, if specified, must have values
-   that are <a href="#valid5">valid positive non-zero integers</a>.
+   that are <a href="#valid4">valid positive non-zero integers</a>.
 
   <p>The specified dimensions given may differ from the dimensions specified
    in the resource itself, since the resource may have a resolution that
@@ -22965,7 +23125,7 @@
   <p>If the <code><a href="#colgroup">colgroup</a></code> element contains no
    <code><a href="#col">col</a></code> elements, then the element may have a
    <dfn id=span0 title=attr-colgroup-span><code>span</code></dfn> content
-   attribute specified, whose value must be a <a href="#valid1">valid
+   attribute specified, whose value must be a <a href="#valid0">valid
    non-negative integer</a> greater than zero.
 
   <p>The <code><a href="#colgroup">colgroup</a></code> element and its <code
@@ -23019,7 +23179,7 @@
 
   <p>The element may have a <dfn id=span2
    title=attr-col-span><code>span</code></dfn> content attribute specified,
-   whose value must be a <a href="#valid1">valid non-negative integer</a>
+   whose value must be a <a href="#valid0">valid non-negative integer</a>
    greater than zero.
 
   <p>The <code><a href="#col">col</a></code> element and its <code
@@ -23492,13 +23652,13 @@
   <p>The <code><a href="#td">td</a></code> and <code><a
    href="#th">th</a></code> elements may have a <dfn id=colspan
    title=attr-tdth-colspan><code>colspan</code></dfn> content attribute
-   specified, whose value must be a <a href="#valid1">valid non-negative
+   specified, whose value must be a <a href="#valid0">valid non-negative
    integer</a> greater than zero.
 
   <p>The <code><a href="#td">td</a></code> and <code><a
    href="#th">th</a></code> elements may also have a <dfn id=rowspan
    title=attr-tdth-rowspan><code>rowspan</code></dfn> content attribute
-   specified, whose value must be a <a href="#valid1">valid non-negative
+   specified, whose value must be a <a href="#valid0">valid non-negative
    integer</a>.
 
   <p>The <code><a href="#td">td</a></code> and <code><a
@@ -26918,9 +27078,9 @@
     href="#rowspecification">RowSpecification</a></code> object representing
     the row in question. The return value is a string representing a valid
     URI (or IRI) to an image. Relative URIs must be interpreted relative to
-    the <code><a href="#datagrid0">datagrid</a></code>'s <a href="#elements3"
-    title="element's base URI">base URI</a>. If the method returns the empty
-    string, null, or if the method is not defined, then the row has no
+    the <code><a href="#datagrid0">datagrid</a></code>'s <span
+    title="element's base URI">base URI</span>. If the method returns the
+    empty string, null, or if the method is not defined, then the row has no
     associated image.
 
    <dt>To obtain a context menu appropriate for a particular row
@@ -28257,8 +28417,8 @@
    from the element's <code>src</code> attribute. <!--If it is
   an <code>object</code> element then the URI is taken from the
   <code>data</code> attribute. -->
-   Relative URIs must be resolved relative to the <a href="#elements3"
-   title="element's base URI">base URI</a> of the image element.
+   Relative URIs must be resolved relative to the <span title="element's base
+   URI">base URI</span> of the image element.
    <!-- If it is an <code>svg</code> element then
   the URI is formed by taking the URI of the document and appending a
   "#" (U+0023 NUMBER SIGN) and the ID of the element.-->
@@ -28467,10 +28627,10 @@
   <p>The <a href="#icon1" title=command-facet-Icon>Icon</a> for the command
    is the absolute URI resulting from resolving the value of the element's
    <code title=attr-command-icon><a href="#icon">icon</a></code> attribute as
-   a URI relative to the <a href="#elements3">element's base URI</a>. If the
-   element has no <code title=attr-command-icon><a
-   href="#icon">icon</a></code> attribute then the command has no <a
-   href="#icon1" title=command-facet-Icon>Icon</a>.
+   a URI relative to the <span>element's base URI</span>. If the element has
+   no <code title=attr-command-icon><a href="#icon">icon</a></code> attribute
+   then the command has no <a href="#icon1"
+   title=command-facet-Icon>Icon</a>.
 
   <p>The <a href="#hidden1" title=command-facet-HiddenState>Hidden State</a>
    of the command is true (hidden) if the element has a <code
@@ -29731,13 +29891,12 @@
    name</dfn>. By default, a browsing context has no name (its name is not
    set).
 
-  <p>A <dfn id=valid10>valid browsing context name</dfn> is any string with
-   at least one character that does not start with a U+005F LOW LINE
-   character. (Names starting with an underscore are reserved for special
-   keywords.)
+  <p>A <dfn id=valid9>valid browsing context name</dfn> is any string with at
+   least one character that does not start with a U+005F LOW LINE character.
+   (Names starting with an underscore are reserved for special keywords.)
 
-  <p>A <dfn id=valid11>valid browsing context name or keyword</dfn> is any
-   string that is either a <a href="#valid10">valid browsing context name</a>
+  <p>A <dfn id=valid10>valid browsing context name or keyword</dfn> is any
+   string that is either a <a href="#valid9">valid browsing context name</a>
    or that case-insensitively <!-- ASCII --> matches one of: <code
    title="">_blank</code>, <code title="">_self</code>, <code
    title="">_parent</code>, or <code title="">_top</code>.
@@ -30037,7 +30196,7 @@
 
   <p>The second argument, <var title="">target</var>, specifies the <a
    href="#browsing2" title="browsing context name">name</a> of the browsing
-   context that is to be navigated. It must be a <a href="#valid11">valid
+   context that is to be navigated. It must be a <a href="#valid10">valid
    browsing context name or keyword</a>. If fewer than two arguments are
    provided, then the <var title="">name</var> argument defaults to the value
    "<code>_blank</code>".
@@ -32314,7 +32473,7 @@
 
   <p>URIs in manifests must not have fragment identifiers.
 
-  <h5 id=parsing1><span class=secno>5.7.3.2. </span>Parsing cache manifests</h5>
+  <h5 id=parsing0><span class=secno>5.7.3.2. </span>Parsing cache manifests</h5>
 
   <p>When a user agent is to <dfn id=parse0>parse a manifest</dfn>, it means
    that the user agent must run the following steps:
@@ -33968,8 +34127,8 @@
   <p>Relative <var title="">url</var> arguments for <code
    title=dom-location-assign><a href="#assign">assign()</a></code> and <code
    title=dom-location-replace><a href="#replace">replace()</a></code> must be
-   resolved relative to the <a href="#scripts" title="script's base URI">base
-   URI of the script</a> that made the method call.</p>
+   resolved relative to the <span title="script's base URI">base URI of the
+   script</span> that made the method call.</p>
   <!-- XXX what about if
   the base URI is data: or javascript: or about: or something else
   without a way to resolve base URIs? -->
@@ -34443,9 +34602,9 @@
    inserted into the DOM, the user agent must run the <a href="#application2"
    title=concept-appcache-init-with-attribute>application cache selection
    algorithm</a> with the value of that attribute, resolved relative to the
-   <a href="#elements3">element's base URI</a>, as the manifest URI.
-   Otherwise, as soon as the root element is inserted into the DOM, the user
-   agent must run the <a href="#application3"
+   <span>element's base URI</span>, as the manifest URI. Otherwise, as soon
+   as the root element is inserted into the DOM, the user agent must run the
+   <a href="#application3"
    title=concept-appcache-init-no-attribute>application cache selection
    algorithm</a> with no manifest.</p>
   <!-- XXXURL change to URL -->
@@ -36043,7 +36202,7 @@
 
   <p>The <dfn id=target3
    title=attr-hyperlink-target><code>target</code></dfn> attribute, if
-   present, must be a <a href="#valid11">valid browsing context name or
+   present, must be a <a href="#valid10">valid browsing context name or
    keyword</a>. User agents use this name when <a
    href="#following0">following hyperlinks</a>.</p>
   <!-- XXXURL change to URL -->
@@ -36165,11 +36324,11 @@
    title=attr-hyperlink-ping><a href="#ping">ping</a></code> attribute's
    value, <span title="split the string on spaces">split that string on
    spaces</span>, treat each resulting token as a URI (resolving relative
-   URIs according to <a href="#elements3">element's base URI</a><!--
-  XXXURL -->)
-   and then should send a request (as described below) to each of the
-   resulting URIs. This may be done in parallel with the primary request, and
-   is independent of the result of that request.</p>
+   URIs according to <span>element's base URI</span><!--
+  XXXURL -->) and
+   then should send a request (as described below) to each of the resulting
+   URIs. This may be done in parallel with the primary request, and is
+   independent of the result of that request.</p>
   <!-- XXXURL change to URL -->
 
   <p>User agents should allow the user to adjust this behavior, for example
@@ -36900,7 +37059,7 @@
    href="#unordered">unordered set of unique space-separated tokens</a>. The
    values must all be either <code title=attr-link-sizes-any><a
    href="#any">any</a></code> or a value that consists of two <a
-   href="#valid1" title="valid non-negative integer">valid non-negative
+   href="#valid0" title="valid non-negative integer">valid non-negative
    integers</a> that do not have a leading U+0030 DIGIT ZERO (0) character
    and that are separated by a single U+0078 LATIN SMALL LETTER X character.
 
@@ -37675,7 +37834,7 @@
 
   <p>The <code title=attr-tabindex><a href="#tabindex">tabindex</a></code>
    attribute, if specified, must have a value that is a <a
-   href="#valid2">valid integer</a>.
+   href="#valid1">valid integer</a>.
 
   <p>If the attribute is specified, it must be parsed using the <a
    href="#rules0">rules for parsing integers</a>. The attribute's values have
@@ -40659,7 +40818,7 @@
 
   <p>For non-HTTP protocols, UAs should act in equivalent ways.
 
-  <h4 id=parsing2><span class=secno>7.2.3 </span>Parsing an event stream</h4>
+  <h4 id=parsing1><span class=secno>7.2.3 </span>Parsing an event stream</h4>
 
   <p>This event stream format's MIME type is <code>text/event-stream</code>.
 
@@ -41963,7 +42122,7 @@
     and <a href="#space" title="space character">space characters</a>.
 
    <li>The root element, in the form of an <code><a
-    href="#html">html</a></code> <a href="#elements4"
+    href="#html">html</a></code> <a href="#elements3"
     title=syntax-elements>element</a>.
 
    <li>Any number of <a href="#comments0" title=syntax-comments>comments</a>
@@ -42060,7 +42219,7 @@
 
   <h4 id=elements1><span class=secno>9.1.2 </span>Elements</h4>
 
-  <p>There are five different kinds of <dfn id=elements4
+  <p>There are five different kinds of <dfn id=elements3
    title=syntax-elements>elements</dfn>: void elements, CDATA elements,
    RCDATA elements, foreign elements, and normal elements.
 
@@ -42145,7 +42304,7 @@
    is <em>not</em> marked as self-closing can have <a href="#text2"
    title=syntax-text>text</a>, <a href="#character3"
    title=syntax-charref>character references</a>, <a href="#cdata0"
-   title=syntax-cdata>CDATA blocks</a>, other <a href="#elements4"
+   title=syntax-cdata>CDATA blocks</a>, other <a href="#elements3"
    title=syntax-elements>elements</a>, and <a href="#comments0"
    title=syntax-comments>comments</a>, but the text must not contain the
    character U+003C LESS-THAN SIGN (<code>&lt;</code>) or an <a
@@ -42154,7 +42313,7 @@
 
   <p>Normal elements can have <a href="#text2" title=syntax-text>text</a>, <a
    href="#character3" title=syntax-charref>character references</a>, other <a
-   href="#elements4" title=syntax-elements>elements</a>, and <a
+   href="#elements3" title=syntax-elements>elements</a>, and <a
    href="#comments0" title=syntax-comments>comments</a>, but the text must
    not contain the character U+003C LESS-THAN SIGN (<code>&lt;</code>) or an
    <a href="#ambiguous" title=syntax-ambiguous-ampersand>ambiguous
@@ -49811,7 +49970,7 @@
    element's <span title=syntax-start-tag>start tag</span> would imply the
    end tag for the <code><a href="#p">p</a></code>).
 
-  <h3 id=parsing3><span class=secno>9.5 </span>Parsing HTML fragments</h3>
+  <h3 id=parsing2><span class=secno>9.5 </span>Parsing HTML fragments</h3>
 
   <p>The following steps form the <dfn id=html-fragment0>HTML fragment
    parsing algorithm</dfn>. The algorithm takes as input a DOM

Modified: source
===================================================================
--- source	2008-06-24 01:11:35 UTC (rev 1794)
+++ source	2008-06-24 21:21:46 UTC (rev 1795)
@@ -631,25 +631,6 @@
 
    </dd>
 
-   <dt>XML Base</dt>
-
-   <dd>
-
-    <!-- XXXURL remove entire entry, define it all in URLs section -->
-    <p id="xmlBase">User agents must follow the rules given by XML
-    Base to resolve relative URIs in HTML and XHTML fragments. That
-    is the mechanism used in this specification for resolving relative
-    URIs in DOM trees. <a href="#refsXMLBASE">[XMLBASE]</a></p>
-
-    <p class="note">It is possible for <code
-    title="attr-xml-base">xml:base</code> attributes to be present
-    even in HTML fragments, as such attributes can be added
-    dynamically using script. (Such scripts would not be conforming,
-    however, as <code title="attr-xml-base">xml:base</code> attributes
-    as not allowed in <span>HTML documents</span>.)</p>
-
-   </dd>
-
    <dt>DOM</dt>
 
    <dd>
@@ -937,6 +918,12 @@
 
   <h3>URLs</h3><!-- XXXURL -->
 
+  <p>This specification defines the term <span>URL</span>, and defines
+  various algorithms for dealing with URLs, because for historical
+  reasons the rules defined by the URI and IRI specifications are not
+  a complete description of what HTML user agents need to implement to
+  be compatible with Web content.</p>
+
   <p class="big-issue">The text in this section is not yet integrated
   with the rest of the specification.</p>
 
@@ -972,33 +959,273 @@
 
   </ul>
 
-  <p>A <span>URL</span> is a <dfn>valid absolute URL</dfn> if it is a
-  <span>valid URL</span> and has an absolute form. <a
-  href="#refsRFC3986">[RFC3986]</a> <a
-  href="#refsRFC3987">[RFC3987]</a></p>
 
+  <h4>Resolving URLs</h4>
 
-  <h4>Parsing URLs</h4>
+  <p>Relative URLs are resolved relative to a base URL. The <dfn>base
+  URL</dfn> of a <span>URL</span> is the <span>absolute URL</span>
+  obtained as follows:</p>
 
-  <p class="big-issue">...</p>
+  <dl class="switch">
 
-  <h4>Resolving URLs</h4>
+   <dt>If the URL to be resolved was passed to an API</dt>
 
-  <div class="big-issue">
-   <p>First parse it (we need to define that. For some schemes it's
-   not per spec -- e.g. apparently for ftp: we should split from
-   hosts on ';'). Then handle each bit as follows:</p>
-   <p>scheme: no further processing (treat %-escaped characters literally, treat unicode characters as unicode characters).</p>
-   <p>host: expand %-encoded bytes to Unicode as UTF-8, treat unicode characters as per IDN.</p>
-   <p>path: don't expand %-encoded bytes. Re-encode unicode to UTF-8 and percent-encode.</p>
-   <p>query: don't expand %-encoded bytes. Re-encode unicode to the page's encoding. Do not percent-encode.</p>
-  </div>
+   <dd><p>The base URL is the <span>document base URL</span> of the
+   script's <span>script document context</span>.</p></dd>
 
-  <p class="big-issue">define what it means to resolve a relative URL
-  when the base URL doesn't have a path hierarchy (e.g. data:,
-  javascript:, about:blank URLs)</p>
+   <dt>If the URL to be resolved is from the value of a content
+   attribute</dt>
 
+   <dd>
 
+    <p>The base URL is the <i>base URI of the element</i> that the
+    attribute is on, as defined by the XML Base specification, with
+    <i>the base URI of the document entity</i> being defined as the
+    <span>document base URL</span> of the <code>Document</code> that
+    owns the element.</p>
+
+    <p>For the purposes of the XML Base specification, user agents
+    must act as if all <code>Document</code> objects represented XML
+    documents.</p>
+
+    <p class="note">It is possible for <code
+    title="attr-xml-base">xml:base</code> attributes to be present
+    even in HTML fragments, as such attributes can be added
+    dynamically using script. (Such scripts would not be conforming,
+    however, as <code title="attr-xml-base">xml:base</code> attributes
+    are not allowed in <span>HTML documents</span>.)</p>
+
+   </dd>
+
+  </dl>
+
+  <p>The <dfn>document base URL</dfn> of a <code>Document</code> is
+  the <span>absolute URL</span> obtained by running these steps:</p>
+
+  <ol>
+
+   <li><p>If there is no <code>base</code> element that is both a
+   child of <span>the <code>head</code> element</span> and has an
+   <code title="att-base-href">href</code> attribute, then the
+   <span>document base URL</span> is <span>the document's
+   address</span><!-- XXXDOCURL -->.</p></li>
+
+   <li><p>Otherwise, let <var title="">url</var> be the value of the
+   <code title="att-base-href">href</code> attribute of the first such
+   element.</p></li>
+
+   <li><p><span title="resolve a URL">Resolve</span> the <var
+   title="">url</var> URL, using <span>the document's
+   address</span><!-- XXXDOCURL --> as the <span>document base
+   URL</span>.</p></li>
+
+   <li><p>The <span>document base URL</span> is the result of the
+   previous step if it was successful; otherwise it is <span>the
+   document's address</span><!-- XXXDOCURL -->.</p></li>
+
+  </ol>
+
+  <p>To <dfn>resolve a URL</dfn> to an <span>absolute URL</span> the
+  user agent must use the following steps. Resolving a URL can result
+  in an error, in which case the URL is not resolvable.</p>
+
+  <ol>
+
+   <li><p>Let <var title="">url</var> be the <span>URL</span> being
+   resolved.</p></li>
+
+   <li><p>Let <var title="">document</var> be the
+   <code>Document</code> associated with <var
+   title="">url</var>.</p></li>
+
+   <li><p>Let <var title="">encoding</var> be the <span
+   title="document's character encoding">character encoding</span> of
+   <var title="">document</var>.</p></li>
+
+   <li><p>Let <var title="">base</var> be the <span>base URL</span>
+   for <var title="">url</var>. (This is an <span>absolute
+   URL</span>.)</p></li>
+
+   <li><p>Strip leading and trailing <span title="space
+   character">space characters</span> from <var
+   title="">url</var>.</p></li>
+
+   <li>
+
+    <p>Parse <var title="">url</var> in the manner defined by RFC
+    3986, with the following exceptions:</p>
+
+    <ul>
+
+     <li>Add all characters with codepoints less than or equal to
+     U+0020 or greater than or equal to U+007F to the
+     &lt;unreserved&gt; production.</li>
+
+     <li>Add the characters U+0022, U+003C, U+003E, U+005B .. U+005E,
+     U+0060, and U+007B .. U+007D to the &lt;unreserved&gt;
+     production.
+      <!--
+       0022 QUOTATION MARK
+       003C LESS-THAN SIGN
+       003E GREATER-THAN SIGN
+       005B LEFT SQUARE BRACKET
+       005C REVERSE SOLIDUS
+       005D RIGHT SQUARE BRACKET
+       005E CIRCUMFLEX ACCENT
+       0060 GRAVE ACCENT
+       007B LEFT CURLY BRACKET
+       007C VERTICAL LINE
+       007D RIGHT CURLY BRACKET
+      -->
+     </li>
+
+     <li>Add a single U+0025 PERCENT SIGN character as a second
+     alternative way of matching the &lt;pct-encoded&gt; production,
+     except when the &lt;pct-encoded&gt; is used in the
+     &lt;reg-name&gt; production.</li>
+
+     <li>Add the U+0023 NUMBER SIGN character to the characters
+     allowed in the &lt;fragment&gt; production.</li>
+
+     <!-- some browsers also have other differences, e.g. Mozilla
+     seems to treat ";" as if it was not in sub-delims, if the scheme
+     is "ftp". -->
+
+    </ul>
+
+    <p>If <var title="">url</var> doesn't match the
+    &lt;URI-reference&gt; production, even after the above changes are
+    made to the ABNF definitions, then return an error and abort these
+    steps.</p>
+
+    <p>If parsing <var title="">url</var> was successful, then make a
+    note of which substrings of <var title="">url</var> matched each
+    of the following productions that was matched:</p>
+
+    <ul class="brief">
+     <li>&lt;host&gt;</li>
+     <li>&lt;path-abempty&gt;</li>
+     <li>&lt;path-absolute&gt;</li>
+     <li>&lt;path-noscheme&gt;</li>
+     <li>&lt;path-rootless&gt;</li>
+     <li>&lt;path-empty&gt;</li>
+     <li>&lt;query&gt;</li>
+    </ul>
+
+    <p>When subsequent steps refer to the &lt;path&gt; production,
+    they are referring to whichever of those productions whose names
+    start with "path-" was matched. (Only one at a time can be
+    matched.)</p>
+
+   </li>
+
+   <li>
+
+    <p>If parsing <var title="">url</var> resulted in the &lt;host&gt;
+    production being matched, then replace the matching substring of
+    <var title="">url</var> with the string that results from
+    expanding any sequences of percent-encoded octets in that
+    component that are valid UTF-8 sequences into Unicode characters
+    as defined by UTF-8.</p>
+
+    <p>If any percent-encoded octets in that component are not valid
+    UTF-8 sequences, then return an error and abort these steps.</p>
+
+    <p>Apply the IDNA ToASCII algorithm to the matching substring,
+    with both the AllowUnassigned and UseSTD3ASCIIRules flags
+    set. Replace the matching substring with the result of the ToASCII
+    algorithm.</p>
+
+    <p>If ToASCII fails to convert one of the components of the
+    string, e.g. because it is too long or because it contains invalid
+    characters, then return an error and abort these steps. <a
+    href="#refsRFC3490">[RFC3490]</a></p>
+
+   </li>
+
+   <li>
+
+    <p>If parsing <var title="">url</var> resulted in the &lt;path&gt;
+    production being matched, then replace the matching substring of
+    <var title="">url</var> with the string that results from applying
+    the following steps to each character that doesn't match the
+    original &lt;path&gt; production defined in RFC 3986:</p>
+
+    <ol>
+
+     <li>Encode the character into a sequence of octets as defined by
+     UTF-8.</li>
+
+     <li>Replace the character with the percent-encoded form of those
+     octets.</li>
+
+    </ol>
+
+    <div class="example">
+
+     <p>For instance if <var title="">url</var> was "<code
+    title="">//example.com/a^b&#x263a;c%FFd/?e</code>", then the
+    substring matching the &lt;path&gt; production would be
+    "<code title="">/a^b&#x263a;c%FFd/</code>" and the two characters
+    that would have to be escaped would be "<code title="">^</code>"
+    and "<code title="">&#x263a;</code>". The result after this step
+    was applied would therefore be that <var title="">url</var> now
+    had the value "<code
+    title="">//example.com/a%5Eb%E2%98%BAc%FFd/?e</code>".</p>
+
+    </div>
+
+   </li>
+
+   <li>
+
+    <p>If parsing <var title="">url</var> resulted in the &lt;query&gt;
+    production being matched, then replace the matching substring of
+    <var title="">url</var> with the string that results from applying
+    the following steps to each character that doesn't match the
+    original &lt;query&gt; production defined in RFC 3986:</p>
+
+    <ol>
+
+     <li>Encode the character into a sequence of octets as defined by
+     the encoding <var title="">encoding</var>.</li>
+
+     <li>Replace the character with the percent-encoded form of those
+     octets.</li>
+
+    </ol>
+
+   </li>
+
+   <li><p>Apply the algorithm described in RFC 3986 section 5.2
+   Relative Resolution, using <var title="">url</var> as the
+   potentially relative URI reference (<var title="">R</var>), and
+   <var title="">base</var> as the base URI (<var
+   title="">Base</var>).</p></li>
+
+   <li>
+
+    <p>Apply any relevant conformance criteria of RFC 3986 and RFC
+    3987, returning an error and aborting these steps if
+    appropriate.</p>
+
+    <p class="example">For instance, if an absolute URI that would be
+    returned by the above algorithm violates the restrictions specific
+    to its scheme, e.g. a <code title="">data:</code> URI using the
+    "<code title="">//</code>" naming authority syntax, then user
+    agents are to treat this as an error instead.<!-- RFC 3986, 3.1
+    Scheme --></p>
+
+   <li><p>Return the target URI (<var title="">T</var>) returned by
+   the Relative Resolution algorithm.</p></li>
+
+  </ol>
+
+  <p>A <span>URL</span> is an <dfn>absolute URL</dfn> if <span
+  title="resolve a URL">resolving</span> it results in the same
+  URL without an error.</p>
+
+
   <h4>Open issues</h4>
 
   <div class="big-issue">
@@ -1013,16 +1240,6 @@
     <li>get rid of references to <a href="#refsRFC3986">[RFC3986]</a>
     <a href="#refsRFC3987">[RFC3987]</a> outside this section</li>
 
-    <li>define how to resolve relative URLs in markup attributes
-    (using XMLBase as defined elsewhere right now)</li>
-
-    <li>define how to resolve relative URLs in APIs, using the
-    <dfn>script's base URI</dfn> maybe</li>
-
-    <li>define "an <dfn>element's base URI</dfn>" and make the various
-    places that talk about a base URI in the context of an element use
-    that definition</li>
-
     <li>make the language used to refer to resolving a base URI
     consistent throughout, maybe make it hyperlink to a definition
     each time</li>
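
The path and query escaping steps that this revision adds to "resolve a URL" can be sketched roughly as follows. This is only an illustration in Python, not text from the commit: the name `escape_component` and the character-set constants are hypothetical, the component is assumed to have already been split out of the URL, and the encoding is assumed to be ASCII-compatible. What happens when the document's encoding cannot represent a character is not stated in the revision; this sketch simply lets the encoder raise.

```python
import string

# Characters that the original RFC 3986 grammar allows literally.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
SUB_DELIMS = set("!$&'()*+,;=")
PATH_CHARS = UNRESERVED | SUB_DELIMS | set(":@/")   # pchar plus "/"
QUERY_CHARS = PATH_CHARS | set("?")                 # pchar plus "/" and "?"
HEX_DIGITS = set(string.hexdigits)


def escape_component(text, allowed, encoding="utf-8"):
    """Percent-encode each character the original production rejects.

    Hypothetical helper: pass PATH_CHARS with UTF-8 for the path step,
    or QUERY_CHARS with the document's encoding for the query step.
    """
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        triplet = text[i:i + 3]
        # A well-formed %HH triplet already matches <pct-encoded>; keep it.
        if (ch == "%" and len(triplet) == 3
                and triplet[1] in HEX_DIGITS and triplet[2] in HEX_DIGITS):
            out.append(triplet)
            i += 3
            continue
        if ch in allowed:
            out.append(ch)
        else:
            # Encode to octets, then percent-encode every octet.
            # Raises UnicodeEncodeError if the encoding lacks the character.
            out.extend("%{:02X}".format(b) for b in ch.encode(encoding))
        i += 1
    return "".join(out)
```

Applied to the path from the example in the diff, `escape_component("/a^b\u263ac%FFd/", PATH_CHARS)` gives `/a%5Eb%E2%98%BAc%FFd/`: the existing `%FF` triplet is left alone while `^` and U+263A are escaped. A lone `%` that is not followed by two hex digits is escaped as `%25`, since it only matches the relaxed grammar and not the original one.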
@@ -6780,28 +6997,6 @@
   the <code>html</code> element and its <code
   title="attr-html-manifest">manifest</code> attribute).</p>
 
-  <!-- XXXURL move to URLs section -->
-  <p>User agents must use the value of the <code
-  title="att-base-href">href</code> attribute of the first
-  <code>base</code> element that is both a child of <span>the
-  <code>head</code> element</span> and has an <code
-  title="att-base-href">href</code> attribute, if there is such an
-  element, as the document entity's base URI for the purposes of
-  section 5.1.1 of RFC 3986 ("Establishing a Base URI": "Base URI
-  Embedded in Content"). This base URI from RFC 3986 is referred to by
-  the algorithm given in XML Base, which <a href="#xmlBase">is a
-  normative part of this specification</a>. <a
-  href="#refsRFC3986">[RFC3986]</a></p>
-
-  <!-- XXXURL move to URLs section -->
-  <p>If the base URI given by this attribute is a relative URI, it
-  must be resolved relative to the higher-level base URIs (i.e. the
-  base URI from the encapsulating entity or the URI used to retrieve
-  the entity) to obtain an absolute base URI. All <code
-  title="attr-xml-base">xml:base</code> attributes must be ignored
-  when resolving relative URIs in this <code
-  title="attr-base-href">href</code> attribute.</p>
-
   <!-- XXXURL leave this here, but make it clearer -->
   <p class="note">If there are multiple <code>base</code> elements
   with <code title="att-base-href">href</code> attributes, all but the



