[whatwg] Inconsistent behavior for empty-string URLs

Mon Dec 7 11:44:29 PST 2009

On Mon, Dec 7, 2009 at 1:51 PM, Nicholas Zakas <nzakas at yahoo-inc.com> wrote:
> Presently, HTML5 does provide guidance on the correct behavior for <img
> src=""> in section 4.8.2, indicating that Firefox 3.5’s and Opera 10’s
> behavior in this regard is correct:
>
> “If the base URI of the element is the same as the document’s address, then
> the src attribute’s value must not be the empty string.”

That says that if it's the empty string, the document is invalid.  It
doesn't say what the UA has to do.  The relevant part is:

[[
Unless . . . the element's src attribute has a value that is an
ignored self-reference, then, when an img is created with a src
attribute, and whenever the src attribute is set subsequently, the
user agent must resolve the value of that attribute, relative to the
element, and if that is successful must then fetch that resource. . . .

The src attribute's value is an ignored self-reference if its value is
the empty string, and the base URI of the element is the same as the
document's address.
]]

This implies user agents don't need to resolve the src or fetch the
resource if the src is empty (unless the base URI differs from the
document's address).  I
don't think they're prohibited from doing so, since there's no
detectable difference to their user-visible output -- likewise they
might fetch resources speculatively even if not explicitly required
to.  It's kind of pointless, though.
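
A minimal sketch of that reading, in Python.  The function names and
the example.test URLs are made up for illustration, and the sketch
models only the "ignored self-reference" condition, not whatever the
". . ." in the quoted text elides.  It also assumes resolution always
succeeds, which the spec does not.

```python
from urllib.parse import urljoin


def is_ignored_self_reference(src, base_uri, document_address):
    # Quoted rule: the value is the empty string AND the element's base URI
    # is the same as the document's address.
    return src == "" and base_uri == document_address


def img_fetch_target(src, base_uri, document_address):
    """Return the URL the quoted text requires fetching, or None if none is required."""
    if is_ignored_self_reference(src, base_uri, document_address):
        return None
    # Otherwise resolve the attribute value relative to the element's base URI.
    return urljoin(base_uri, src)
```

With a document at http://example.test/page.html, an empty src gives
None when the base URI is the document's own address, but resolves to
the base URI itself when something like <base href> has changed it.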

The other cases seem to make no specific exception for an empty URL,
so as far as I can tell, the UA must fetch them as usual -- although
of course it might have a valid copy in the cache.

Fetching as usual is clearly not a good idea for <iframe>, since <iframe
src=""> would then be an instant infinite loop on a typical page.  The
same goes
for a URL that consists only of a fragment.  In fact, a quick test in
the browsers I had handy (Firefox 3.5 and Opera 9.22) suggests that
there are more elaborate protections against recursion here.  Try
saving these two files in the same directory with the names
"test1.html" and "test2.html", and viewing test1.html in a web
browser:

test1.html:

<!doctype html>
<p>1</p>
<iframe src=test2.html></iframe>

test2.html:

<!doctype html>
<p>2</p>
<iframe src=test1.html></iframe>

Neither browser I tested with has an infinite loop here, although they
terminate at different steps: Firefox displays each page only once
(visible text is 1 2), while Opera displays test1.html twice (1 2 1).
Is this covered by the spec anywhere?
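
For illustration, here is one hypothetical guard that would produce
the "1 2" result: refuse to load a frame whose resolved URL, minus any
fragment, already appears in its chain of ancestor documents.  This is
only a sketch in Python, not a claim about how either browser actually
implements it.  The PAGES table, the render function and the
example.test URLs are invented for the example.

```python
from urllib.parse import urldefrag, urljoin

# Hypothetical in-memory "site": URL -> (visible text, list of iframe src values).
PAGES = {
    "http://example.test/test1.html": ("1", ["test2.html"]),
    "http://example.test/test2.html": ("2", ["test1.html"]),
    "http://example.test/self.html": ("S", ["", "#top"]),
}


def render(url, ancestors=()):
    """Return the visible text of url and its nested frames, in load order."""
    text, frames = PAGES[url]
    shown = [text]
    chain = ancestors + (url,)
    for src in frames:
        # Resolve against the containing document, then drop any fragment so
        # that "#top" compares equal to the document itself.
        target, _fragment = urldefrag(urljoin(url, src))
        if target in chain:
            continue  # guard: never nest a document inside itself
        shown.extend(render(target, chain))
    return shown
```

Under this rule render("http://example.test/test1.html") yields the
text of test1.html followed by test2.html and then stops, and the
self.html page with an empty src and a fragment-only src loads no
frames at all.  A looser rule would terminate at a later step.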

I'm not sure it makes a difference whether <script src=""></script> or
<link rel=stylesheet href=""> does anything special.  It seems simpler
to just leave them as-is, so they fetch the resource again (or
retrieve it from cache if possible) and then probably throw it out as
invalid (since it's HTML and not CSS/JS/etc.).

> I’m interested in what others’ opinions on this may be, as this seems like
> an important area in which to gain consistency.

Why?  It seems like fairly unlikely markup.  Consistency is good, but
I wouldn't call this point "important".