[whatwg] Inconsistent behavior for empty-string URLs
Ian Hickson
ian at hixie.ch
Mon Mar 8 16:40:31 PST 2010
On Mon, 7 Dec 2009, Nicholas Zakas wrote:
>
> [...] I found that there are several instances where the browser will
> make a second [request] to the page based on resolving empty-string URLs
> in several tags.
On Mon, 7 Dec 2009, Aryeh Gregor wrote:
>
> This is clearly not a good idea for <iframe>, since otherwise <iframe
> src=""> is an instant infinite loop on a typical page.
On Tue, 15 Dec 2009, Nicholas Zakas wrote:
>
> Here's what I would propose:
>
> Empty string attributes for HTML elements specifying resources to
> automatically download are considered invalid and don't cause a request
> to be sent.
On Tue, 15 Dec 2009, Jonas Sicking wrote:
>
> I'd prefer to explicitly enumerate the elements we're talking about,
> rather than giving rules which risk being interpreted differently by
> different people. [...]
>
> So the specific list would then be:
>
> <img>
> <link>
> <script>
> <iframe>
> <video>
> <audio>
> <object>
> <embed>
> <source>
> <input type=image>
>
> All of these would never attempt to fetch a resource if the src/href
> attribute is empty (even if the current baseuri is different from the
> document uri). However it would not act as if the attribute was not set
> (important for <script>).
On Tue, 15 Dec 2009, Aryeh Gregor wrote:
>
> I'd say the rule should be that if the type is text/html or unknown, ""
> should work. If it's known to be some other type, like text/css, then
> it should fail. Alternatively, it should work for everything that
> doesn't actually fetch a resource automatically. After all, the
> rationale for this whole change is that "" as a source for images and
> such 1) makes no sense and is almost certainly an authoring mistake, and
> 2) causes extra HTTP requests -- but neither is true for all <link>s.
> For instance, <link rel=first href=""> makes perfect sense and causes no
> extra requests, so I don't think it should be prohibited.
On Tue, 15 Dec 2009, Jonas Sicking wrote:
>
> Interesting. I don't think we want to base it on the type attribute,
> since that should generally be possible to leave out. But I can
> certainly see a use for <link rel=sitemap href="">.
>
> So maybe just apply the don't-download rule to rel=stylesheet (and
> rel="stylesheet alternate" etc).
On Wed, 16 Dec 2009, Simon Pieters wrote:
>
> I think only icon, prefetch and stylesheet links.
>
> The following element defines two links, one of which would be ignored:
>
> <link rel="icon index" href>
>
> [<video poster>?]
> <command icon>?
> <html manifest>?
> <applet code>? (Maybe not, since it's more of a parameter to the Java plugin.)
> <frame src>?
On Thu, 17 Dec 2009, Simon Pieters wrote:
>
> I asked Philip to provide some data about pages using empty attributes
> for these:
>
> <Philip`> zcorpan: http://philip.html5.org/data/empty-url-attributes.txt
> <Philip`> zcorpan: http://philip.html5.org/data/empty-url-link-attributes.txt
On Thu, 17 Dec 2009, Nicholas Zakas wrote:
>
> <img src="">
> IE 8 and earlier: makes a request
> FF 3 and earlier: makes a request
> FF 3.5: does not make a request
> Safari 4 and earlier: makes a request
> Chrome 3 and earlier: makes a request
> Opera 10 and earlier: does not make a request
>
> <link href="">
> IE 8 and earlier: does not make a request
> FF 3.5 and earlier: makes a request
> Safari 4 and earlier: makes a request
> Chrome 3 and earlier: makes a request
> Opera 10 and earlier: does not make a request
>
> <script src="">
> IE 8 and earlier: does not make a request
> FF 3.5 and earlier: makes a request
> Safari 4 and earlier: makes a request
> Chrome 3 and earlier: makes a request
> Opera 10 and earlier: does not make a request
>
> <iframe src="">
> IE 8 and earlier: does not make a request
> FF 3.5 and earlier: does not make a request
> Safari 4 and earlier: does not make a request
> Chrome 3 and earlier: does not make a request
> Opera 10 and earlier: does not make a request
On Thu, 17 Dec 2009, Simon Pieters wrote:
>
> Is the result different if the base URL is different from the document's URL?
> Is the result different if the value is "#"?
On Fri, 18 Dec 2009, Simon Pieters wrote:
>
> http://simon.html5.org/dump/empty-url-attributes.xml
>
> <img src>, 3221 occurrences
> <iframe src>, 1862 occurrences
> <body background>, 1665 occurrences
> <script src>, 248 occurrences
> <embed src>, 74 occurrences
> <input src>, 55 occurrences
> <frame src>, 53 occurrences
> <video src>, 0 occurrences
> <video poster>, 0 occurrences
> <audio src>, 0 occurrences
> <object data>, 0 occurrences
> <source src>, 0 occurrences
> <command icon>, 0 occurrences
> <html manifest>, 0 occurrences
> <applet code>, 0 occurrences
>
> http://simon.html5.org/dump/empty-url-link-attributes.xml
>
> <link rel=icon>, 243 occurrences
> <link rel=stylesheet>, 115 occurrences
> <link rel=prefetch>, 0 occurrences
On Fri, 18 Dec 2009, Simon Pieters wrote:
>
> I've now looked at a selection of random URLs.
>
> Conclusion: None of these seem to need a request to be made. img should
> fire an error event. iframe and frame should use about:blank.
On Mon, 21 Dec 2009, Nicholas Zakas wrote:
>
> Here are the results of testing various tags with empty URLs across
> different browsers. The table below indicates how many requests are sent
> when the given tag is encountered on the page (curiously, Firefox 3
> sometimes sends two extra requests). Even though the <link> tags don't
> show it in the table, they all had href="".
>
>                              IE7  IE8  FF3  FF3.5  SF4  Ch3  Op10
> <img src="">                 1    1    1    0      1    1    0
> <input type="image" src="">  1    1    1    0      1    1    0
> <object data="">             0    0    1    1      0    0    0
> <script src="">              0    0    1    1      1    1    0
> <link rel="stylesheet">      0    0    1    1      1    1    0
> <link rel="icon">            0    0    2    1      1    1    0
> <link rel="shortcut icon">   0    0    2    1      1    1    0
> <link rel="prefetch">        0    0    2    0      0    0    0
> <iframe src="">              0    0    0    0      0    0    0
> <embed src="">               0    0    0    0      0    0    0
> <html manifest="">           0    0    0    0      1    0    0
>
> For the most part, no two browsers act the same. Safari and Chrome are
> the closest (not surprising).
>
> Applying a base URL via <base> in all cases didn't change the results,
> except in IE, where it prevented the extra image request from being
> made.
On Tue, 22 Dec 2009, Simon Pieters wrote:
>
> Thanks. IIRC, IE doesn't make a request when using minimized attribute
> syntax, i.e. "<img src>" (because it drops the attribute during
> parsing).
On Thu, 7 Jan 2010, Nicholas Zakas wrote:
>
> Given the disparate browser implementations for dealing with empty
> string URLs, it seems unlikely that anyone is relying upon the current
> behaviors, so I'd like to suggest this change be added to HTML5:
>
> Any <img>, <link>, <script>, <iframe>, <audio>, <video>, <object>,
> <embed>, <input>, <html manifest>, or <frame> tag that would result in
> an automatic download of an external resource must ignore any empty
> string URL and not download the external resource. This is true even
> when a <base href> is applied to the page.
On Mon, 7 Dec 2009, Jonas Sicking wrote:
>
> Given that the concern is sites that accidentally leave an attribute
> empty, wouldn't you want to prevent a request from going out even if the
> base-uri is set? I.e. wouldn't you want to prevent a request from going
> out for the current document:
>
> foo.html:
> <head><base href="bar.html">
> <body><img src="">
>
> It seems to me equally unlikely that someone would do that
> intentionally expecting a request to be sent to "bar.html"?
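For illustration, a minimal TypeScript sketch of why an empty attribute value turns into a request at all: resolving "" against a base URL yields the base URL itself. The example.com host is made up; only the file names foo.html and bar.html come from the example above, and the sketch assumes the <base> element points at bar.html.

```typescript
// Hypothetical absolute URLs; only the file names come from the quoted example.
const documentUrl = "http://example.com/foo.html";
const baseUrl = "http://example.com/bar.html";

// Resolve an attribute value against a base using the standard URL constructor.
function resolve(value: string, base: string): string {
  return new URL(value, base).href;
}

// Without <base>, an empty src resolves to the document itself.
const withoutBase = resolve("", documentUrl);
// With <base> pointing at bar.html, the same empty src resolves to bar.html.
const withBase = resolve("", baseUrl);

console.log(withoutBase); // http://example.com/foo.html
console.log(withBase);    // http://example.com/bar.html
```

Either way the resolved URL is an HTML page rather than an image, which is why the request is almost certainly unintended.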
On Tue, 8 Dec 2009, Nicholas Zakas wrote:
>
> I'd agree with that; I've yet to find an example of someone
> intentionally including an empty-string URL in one of these tags.
Done.
Note that as a side-effect, <link rel=index href=""> is now
non-conforming, although <a rel=index href=""></a> is still ok. I couldn't
find a sane way to work around that.
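As a rough sketch of the kind of check under discussion (not the spec text): the element and attribute pairs below follow the lists proposed in this thread, and the names URL_ATTRIBUTE and wouldAttemptFetch are hypothetical. It models only whether a request would be attempted; per the point made above about <script>, an empty attribute is still distinct from an absent one in other respects, and the <link> conformance wrinkle is not captured here.

```typescript
// Element -> URL attribute, following the proposals quoted in this thread.
// This mapping is an assumption for illustration, not the spec's own list.
const URL_ATTRIBUTE: Record<string, string> = {
  img: "src",
  link: "href",
  script: "src",
  iframe: "src",
  frame: "src",
  audio: "src",
  video: "src",
  source: "src",
  embed: "src",
  input: "src",
  object: "data",
  html: "manifest",
};

// Returns true if this sketch would let the element trigger an automatic fetch.
function wouldAttemptFetch(tag: string, attributes: Record<string, string>): boolean {
  const name = URL_ATTRIBUTE[tag.toLowerCase()];
  if (name === undefined) return false; // element not covered by this sketch
  const value = attributes[name];
  if (value === undefined) return false; // attribute absent: nothing to fetch
  return value !== ""; // present but empty: no request, whatever the base URL
}

console.log(wouldAttemptFetch("img", { src: "" }));      // false
console.log(wouldAttemptFetch("img", { src: "a.png" })); // true
```

Note that the check looks at the raw attribute value before any resolution, so a <base href> has no effect on the outcome.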
On Mon, 7 Dec 2009, Aryeh Gregor wrote:
>
> The same goes for a URL that consists only of a fragment. In fact, a
> quick test in the browsers I had handy (Firefox 3.5 and Opera 9.22)
> suggests that there are more elaborate protections against recursion
> here. Try saving these two files in the same directory with the names
> "test1.html" and "test2.html", and viewing test1.html in a web browser:
>
> <!doctype html>
> <p>1</p>
> <iframe src=test2.html>
>
> <!doctype html>
> <p>2</p>
> <iframe src=test1.html>
>
> Neither browser I tested with has an infinite loop here, although they
> terminate at different steps: Firefox displays each page only once
> (visible text is 1 2), while Opera displays test1.html twice (1 2 1). Is
> this covered by the spec anywhere?
This falls into the "hardware limitations" clause.
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'