[whatwg] Dealing with UI redress vulnerabilities inherent to the current web

Smylers Smylers at stripey.com
Sat Sep 27 00:53:16 PDT 2008


Elliotte Harold writes:

> People want to get pictures, text, and other media from the web.
> People want to play games and use some apps. Users don't care where
> the media is loaded from. If it can be loaded from a single server,
> then the users' needs are met.
>
> I see no genuine user use cases that require multisite access within
> a single page.

Well, it depends on what you mean by "genuine".  I can think of several
scenarios where it currently has to be done that way, but any of them
could be dismissed as "not genuine" if you're prepared to impose
additional (technical, financial, business-model, development,
networking) restrictions on folk running websites.

> That's a sometimes convenient feature for site developers, but
> there's nothing you can do with content loaded from two sites you
> can't do with content loaded from one.

Here are some I can think of:

* Many sites are funded by displaying adverts from a third-party service
  which picks appropriate ads for the current user-page combination.

  To work from a single host, all potential adverts' content would
  have to be supplied to the website before it could be used --
  thereby forcing the website authors into the regular maintenance of
  adding to the pool of ads, and denying them the opportunity to leave
  their website alone and let the income accrue from the third-party
  ads.
  
  It would also require the third party to be happy to provide the
  software for picking which ad to use for a request -- software which
  is probably proprietary -- and it would give them the burden of
  supporting authors using the software and issuing updates to it.

  And it would require the author's site to run on a server which
  allows the third-party software to run; it could no longer be a
  purely static website.

  Further, I don't see how users could then be tracked across multiple
  sites.  That tracking is useful for serving users a variety of
  different ads, rather than the same one lots of times, even as they
  read multiple sites which all use the same third-party ad service.
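
  As an illustration, the embed such an ad service hands to authors is
  typically just a script or iframe pointing at the network's own
  server -- everything below, including the hostname, is a made-up
  sketch rather than any real service's code:

      <!-- hypothetical third-party ad slot -->
      <script type="text/javascript"
        src="http://adserver.example/serve.js?slot=sidebar"></script>

      <!-- or, script-free, an iframe the ad server fills in -->
      <iframe src="http://adserver.example/ad?slot=sidebar"
        width="300" height="250" frameborder="0"></iframe>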

* Some sites allow users to add comments to pages, with limited HTML
  allowed in the comments.  The permitted HTML can include <img> tags,
  linked to images served elsewhere.

  In the case of comments being provided in an HTML form, it would of
  course be possible to develop the software to include the capability
  for uploading files with a comment, and only allow <img> tags to
  link to such content.  But that involves changing the website
  software (which may have been written years ago, by a third party no
  longer involved in the site).

  And it's conceivable (though I admit, unlikely) that comments could be
  provided by other means, such as text messages, yet still contain HTML
  links to images -- in which case it's unclear how the user could
  upload the image.
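
  For example, a commenter might legitimately post something like this
  (the hostname is made up), which only works if the page is allowed
  to load images from another site:

      <p>Here's the chart I mentioned:</p>
      <img src="http://photos.example.net/users/bob/chart.png"
        alt="Bob's sales chart">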

* If a popular campaign issues a video, encouraging fans to include it
  on their websites, they currently just need to paste the supplied HTML
  into their site.  Having to download and re-upload the video adds to
  the effort needed to show their support.
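
  The supplied HTML is typically a short snippet in the object/embed
  style currently common for video; the URL and sizes here are a
  made-up illustration:

      <object width="425" height="344">
        <param name="movie"
          value="http://video.example.org/player.swf?clip=campaign">
        <embed
          src="http://video.example.org/player.swf?clip=campaign"
          type="application/x-shockwave-flash"
          width="425" height="344"></embed>
      </object>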

* Further, if successful there'll be thousands of different copies of
  this video on the net.  This hampers ISPs' (and even browsers')
  ability to cache it, in the way that they could if everybody were
  linking to a single instance of it.

* Sometimes such campaigns include countdowns or release schedules to a
  particular event ("10 days to go until ...").  If the iframe or image
  or whatever is hosted by those running the event then they can update
  it accordingly, and it will be correct on all the supporting sites
  kindly linking to it.

  But if the 'fans' have to download each change and re-apply it to
  their own sites, many will likely get out of date.

  Or, similarly, an image linking to a specific book on a bookseller's
  website -- which the bookseller ensures always contains the current
  price.
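
  A sketch of the sort of snippet the organisers might hand out (the
  hostname is hypothetical); they change what that URL serves, and
  every supporting site stays current without any edits:

      <img src="http://event.example.com/countdown.png"
        alt="Days to go">
      <!-- or, if the countdown needs markup rather than an image -->
      <iframe src="http://event.example.com/countdown.html"
        width="200" height="60" frameborder="0"></iframe>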

* Third-party traffic analysis services, ranging from simple image
  hit-counters to something like Google Analytics, have to be loaded
  as part of the page itself.

  Of course, hit counters are trivial to code, but they require
  dynamic hosting accounts.  And the third parties performing the
  advanced analysis are unlikely to provide their server-side code.
  Even if they did, I'd guess both it and the embedded JavaScript
  undergo frequent revisions, which authors can currently ignore.
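
  Both kinds of service boil down to a one-line reference to the third
  party's host: an image whose request gets counted, or a script which
  reports the page view back.  The hostnames below are made up:

      <!-- hit counter: the third party serves the image and counts
           each request for it -->
      <img src="http://counter.example.com/count?site=mysite"
        alt="hit counter">

      <!-- script-based analytics: the third party's JavaScript
           reports this page view to their servers -->
      <script type="text/javascript"
        src="http://analytics.example.com/track.js"></script>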

* The copyright owners of some media may be happy for others to embed
  it, but not to copy it and host it elsewhere.  For example, because
  they want to make it available for only a limited period, or so they
  can count how many hits are served.

  So it would be illegal for an author to copy it and serve it directly.

* Some HTML mail contains links to images hosted on the sender's
  website.  This isn't really third-party content, but by the time I'm
  reading the message it's been saved to a local file: URL, so all the
  images are external.

  Web-based mail readers would suffer similarly on such messages.

  Images can be embedded in HTML messages, but that doesn't always work
  well, and it can significantly increase the size of the messages
  (which can in turn thwart the sender's ability to mail all of its
  subscribers in a timely fashion).
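
  Roughly, a remote image is a one-line reference, while an embedded
  one (the cid: form refers to a MIME part attached to the message)
  carries the full image data in every copy sent.  The addresses are
  illustrative:

      <!-- remote: keeps the message small, but needs the mail reader
           to load images from another host -->
      <img src="http://www.example.com/news/header.png" alt="Header">

      <!-- embedded: self-contained, but bulks out every copy -->
      <img src="cid:header.png@example.com" alt="Header">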

* A Google Cache view of a webpage can be useful.  It links to images
  and style-sheets on the live website, hence on a different host.
  Clearly Google could also cache all the media, but why should they
  have to?  The service is useful as it is.

* Many large sites serve images from a different domain from their
  other content, or from multiple different domains, to bypass browser
  limits on the number of simultaneous connections to a single host.
  
  Forcing all the media to a single host would make such sites take
  longer to load.  Getting rid of the browser throttling risks browsers
  overpowering smaller sites which can't cope with so many connections.
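
  For example (hostnames hypothetical), a page can spread its images
  across several aliases of the same servers, so a browser which only
  opens a couple of connections per host fetches more of them in
  parallel:

      <img src="http://img1.example.com/photos/beach.jpg" alt="Beach">
      <img src="http://img2.example.com/photos/cliff.jpg" alt="Cliff">
      <img src="http://img3.example.com/photos/dunes.jpg" alt="Dunes">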

* Pages that just happen to link to images on other sites, for no
  particularly good reason, would break.  Such sites could be
  re-implemented not to do this (without suffering from any of the above
  problems).

  The only problem is that they would have to be re-implemented.  Many
  webpages were abandoned years ago, yet they still have utility.
  Given that images are a basic feature which people have been relying
  on for so long, this would break many, many sites.  It isn't
  reasonable to expect them all to undergo active maintenance.

> I challenge anyone to demonstrate a single multisite web page that
> cannot be reproduced as a single-site page. Do not confuse details of
> implementation with necessity. Just because we sometimes put images,
> ads, video, tracking scripts, and such on different sites doesn't mean
> we have to. The web would be far more secure if we locked this down,
> and simply made multisite pages impossible.

I agree it would be more secure.  But I don't see how we get there from
here, even over multiple years.

The first browser to implement such a restriction would break so many
sites that its users would all switch to a browser that kept the web
working as it has till now.

And, knowing that, why would website authors bother to make the first
move?

Smylers


