[whatwg] The problem of duplicate ID as a security issue
ian at hixie.ch
Wed Jun 6 15:20:18 PDT 2007
On Fri, 10 Mar 2006, Alexey Feldgendler wrote:
> Does the current version of the spec define what happens to elements
> with duplicate ID values?
No. It's something we should consider for fixes to DOM3 Core, though.
> The problem of duplicate ID isn't just another issue where it's nice to
> have some well-defined error recovery just for uniformity. There are
> cases when duplicate IDs should be viewed as a security concern.
> Consider a script which augments the HTML page after it has been parsed
> by attaching event listeners to elements in the DOM tree, inserting new
> nodes into the tree etc. This is common practice, for example, for many
> web-based WYSIWYG editors. In this scenario, any method the script uses
> for identification of the DOM nodes subject to augmentation is
> vulnerable to possible spoofing by user-supplied content present on the
> same page.
> For example, imagine a script which finds a button by ID and attaches an
> event listener to it. A possible markup looks like this:
> ...blog entry body...
> <button id="addtomemories">Add this entry to memories</button>
> So, a malicious blog author can make the following entry:
> I have found a <a href="#" id="addtomemories">cool website</a>.
> Depending on how the browser handles duplicate IDs, either or both of
> the following unwanted effects may occur:
> 1. Clicking the link in the blog entry adds the entry to memories list
> of the reader.
> 2. Clicking the real "Add this entry to memories" button does nothing.
> One can think of other examples, possibly more dangerous. Other methods
> of identification (by tag name, by class, by CSS selector as proposed
> recently) are also vulnerable.
> This kind of attack is hard to circumvent through use of HTML cleaners
> because id="addtomemories" looks like an innocent attribute, like an
> anchor for navigation.
It's not that hard to avoid. You can whitelist which attribute values are
allowed (e.g. only IDs consisting of "comment" followed by the comment
number followed by 1 to 10 characters in the range a-z).
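
A minimal sketch of that whitelist rule, assuming the cleaner already has
each id value as a string and knows the number of the comment it is
processing. The name isAllowedId and the exact pattern are illustrative
only, not taken from any spec or library.

```javascript
// Hypothetical helper: accept an id only if it is "comment" + the comment
// number + 1 to 10 characters in the range a-z. Everything else is rejected.
function isAllowedId(value, commentNumber) {
  if (typeof value !== "string") {
    return false;
  }
  if (!Number.isInteger(commentNumber) || commentNumber < 0) {
    return false;
  }
  // commentNumber is a validated non-negative integer, so it is safe to
  // splice into the pattern without escaping.
  const pattern = new RegExp("^comment" + commentNumber + "[a-z]{1,10}$");
  return pattern.test(value);
}
```

With this rule, "comment42intro" passes for comment 42, while
"addtomemories" is rejected because it lacks the per-comment prefix.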
> Preventing such attacks by an HTML cleaner would require either making a
> full list of all "forbidden" IDs, class names etc, or imposing Draconian
> rules upon user-supplied content, completely disallowing such useful
> attributes as id and class.
I'm not really convinced there's that much use in user-supplied IDs and
classes, but the rules needn't be that draconian. The server could
automatically prepend the commentN string to IDs and classes.
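
One way the prepending might look, as a sketch only. The helper names
prefixName and prefixClassList are hypothetical, and the restriction of
user-supplied names to a-z is an assumption made here to keep the output
predictable.

```javascript
// Hypothetical helper: namespace one user-supplied name under "comment<N>".
// Returns null for anything that is not a plain lowercase a-z name.
function prefixName(name, commentNumber) {
  if (!/^[a-z]+$/.test(name)) {
    return null;
  }
  return "comment" + commentNumber + name;
}

// Hypothetical helper: apply the same prefix to every name in a
// whitespace-separated class attribute, dropping names that are rejected.
function prefixClassList(value, commentNumber) {
  return value
    .split(/\s+/)
    .filter((name) => name !== "")
    .map((name) => prefixName(name, commentNumber))
    .filter((name) => name !== null)
    .join(" ");
}
```

Under these assumptions, a user-supplied id="addtomemories" in comment 7
becomes id="comment7addtomemories", which no longer collides with the
page's own button.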
To be safe, a server's cleaning code must whitelist everything --
elements, attribute names, attribute values, element contents, etc. It's
not trivial, but that's no excuse for not doing it.
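
To make "whitelist everything" concrete, here is a rough sketch that works
on an already-parsed tree rather than on raw markup. The node shape
{ tag, attrs, children }, the ALLOWED table and the cleanNode name are all
assumptions for illustration. A real cleaner would also need a safe parser
in front of it and an escaping serializer behind it.

```javascript
// Hypothetical whitelist: allowed elements, each mapped to its allowed
// attributes and a predicate that validates the attribute value.
// Maps are used so that names like "constructor" cannot match by accident.
const ALLOWED = new Map([
  ["p", new Map()],
  ["em", new Map()],
  ["a", new Map([["href", (v) => /^https?:\/\/[^\s"'<>]+$/.test(v)]])],
]);

// Returns a cleaned copy of the node, or null when the element is not
// whitelisted. In this sketch a rejected element is dropped together with
// its contents; unlisted attributes (including id and class) are dropped.
function cleanNode(node) {
  if (typeof node === "string") {
    return node; // text is kept; the serializer must still escape it
  }
  const rules = ALLOWED.get(node.tag);
  if (rules === undefined) {
    return null;
  }
  const attrs = {};
  for (const [name, value] of Object.entries(node.attrs || {})) {
    const isValid = rules.get(name);
    if (isValid !== undefined && typeof value === "string" && isValid(value)) {
      attrs[name] = value;
    }
  }
  const children = (node.children || [])
    .map(cleanNode)
    .filter((child) => child !== null);
  return { tag: node.tag, attrs, children };
}
```

Nothing passes by default here: an element, attribute or value survives
only if the table explicitly permits it.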
> Necessary but not sufficient. Duplicate IDs aren't caught by a
> validating parser, so custom code is needed to enforce many of the
> requirements. For example, if one was trying to ensure that all IDs are
> unique, then the ID values within the user-supplied code would have to
> be checked for duplicates among them, too.
This is already the case, yes.
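
A sketch of that extra check, assuming the cleaner has collected the ids
found in the user-supplied content and knows which ids the page itself
uses. The function name findConflictingIds and both parameters are
hypothetical.

```javascript
// Hypothetical check: ids must be unique among the user-supplied content
// itself and must not collide with ids the page template already uses.
// Returns each conflicting id once, in order of first conflict.
function findConflictingIds(userIds, reservedIds) {
  const seen = new Set(reservedIds);
  const conflicts = new Set();
  for (const id of userIds) {
    if (seen.has(id)) {
      conflicts.add(id);
    }
    seen.add(id);
  }
  return [...conflicts];
}
```

A cleaner could reject the submission, or strip the offending ids,
whenever the returned list is non-empty.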
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'