[whatwg] several messages about XML syntax and HTML5

Ian Hickson ian at hixie.ch
Wed Dec 6 17:14:48 PST 2006

On Tue, 5 Dec 2006, Sam Ruby wrote:
> > >
> > >    xmlns attributes are invalid on HTML elements except html, and
> > >    when found on unrecognized [elements] imply style="display:none"
> > >    unless you recognize the value of this attribute.
> > 
> > There are millions of documents that would be "broken" by such a rule, 
> > so browser vendors couldn't actually deploy that, sadly. :-(
> Can you identify three independently produced ones?

Sure. Here's one (many pages on that site have this problem):


It has a block at the bottom that says:

   <copyright xmlns="" xml:lang="en">...<br>...<br>...</copyright>

(Note the cunning mixing of XML-like syntax with HTML-like syntax.)



A second one: a large chunk of the text on this page is inside elements 
with xmlns="" set (from what I can tell, all the text above the double up 
chevron button thing is inside elements with xmlns="").

A third one:


This one has markup like this (I can just imagine how this happened):

   <span>(<?xml version="1.0" encoding="UTF-8"?>
   <fromRecord xmlns="http://wvrgroup.com/propertyom">1</fromRecord> - 
   <?xml version="1.0" encoding="UTF-8"?>
   <toRecord xmlns="http://wvrgroup.com/propertyom">10</toRecord> of <?xml 
   version="1.0" encoding="UTF-8"?>
   <hitCount xmlns="http://wvrgroup.com/propertyom">24</hitCount>)</span>

Again, this is important text (it's the "(1 - 10 of 24)" text at the top 
right, clearly intended to be visible), and it is wrapped in elements 
with xmlns attributes.
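
As a rough illustration of how pages like these could be spotted, here is a minimal sketch using Python's standard html.parser module. It reports start tags that are not recognized HTML elements and carry an xmlns attribute, which are the elements the proposed rule would hide. The names KNOWN_HTML_ELEMENTS, XmlnsScanner and find_hidden_candidates are hypothetical, the element allowlist is deliberately tiny, and the sketch assumes no xmlns value is "recognized"; it is not the tool actually used for the survey described here.

```python
from html.parser import HTMLParser

# Hypothetical, deliberately incomplete allowlist of HTML element names.
# A real check would use the full list of elements from the HTML spec.
KNOWN_HTML_ELEMENTS = {"html", "head", "body", "div", "span", "p", "a", "br"}


class XmlnsScanner(HTMLParser):
    """Collects unrecognized elements that carry an xmlns attribute."""

    def __init__(self):
        super().__init__()
        self.hits = []  # list of (tag name, xmlns value) pairs

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        attr_map = dict(attrs)
        if tag not in KNOWN_HTML_ELEMENTS and "xmlns" in attr_map:
            self.hits.append((tag, attr_map["xmlns"]))


def find_hidden_candidates(markup):
    """Return the elements that the proposed rule would treat as display:none."""
    scanner = XmlnsScanner()
    scanner.feed(markup)
    scanner.close()
    return scanner.hits


if __name__ == "__main__":
    sample = '<copyright xmlns="" xml:lang="en">...<br>...<br>...</copyright>'
    for tag, value in find_hidden_candidates(sample):
        print(f"<{tag}> has xmlns={value!r} and would be hidden")
```

Run against the markup samples quoted above, this flags the copyright, fromRecord, toRecord and hitCount elements (with names lowercased by the parser), all of which contain text the page authors clearly meant to be visible.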

That's three. I found dozens more (and I only checked a few thousand 
pages at random), including:

   The entire header text ("John Epstein") on that page is all inside an
   element <display_name> which has an xmlns="" attribute.

   A bunch of snippets are inside elements with xmlns="".

   Not clear if it was intentional here, but some of the visible text at 
   the bottom right is in an xmlns="" block.

   Unclear what they thought was going on here too, but the text at the 
   top is inside an unknown element with xmlns="" set.

   There are eight bazillion xmlns="" attributes in this file, but the 
   copyright in particular uses an unknown HTML element with xmlns="".

...and I'll stop here, because that should be enough to convince you. :-)

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
