[whatwg] several messages about XML syntax and HTML5

Sam Ruby rubys at intertwingly.net
Wed Dec 6 18:10:41 PST 2006


Ian Hickson wrote:
> On Tue, 5 Dec 2006, Sam Ruby wrote:
>>>>    xmlns attributes are invalid on HTML elements except html, and
>>>>    when found on unrecognized [elements] imply style="display:none"
>>>>    unless you recognize the value of this attribute.
>>> There are millions of documents that would be "broken" by such a rule, 
>>> so browser vendors couldn't actually deploy that, sadly. :-(
>> Can you identify three independently produced ones?
> 
> Sure. Here's one (many pages on that site have this problem):
> 
>    http://forskningsbasen.deff.dk/ddf/rec.external?id=auc107991
> 
> It has a block at the bottom that says:
> 
>    <copyright xmlns="" xml:lang="en">...<br>...<br>...</copyright>
> 
> (Note the cunning mixing of XML-like syntax with HTML-like syntax.)
> 
> 
> Another:
> 
>    http://www.cms.alaswaq.net/save_print.php?save=1&cont_id=4372
> 
> A large chunk of the text on this page is inside elements with xmlns="" 
> set (from what I can tell, all the text above the double up chevron button 
> thing is inside elements with xmlns="").
> 
> 
> A third one:
> 
>    http://www.homeaway.com/Varna/s/1453/fa/find.squery
> 
> This one has markup like this (I can just imagine how this happened):
> 
>    <span>(<?xml version="1.0" encoding="UTF-8"?>
>    <fromRecord xmlns="http://wvrgroup.com/propertyom">1</fromRecord> - 
>    <?xml version="1.0" encoding="UTF-8"?>
>    <toRecord xmlns="http://wvrgroup.com/propertyom">10</toRecord> of <?xml 
>    version="1.0" encoding="UTF-8"?>
>    <hitCount xmlns="http://wvrgroup.com/propertyom">24</hitCount>)</span>
> 
> Again, important text (it's the "(1 - 10 of 24)" text at the top right, 
> clearly intended to be visible), which is wrapped in elements with 
> xmlns="" attributes.
> 
> 
> That's three. I found dozens more (and I only checked a few thousand 
> pages at random), including:
> 
>    http://ise.uvic.ca/Theater/sip/person/7639/main.html
>    The entire header text ("John Epstein") on that page is all inside an
>    element <display_name> which has an xmlns="" attribute.
>    
>    http://global.yesasia.com/kr/artIdxDept.aspx/section-videos/code-c/aid-39826/
>    A bunch of snippets are inside elements with xmlns="".
> 
>    http://intermezzo-weblog.blogspot.com/2005/05/o-caso-rondnia-e-mais.html
>    Not clear if it was intentional here, but some of the visible text at 
>    the bottom right is in an xmlns="" block.
> 
>    http://projects.teknowledge.com/DAML/Corpus/W/wrestling_match.html
>    Unclear what they thought was going on here too, but the text at the 
>    top is inside an unknown element with xmlns="" set.
> 
>    http://194.7.45.68/fr/item.php?text_id=51813&keyw=Snoop+Dogg
>    There are eight bazillion xmlns="" attributes in this file, but the 
>    copyright in particular uses an unknown HTML element with xmlns="".
> 
> ...and I'll stop here, because that should be enough to convince you. :-)

The common pattern that I see is xmlns="".
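
For anyone who wants to check a page for this pattern, here is a rough sketch in Python, using only the standard library's html.parser. It lists start tags that carry an xmlns attribute on an element name it does not recognize, together with the attribute value, so the empty xmlns="" cases can be told apart from the ones naming a real namespace. The KNOWN_HTML set and the scan() helper are my own illustrative names, and the set is deliberately incomplete; treat this as an assumption-laden sketch, not as what any browser does.

```python
from html.parser import HTMLParser

# Hypothetical, deliberately incomplete set of recognized HTML element names.
# A real check would need the full list of elements a given browser knows.
KNOWN_HTML = {"html", "head", "body", "title", "div", "span", "p", "br", "a"}


class XmlnsScanner(HTMLParser):
    """Collects (tag, xmlns value) pairs for unrecognized elements."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        if tag in KNOWN_HTML:
            return
        for name, value in attrs:
            if name == "xmlns":
                # A valueless attribute is reported as None; record it as "".
                self.hits.append((tag, value or ""))


def scan(markup):
    """Return a list of (tag, xmlns value) pairs found in the markup."""
    scanner = XmlnsScanner()
    scanner.feed(markup)
    scanner.close()
    return scanner.hits


if __name__ == "__main__":
    sample = (
        '<span>(<fromRecord xmlns="http://wvrgroup.com/propertyom">1'
        '</fromRecord>)</span><display_name xmlns="">text</display_name>'
    )
    for tag, value in scan(sample):
        kind = "empty" if value == "" else "non-empty"
        print(tag, repr(value), kind)
```

Since the parser lowercases names, fromRecord shows up as fromrecord. Filtering the result on value == "" gives just the xmlns="" cases.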

- Sam Ruby


