See some research here: http://code.google.com/p/html5lib/issues/detail?id=93

It seems that, for the sake of compatibility, unquoted attribute values should ban more than whitespace and the characters " ' = < >. The whole range U+0000 through U+0020 (control characters, which also covers the ASCII whitespace characters) should be banned, as well as U+0060 (backtick `).
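To make the proposed rule concrete, here is a minimal Python sketch of the check a serializer could run before emitting an attribute value without quotes. The names FORBIDDEN_UNQUOTED and can_leave_unquoted are hypothetical, not part of html5lib or any other library, and treating the empty string as "must be quoted" is an extra assumption on my part rather than something stated above.

```python
# Characters proposed as banned from unquoted attribute values:
# U+0000 through U+0020 (this range includes tab, LF, FF, CR and space),
# plus the characters " ' = < > and U+0060 (backtick).
FORBIDDEN_UNQUOTED = frozenset(
    [chr(cp) for cp in range(0x00, 0x21)] + list("\"'=<>`")
)


def can_leave_unquoted(value: str) -> bool:
    """Hypothetical helper: True if `value` contains no banned character.

    Assumption: an empty value is reported as needing quotes, since there
    would be nothing to write after the equals sign.
    """
    return bool(value) and not any(ch in FORBIDDEN_UNQUOTED for ch in value)


if __name__ == "__main__":
    for sample in ["foo", "a b", "a`b", "a\x00b", ""]:
        print(repr(sample), can_leave_unquoted(sample))
```

A set lookup per character keeps the check simple to audit against the list above; a precompiled regular expression would work just as well.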