[whatwg] id and xml:id
Henri Sivonen
hsivonen at iki.fi
Sun Apr 2 03:58:46 PDT 2006
UAs handle whitespace in the id attribute inconsistently (see below),
old specs imply or require whitespace trimming, and ids containing
whitespace cannot be referenced from whitespace-separated lists of ids.
I therefore suggest adding the following language concerning document
conformance:
The value of the id attribute must be a string that consists of one
or more characters matching the following production: [#x21-#xD7FF]|
[#xE000-#xFFFD]|[#x10000-#x10FFFF] (any XML 1.0 character excluding
whitespace).
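As a rough illustration of the production above, here is a minimal Python sketch of a conformance check. The names is_conforming_id and split_id_list are hypothetical and not part of any spec; the second helper assumes that an id list is split on the four XML whitespace characters.

```python
import re

# Character class taken from the proposed production:
# [#x21-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ID_VALUE = re.compile(
    r"[\u0021-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]+"
)

# Assumption: lists of ids are separated by XML whitespace
# (space, tab, line feed, carriage return).
_XML_WS = re.compile(r"[\x20\x09\x0A\x0D]+")


def is_conforming_id(value: str) -> bool:
    """Return True if value is one or more non-whitespace XML 1.0 characters."""
    return _ID_VALUE.fullmatch(value) is not None


def split_id_list(list_value: str) -> list[str]:
    """Split a whitespace-separated list of ids into its tokens."""
    return [token for token in _XML_WS.split(list_value) if token]
```

With these helpers, is_conforming_id(" c ") is false, and split_id_list("a b") yields two tokens, so an element whose id is literally "a b" could never be named by such a list.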
I also suggest requiring that an element must not have both id and
xml:id, and that xml:id must not occur in the HTML serialization.
(Again, this is from the document conformance point of view; I am
not disputing requirements on browsers.)
Rationale:
HTML doesn't have namespace processing of colonified names and the
xml:id spec is not designed for HTML. Allowing xml:id in HTML feels
intuitively wrong (perhaps even a bit evil :-).
If an element had both an id attribute and an xml:id attribute with
different values, the document would not be HTML-serializable, which
would be bad. (Obviously, even with only one kind of ID attribute on
an element, in round tripping from XHTML to HTML to XHTML, the
information about whether the original attribute was id or xml:id is
lost just like the information about whether a table had a tbody is
lost.)
If an element were allowed to have an id attribute and an xml:id
attribute with the same value, the following constraint from the xml:id
spec would be violated even for conforming docs:
"An xml:id processor should assure that the following constraint holds:
* The values of all attributes of type “ID” (which includes all
xml:id attributes) within a document are unique."
( http://www.w3.org/TR/xml-id/ )
Assuming, of course, that the XHTML5 id can still be considered an ID
in the XML sense.
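To make the two cases above concrete, here is a small Python sketch using the standard xml.etree.ElementTree module on an XHTML-like fragment. It assumes, as the caveat above does, that a plain id attribute counts as an ID; the function names and the sample fragment are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# ElementTree reports xml:id under the predeclared XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def elements_with_both_ids(root):
    """Return elements that carry both id and xml:id."""
    return [
        el for el in root.iter()
        if "id" in el.attrib and XML_ID in el.attrib
    ]


def duplicate_id_values(root):
    """Return ID values that occur more than once, counting id and xml:id."""
    counts = Counter()
    for el in root.iter():
        for name in ("id", XML_ID):
            if name in el.attrib:
                counts[el.attrib[name]] += 1
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical sample: the first p has both attributes with the same value.
doc = ET.fromstring('<div><p id="a" xml:id="a"/><p id="b"/></div>')
```

For this sample, elements_with_both_ids(doc) finds the first p element, and duplicate_id_values(doc) reports the value "a" as non-unique, which is the violation described above.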
Finally, as the ultimate ID nitpicking, the spec should state that it
is naughty of authors to turn attributes other than id and xml:id
into IDs via the DTD. (Well, using a DTD at all is naughty. :-)
- -
Test case: http://hsivonen.iki.fi/test/wa10/adhoc/id.html
The script looks up each id by its value with the whitespace removed,
to see whether whitespace is trimmed before ID assignment.
Firefox:
id='a' PASS
id='2' PASS
id='<' PASS
id=',' PASS
id='ä' PASS
id=' c ' FAIL
id='\nd\n' PASS
id='\t\te\t\t' PASS
id='
f
' PASS
Opera (weekly build 3312; note that Opera recently changed its
behavior to match the others with id=' c '):
id='a' PASS
id='2' PASS
id='<' PASS
id=',' PASS
id='ä' PASS
id=' c ' FAIL
id='\nd\n' PASS
id='\t\te\t\t' PASS
id='
f
' FAIL
Safari and IE 6:
id='a' PASS
id='2' PASS
id='<' PASS
id=',' PASS
id='ä' PASS
id=' c ' FAIL
id='\nd\n' FAIL
id='\t\te\t\t' FAIL
id='
f
' FAIL
--
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/