[whatwg] Citing multiple <blockquote> elements in HTML5

Tue Dec 2 12:07:48 PST 2008

Ian Hickson wrote:
> On Mon, 1 Dec 2008, Calogero Alex Baldacchino wrote:
>   
>>> Yes, a hash link (<a href="#foo">) will scroll to the element with an 
>>> id=foo.  If coding properly, you'll virtually *never* use an <a> for 
>>> an actual *anchor*, but rather will target the most semantically 
>>> appropriate element, such as a heading or a container with the 
>>> appropriate @id.
>>>       
>> Thanks! That's what I was missing in the specification (I should give it a
>> more accurate reading). Does it apply to every element, covering the <cite>
>> element too?
>>     
>
> See:
>    http://www.whatwg.org/specs/web-apps/current-work/#scroll-to-fragid
>
> Let me know if that doesn't address your use case.
>
> Cheers,
>   
Indeed it does, and I find this behaviour more consistent than allowing 
only an a element with a 'name' or an 'id' to act as an anchor for 
navigating to a fragment :-)

However, now I have a question. The 3rd step of the algorithm to 
determine "the indicated part of the document" says,

"If there is an element in the DOM that has an ID exactly equal to 
/fragid/, then the first such element in tree order is the indicated 
part of the document; stop the algorithm here."
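To check that I read this step correctly, here is a minimal sketch in Python, with xml.etree.ElementTree standing in for the DOM. The markup and the helper name indicated_part are my own assumptions for illustration, not anything taken from the spec:

```python
import xml.etree.ElementTree as ET

# Hypothetical document in which two elements share the same id.
MARKUP = '<body><h1 id="foo">First</h1><div><p id="foo">Second</p></div></body>'

def indicated_part(root, fragid):
    """Return the first element in tree order whose id equals fragid exactly."""
    # Element.iter() walks the tree depth-first in document order,
    # starting with the root element itself.
    for element in root.iter():
        if element.get("id") == fragid:
            return element
    return None  # no element carries that id

root = ET.fromstring(MARKUP)
target = indicated_part(root, "foo")
print(target.tag, target.text)
```

With this reading, the h1 is selected and the later p with the same id is never reached.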

Shouldn't the id be unique in the whole document? Section 3.3.3.2 says,

"The id attribute represents its element's unique identifier. The 
value must be unique in the subtree within which the element finds 
itself and must contain at least one character. The value must not 
contain any space characters."

then follows,

"If the value is not the empty string, user agents must associate the 
element with the given value (exactly, including any space characters) 
[...]"

First of all, isn't this a bit contradictory? Are space characters legal or 
not? If they are not, perhaps the text should say "discarding any space 
characters" for graceful degradation, or the paragraph should begin with 
"If the value is not the empty string and does not contain any space 
characters, [...]" if such an id is to be treated as illegal. (For the sake 
of graceful degradation, when splitting on space characters yields more 
than one token, either the first token could be chosen, or each token could 
represent a different id; the latter, though, would require handling 
multiple ids explicitly, the same way multiple classes are handled.) The 
rest of the paragraph says,

"for the purposes of ID matching within the subtree the element finds 
itself (e.g. for selectors in CSS or for the getElementById() method 
in the DOM)."

I guess the above covers, for instance, the case of a document holding 
an element with id="foo" and an iframe whose content document holds 
another element with the very same id; but speaking about subtrees might 
suggest the following is legal:

<body>
<div><p id="foo">something</p><p>something else</p></div>
<div><p>something else from <cite id="foo">Whatever Example</cite></p></div>
</body>

since we can separate two different subtrees, in each of which the id 'foo' 
is unique. Perhaps that could hold for CSS selectors that isolate the 
proper subtree (honestly, I don't remember whether that is actually legal 
in CSS, though I've always thought it isn't), but it might conflict with 
the DOM: the 'getElementById' method is defined only on the Document 
interface, and from that point of view both elements sit in the same 
subtree, namely the whole document tree. About such a case, DOM Level 3 
Core says, "If more than one element has an ID attribute with that value, 
what is returned is undefined."

As a consequence, if the desired behaviour is to select the first matching 
id (for consistency with the use of the first matching id as a fragment 
identifier in HTML documents), or in any case to establish a well-defined 
behaviour when more than one element has the very same id, I suppose the 
'getElementById' method should be redefined accordingly. (I don't think we 
should leave the choice to the implementation, because I don't think we 
want every browser to deal with clashing ids in its own browser-specific 
manner.) But that can't be done at the level of the 'Document' interface 
until a possible fourth level of the DOM Core specification, which is out 
of scope for HTML 5.

One solution would be to add 'getElementById' to the HTMLDocument 
interface, but that might cause trouble, since HTMLDocument no longer 
inherits from Document. I can see two possible scenarios. In the worse 
one, a user agent is implemented in a language that does not support 
multiple inheritance, so either the two methods have to be implemented on 
the same object under different names (which is bad when exposing the 
interface through bindings to script languages that support inheritance 
and function overriding), or two different objects have to be created, one 
to deal with HTML documents and the other for generic (i.e. XML) documents 
(which is bad in general); either way, it means duplicating code and 
maintenance effort. In the other scenario, multiple inheritance helps us, 
yet two methods must still be defined (a minor concern here), and special 
care is needed to cast to the proper interface so that the proper method 
is used and side effects and unwanted behaviours are avoided, because 
without a 'direct' line of inheritance we cannot rely on polymorphism and 
late binding without an explicit cast. (This is the main issue with 
multiple inheritance when it comes to duplicated fields and methods: 
everything works, whatever you do, but perhaps in the wrong way.)

Furthermore, I think all paths for reaching an element, whether through 
CSS or through the DOM, must be consistent with each other (I consider 
this the best way to limit the likelihood of side effects). So if the 
above code is acceptable because CSS selectors can target the proper 
subtrees individually, it should be acceptable for the DOM as well. That 
means not only that 'document.getElementById("foo")' should have 
well-defined behaviour (i.e. picking the first element with a matching 
id), but also that the same subtree isolation and element targeting should 
be possible through the DOM; that is, there should also be a 
'getElementById' method on the HTMLElement interface that searches for a 
matching id in the subtree rooted at a given element.
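As a rough illustration of what such a subtree-scoped lookup would return for the example markup earlier in this message, here is a sketch in Python using xml.etree.ElementTree in place of the DOM. The function get_element_by_id is hypothetical (no such method exists on HTMLElement), and I am assuming the search includes the root of the subtree itself:

```python
import xml.etree.ElementTree as ET

# The example markup with id="foo" appearing in two sibling subtrees.
MARKUP = """<body>
<div><p id="foo">something</p><p>something else</p></div>
<div><p>something else from <cite id="foo">Whatever Example</cite></p></div>
</body>"""

def get_element_by_id(subtree_root, wanted_id):
    """Hypothetical subtree-scoped lookup: first match in tree order."""
    # iter() visits subtree_root first, then its descendants in document order.
    for element in subtree_root.iter():
        if element.get("id") == wanted_id:
            return element
    return None

body = ET.fromstring(MARKUP)
first_div, second_div = body.findall("div")

# Whole-document search: the first match in tree order wins.
print(get_element_by_id(body, "foo").tag)
# Searches scoped to each div see only the id inside that subtree.
print(get_element_by_id(first_div, "foo").tag)
print(get_element_by_id(second_div, "foo").tag)
```

Scoped to the second div, the lookup reaches the cite element, which a document-wide search with first-match semantics would never return.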

Otherwise, just require the id attribute to be unique in the whole 
document, label any duplicate as illegal and treat it as the empty string, 
so that a single method is enough and the DOM 3 undefined behaviour of 
'getElementById' is no longer a problem, since it would be triggered only 
by disallowed DOM structures (as don't-care conditions). That would be the 
easiest choice, although there might be good reasons to prefer allowing 
duplicate ids inside the same document.
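A sketch of this simpler alternative, again in Python with xml.etree.ElementTree standing in for the DOM. The name build_id_index is hypothetical, and I am assuming that the first occurrence keeps its id while only later duplicates are treated as if their id were empty; discarding every occurrence would be an equally plausible reading:

```python
import xml.etree.ElementTree as ET

MARKUP = """<body>
<div><p id="foo">something</p><p>something else</p></div>
<div><p>something else from <cite id="foo">Whatever Example</cite></p></div>
</body>"""

def build_id_index(root):
    """Map each id to one element; later duplicates are rejected (assumption)."""
    index = {}
    rejected = []
    for element in root.iter():  # document order
        value = element.get("id")
        if not value:
            continue  # missing or empty id: nothing to associate
        if value in index:
            rejected.append(element)  # duplicate: treated as having no id
        else:
            index[value] = element
    return index, rejected

index, rejected = build_id_index(ET.fromstring(MARKUP))
print(index["foo"].tag, [element.tag for element in rejected])
```

With a single index of this kind, one lookup method suffices, and the duplicate can also be reported as a conformance error.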

BR, Alex.

---
P.S. The [SELECTORS] reference ( 
http://www.whatwg.org/specs/web-apps/current-work/#refsSELECTORS ) seems 
to be a broken link.
