[whatwg] Use cases for Node.getElementById

Sat Dec 6 19:09:01 PST 2008

Simon Pieters wrote:
> On Fri, 05 Dec 2008 19:19:04 +0100, Calogero Alex Baldacchino 
> <alex.baldacchino at email.it> wrote:
>
>> [...]
>
> (I'm currently the editor of that proposal, currently located at 
> http://simon.html5.org/specs/web-dom-core )
>

I'm reading it :-)

And I have a few questions. First, is it meant as the reference DOM Core 
for HTML 5 only, or in general (for other kinds of markup too)?

The 'children' attribute on the Element interface is an HTMLCollection,
which suggests to me that the former is the answer. Otherwise there is
a formal problem: either a general core refers to an interface that
belongs to one specific document DOM, or (if that interface were moved
into Web DOM Core) the interface carries the name of a specific DOM. I
guess the attribute has been declared on Element rather than on
HTMLElement because that is the most common implementation in current
browsers.

Anyway, let me suggest (just as a hint; after all, a working draft is
the right phase to explore alternatives) something like an
ElementCollection interface with the same properties as HTMLCollection,
with HTMLCollection simply inheriting from it, as if it were an alias
(the same way DocumentFragment inherits from Node). In any browser that
implements 'children' as a plain HTMLCollection (without any hierarchy)
this should not be a problem for scripts, since scripting languages
usually resolve types at run time. For strongly typed languages (and
here we are perhaps moving from scripts to plugins) the access strategy
may be implementation specific, but as long as the ElementCollection ->
HTMLCollection hierarchy does not change the properties of an object
implementing HTMLCollection, there should not be much to work around.

For instance, a Java applet (or any other object using LiveConnect)
should keep working through JSObject without any modification, while
direct access to the DOM would need a DOMServiceProvider implementation.
I'm not aware of any that grants access to the 'children' attribute, or
rather to any non-W3C DOM property, but I guess that as soon as your
proposal became a recommendation at least Sun would update its Java
APIs accordingly. For that purpose it might be enough to suggest that
any object the user agent provides as implementing either interface
should be wrapped by an object that also implements the other, for
backward compatibility. Anyway, this is no more than a hint, very early
feedback.
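
Just to make the hierarchy idea concrete, here is a rough sketch in
plain JavaScript. The class names SketchElementCollection and
SketchHTMLCollection are hypothetical stand-ins of mine (they are not
the real browser interfaces), and the members shown are only my
assumption of what "the same properties" would mean:

```javascript
// Hypothetical generic collection (stand-in for "ElementCollection").
class SketchElementCollection {
  constructor(items) {
    this._items = Array.from(items);
  }
  get length() {
    return this._items.length;
  }
  // Returns the element at index, or null when out of range.
  item(index) {
    return index >= 0 && index < this._items.length
      ? this._items[index]
      : null;
  }
  // Returns the first element whose id matches, or null.
  namedItem(name) {
    const hit = this._items.find((el) => el.id === name);
    return hit === undefined ? null : hit;
  }
}

// The HTML-named collection adds nothing: it only inherits, so every
// object that is one of these is also a SketchElementCollection.
class SketchHTMLCollection extends SketchElementCollection {}
```

A script that only uses length, item() and namedItem() cannot tell the
two apart, which is the point of the "alias" suggestion.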

I see the Element interface no longer contains methods to handle Attr
nodes. Since the W3C specifications describe Attr nodes as not being
child nodes of an Element, will there be some other way to handle
attributes as nodes? Is the 'nature' of Attr nodes going to change? Or
are they so little used (and/or supported) that the Attr interface is
quite close to its 'end of life'? Apart from that, I've also noticed
that the 'isId' attribute has been removed from Attr. I thought of it
when I read, in the HTML 5 spec, that "This specification doesn't
preclude an element having multiple IDs, if other mechanisms (e.g. DOM
Core methods) can set an element's ID in a way that doesn't conflict
with the id attribute". For this purpose, either the 'isId' property of
an Attr node, or a mechanism to set an Element's attribute as an
alternative ID (or both), might be helpful (although having more than
one unique identifier to handle for each element in a document might
increase the number of duplicated IDs).

The above takes me to the '.getElementsByClassName()' method: if it
were moved from the HTML 5 spec to the Web DOM Core API, and if the
latter is meant as some kind of replacement for W3C DOM Level 3, then,
for the sake of generality, the method might be defined as referring to
a property named CLASS (along the same lines as ID), pointing out that
this property need not be bound to an attribute named 'class'. That
would make the spec ready in case the need to support that sort of
document arose in the near future, without having to change Web DOM
Core, or derive a new version of it, for this reason alone.

But now let's come to your questions (sorry for the digression, 
sometimes I can't help starting this way...)

>
>> But the term 'subtree' raises a problem with HTML 5: actually, the id
>> attribute is defined as the element's unique ID in the *subtree*
>> within which the element is found. That is, the term subtree refers
>> to a whole document tree, but also to a disconnected subtree handled
>> by a script (and I haven't yet understood if such definition refers
>> to a document fragment containing nodes detached from any document,
>> or a whole document without a browsing context).
>
> AIUI, it could also be a disconnected element.
>

And I've suggested, in another mail, clarifying it, i.e. stating that a
subtree is either a whole document (to make it clearer that '<body><div
id="the_id" >....</div> <div> ... <div id="the_id" /> </div> </body>' is
always illegal, even though one could split the same document into two
different subtrees in each of which the id 'the_id' is unique), or a
group of one or more 'related' disconnected nodes (i.e. a node removed
from a document together with its descendants, a cloned node, a newly
created one, and so on; in practice, just about any tree of nodes
without an ownerDocument).

However, how does ID uniqueness relate to the current APIs when it
comes to a disconnected subtree? I mean, if it were relevant to some
method, such as a .getElementById variant handling disconnected
elements, the "uniqueness rule" would be quite self-explanatory: when
faced with duplicated IDs, such a method either fails or returns the
very first match (or does whatever else is established to degrade
gracefully). But if the subtree traversal and ID search are left to the
script programmer, the 'id' attribute may be treated like any other
attribute, and the programmer may decide that the uniqueness of IDs is
not relevant to his/her code during 'off-document' manipulations, until
the subtree is inserted into a document. Thus, maybe, ID uniqueness
should be tied only to what is under the direct control of the current
APIs (such as documents).
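
To illustrate the two behaviours mentioned above (fail on duplicates,
or return the very first match), here is a minimal sketch of what a
script-side search might look like. Everything in it is hypothetical:
makeNode builds plain objects standing in for DOM elements, and
lookupById and its "policy" argument are names of mine, not part of any
API.

```javascript
// Hypothetical stand-in for an element: just an id and child nodes.
function makeNode(id, children = []) {
  return { id, children };
}

// Collects, in tree order, every node (root included) with that id.
function collectById(root, id, out = []) {
  if (root.id === id) out.push(root);
  for (const child of root.children) collectById(child, id, out);
  return out;
}

// policy "first": degrade gracefully, return the first match.
// policy "fail":  treat a duplicated ID as an error.
function lookupById(root, id, policy = "first") {
  const matches = collectById(root, id);
  if (policy === "fail" && matches.length > 1) {
    throw new Error(`duplicate id "${id}" (${matches.length} matches)`);
  }
  return matches.length > 0 ? matches[0] : null;
}

// A disconnected subtree that contains the same ID twice.
const first = makeNode("the_id");
const second = makeNode("the_id");
const subtree = makeNode(null, [first, makeNode(null, [second])]);
```

With the default policy the duplicate goes unnoticed by the caller,
which is exactly why a script programmer may treat 'id' like any other
attribute while working off-document.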

For the rest, my considerations on [whatever].getElementById() were
general, not tied to a concrete use case (they were partly driven by
the thoughts above), and, as for the email you replied to, they started
from the following:

On 12/2/2008 4:07 PM, timeless wrote:
> On Tue, Dec 2, 2008 at 10:39 AM, Aaron 
> Leventhal<aaronlev at moonset.net>  wrote:
>   
>> Maybe there is a deeper problem if copy&  paste doesn't work right 
>> because
>> of IDs?
>>
>> Or maybe there should be a node.getDescendantById() method?
>>      
>
> maybe, but not with that name.
>
>   Results 1 - 10 of about 4,480,000 for Descendent [definition]. (0.22 
> seconds)
>   Results 1 - 10 of about 8,370,000 for Descendant [definition]. (0.41 
> seconds)
>
> the wikipedia links are confusing enough
>
> http://en.wikipedia.org/wiki/Descendant links to:
> http://en.wiktionary.org/wiki/descendent
> which has an also link to http://en.wiktionary.org/wiki/descendant
> which has a 'US' audio file
>
> So the web says that '-dant' is favored 2:1 over '-dent', which is a
> fairly bad margin considering the spelling errors we've seen in
> html/http.
>
> I'd sooner see Node.getElementById and risk *the confusion of it
> returning fewer nodes than Document.getElementById.*
>
>

I guessed the confusion he was concerned about might come from how used
people are to .getElementById working on the whole document: someone
who discovers that there is a getElementById method on every Node (or
rather, on every element in an HTML document) might assume that it
works on the whole document as well, and so expect more non-null
results than a Node.getElementById method would return in general.
(People should already be familiar with Document.getElementsByTagName
working on the whole document while Element.getElementsByTagName works
on an element's descendants... but without a crystal ball, who
knows...) So, if that were the case, I've suggested something like
document.getElementById("the_id", startNode), to draw (perhaps!) the
attention of the average programmer (the same one who might have
thought 'anElement.getElementById' was the same as
'document.getElementById') to the different behaviour, by 'forcing' a
new argument that specifies where to start the search (that is,
something totally new to the programmer).
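
A rough sketch of that two-argument form, only to show the difference
in results. MiniElement and MiniDocument are hypothetical plain classes
of mine (this is not the real DOM), and making the second argument
optional is my own assumption:

```javascript
// Hypothetical miniature element: an id and a list of children.
class MiniElement {
  constructor(id = null) {
    this.id = id;
    this.children = [];
  }
  append(...kids) {
    this.children.push(...kids);
    return this;
  }
}

class MiniDocument {
  constructor(root) {
    this.root = root;
  }
  // The optional second argument says where to start the search.
  // Without it, the whole document is searched.
  getElementById(id, startNode = this.root) {
    const stack = [startNode];
    while (stack.length > 0) {
      const node = stack.pop();
      if (node.id === id) return node;
      // Push children in reverse so they are visited in tree order.
      for (let i = node.children.length - 1; i >= 0; i--) {
        stack.push(node.children[i]);
      }
    }
    return null;
  }
}

const sidebar = new MiniElement("sidebar");
const content = new MiniElement("content").append(new MiniElement("title"));
const doc = new MiniDocument(new MiniElement("body").append(sidebar, content));
```

Here doc.getElementById("title") finds the element, while
doc.getElementById("title", sidebar) returns null, because "title" is
not among the descendants of the start node.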

I agree that encouraging duplicate IDs is not a good idea, and that any
method looking for IDs in a document subtree might lead to that. I
really don't like the very idea of a duplicated ID (it is somewhat
self-contradictory), and I'm tempted to say a duplicate should always
break correct execution, especially when it comes from a scripting
'error' (as when inserting a cloned node without caring about
side-effects). At first glance, I'd regard a duplicate ID like any
other bug that makes a stand-alone application crash because of
side-effects (of course, I'm not saying the browser should crash!),
and, after all, a gracefully degrading .getElementById (returning the
very first matching element) cannot prevent side-effects. But I also
fully understand that duplicate IDs are a common problem asking for a
solution, or at least for a standard graceful degradation (such as the
one stated for document.getElementById(elementId)).

From this point of view, and also bearing in mind a disconnected subtree
to be dealt with through the API, maybe something like
document.getElementById(elementId, rootElement) might make sense if it
worked _only_ on disconnected subtrees. It would first check whether
the passed node (or its first descendant that is not a document
fragment node) is actually disconnected (not in the document), and then
return the matching element (if any). If the subtree is instead in the
document tree, it would return null, or even throw an exception, to
encourage greater attention and immediately tell the reason for the
failure (returning null might be ambiguous, since that is also the
result when no matching element is found).

But such a method might be confusing, I agree (perhaps the confusion
could be avoided by calling it getExternalElementById?), and it might
also provide a fairly easy way to look for duplicate IDs (e.g. by
creating a fake document and calling the method on a subtree of
another, manipulated document). That would be a bit tricky, though, and
if I do something tricky, I should know it is not the best way to
achieve a result.
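
The disconnected-only lookup discussed in the last two paragraphs could
be sketched roughly as follows. MiniNode and MiniDocument are
hypothetical plain classes (not the real DOM), and I have picked the
"throw an exception" option here only because it keeps the two failure
reasons apart:

```javascript
// Hypothetical miniature tree: plain objects, not the real DOM.
class MiniNode {
  constructor(id = null) {
    this.id = id;
    this.parent = null;
    this.children = [];
  }
  append(child) {
    child.parent = this;
    this.children.push(child);
    return this;
  }
}

// Depth-first search in tree order, starting from (and including) root.
function findById(root, id) {
  if (root.id === id) return root;
  for (const child of root.children) {
    const hit = findById(child, id);
    if (hit !== null) return hit;
  }
  return null;
}

class MiniDocument extends MiniNode {
  // A node is in the document when walking up its parents reaches it.
  contains(node) {
    for (let n = node; n !== null; n = n.parent) {
      if (n === this) return true;
    }
    return false;
  }

  // Searches only subtrees that are NOT in this document. Throwing
  // keeps "wrong kind of root" distinct from "no such id" (null).
  getExternalElementById(id, root) {
    if (this.contains(root)) {
      throw new Error("root is in the document; use getElementById(id)");
    }
    return findById(root, id);
  }
}

const doc = new MiniDocument();
const attached = new MiniNode("in_doc");
doc.append(attached);

// A subtree that was never inserted into the document.
const detached = new MiniNode().append(new MiniNode("the_id"));
```

Calling doc.getExternalElementById("the_id", detached) returns the
element, while passing attached as the root raises the error instead of
silently returning null.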

---

Now, let me fly off topic again, back to Web DOM Core. Sometimes
elements are conceived as leaf elements, conceptually not allowed to
have any sort of descendants (in HTML this is the case for <br> or
<base>, for instance); yet they are Elements, and Nodes (and
HTMLElements), and so have attributes and methods that allow, in
theory, some weird manipulation. I don't know what each browser
currently does (honestly, I've never tried; this is a recent doubt of
mine), but I guess the result may vary from one UA to another, and
something inappropriate might happen (in any case, there is no standard
way to block it). Perhaps a new (readonly) property might be defined on
Node to tell that the node is a "definitive leaf", so that any
property/method could be declared as inaccessible/failing (e.g.
throwing an exception) when the 'isLeaf' property is true. Such an
attribute cannot belong to Element, to HTMLElement, or to an interface
derived from HTMLElement, since methods on the Node interface are
involved too; nor would the nodeType attribute cope well with this,
because the check should work for instances of the same type, or
regardless of the type. In other words, any attempt to change the
descendant hierarchy of a node for which the attribute is true should
result in an illegal hierarchy. Do you think this might be worth
considering?
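
As a very rough sketch of what I mean, assuming a read-only 'isLeaf'
flag fixed when the node is created (MiniNode is a hypothetical plain
class, and the plain Error stands in for whatever exception a spec
would actually choose):

```javascript
// Hypothetical miniature node: not the real DOM Node interface.
class MiniNode {
  constructor(name, isLeaf = false) {
    this.name = name;
    this.children = [];
    // Read-only flag: it cannot be reassigned after creation.
    Object.defineProperty(this, "isLeaf", {
      value: isLeaf,
      writable: false,
      enumerable: true,
    });
  }

  // Any method that changes the descendants checks the flag first.
  appendChild(child) {
    if (this.isLeaf) {
      throw new Error(`<${this.name}> is a definitive leaf`);
    }
    this.children.push(child);
    return child;
  }
}

const br = new MiniNode("br", true);
const div = new MiniNode("div");
```

Here div.appendChild(br) succeeds, while br.appendChild(div) throws,
whatever the type of the node being inserted.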

(Before I forget: the current definition of a legal hierarchy seems to
leave out some legal cases, such as an Element with no Text child
nodes, and the like. Should it perhaps be converted into a list of
illegal hierarchies? If I haven't misunderstood it, such a list would
need to cover fewer cases than a complete list of legal scenarios.)

Best Regards,
Alex.
