<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD>

<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">

<META content="MSHTML 6.00.6001.18148" name=GENERATOR>

<STYLE></STYLE>

</HEAD>

<BODY bgColor=#ffffff background="">

<DIV dir=ltr style="MARGIN-RIGHT: 0px"><FONT face=Arial size=2>>>>If we 

wish to communicate that level of semantics, yes.  It may not be useful to 

us.  If you *really* need some metadata/semantics, @class probably can't 

convey it with enough granularity.  Check out the big discussion from a few 

months ago about ccRel and RDFa.<BR> </FONT></DIV>

<DIV><BR><FONT face=Arial size=2>Maybe not yet, but we could at least try to 

keep options open for the future.<BR></FONT></DIV>

<DIV dir=ltr style="MARGIN-RIGHT: 0px"><BR><FONT face=Arial 

size=2>>>Second: Suppose I want to collect all copyright notices from 1000 

websites (don't ask me why, I just want to), how am I to do this when they are 

marked up in &lt;small&gt;s? I will definitely end up with a lot of text that has 

nothing to do with copyrights (and probably miss a lot of copyright notices as 

they are marked up differently), whereas if they were marked up in (for example) 

&lt;span class="copyright"&gt; I could retrieve it all based on the 

class-name.<BR><BR>>>>That would be a wonderful perfect world.  

I'd like the copyright date as well, so I can retrieve only things copyrighted 

in the last ten years.  Assuming that metadata will exist is a fool's 

errand.  The fact is that if you are searching for copyright notices, the 

most efficient way is likely to just search for the string "copyright" and the 

(c) symbol.  That'll net you copyright notices with a high accuracy, and 

some training on real data can yield further rules to improve the data-mining 

accuracy.<BR><BR>You said it yourself: only in a perfect world, where all websites 

in the world were written in the same language, would your "solution" work. 

Unfortunately, I would miss out on all the Chinese copyright stuff.<BR>But another 

example (based on "Siemens"): wouldn't it be nice if I could tell Google I am 

looking for a person named "Siemens" so it would ignore the 

"brand"-name?<BR><BR><BR>>>>While we're hoping for copyright notices to 

be marked up as &lt;span class="copyright"&gt;, though, why not wish for 

&lt;small class="copyright"&gt;?  If you're going to be providing metadata, 

it works the same.  Is it that you believe people won't provide a special 

class for copyrights if the &lt;small&gt; tag already gives them the preferred 

display?  Do you believe that everyone will automatically use 

class="copyright" to mark up their copyright notices?  What if they use 

class="copyright-notice"?  Or class="license"?  Or any of a million 

other distinct possibilities that would destroy any naive attempt to datamine 

based on a particular class name?<BR><BR><BR>Well, that would have to be defined 

in the standard, wouldn't it? Again, I'm not saying it should be defined NOW, 

but at least leave the door open.<BR>I have no problem with using small over 

span; as far as I can see, neither one is correct in this context. Using 

"copyright" instead of "license" or "copyright-notice" would have to be defined 

somewhere, either in the standard or in an externally maintained "document" that 

is accepted as "best practice" or "standards related".</FONT></DIV>
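<DIV dir=ltr style="MARGIN-RIGHT: 0px"><BR><FONT face=Arial size=2>To make the 
string-search suggestion above concrete, here is a minimal Python sketch. The 
pattern, the function name find_notices and the sample lines are all my own 
assumptions, made up purely for illustration; a real data-mining job would need 
far more than this.</FONT></DIV>
<PRE>
```python
import re

# Hypothetical pattern: the word "copyright", the "(c)" shorthand,
# or the copyright sign itself (U+00A9).
NOTICE = re.compile(r"copyright|\(c\)|\u00a9", re.IGNORECASE)


def find_notices(lines):
    """Return the lines that look like copyright notices to the pattern."""
    return [line for line in lines if NOTICE.search(line)]


# Made-up sample lines, not taken from any real website.
samples = [
    "Copyright 2008 Example Corp. All rights reserved.",
    "(c) 2008 Example Corp.",
    # A Chinese notice wording, written with escapes to stay ASCII-safe.
    "\u7248\u6743\u6240\u6709 2008 Example Corp.",
    "See section (c) of the user agreement.",
]

for line in find_notices(samples):
    print(line)
```
</PRE>
<DIV dir=ltr style="MARGIN-RIGHT: 0px"><FONT face=Arial size=2>On these four 
lines the pattern keeps the two English notices, also keeps the unrelated 
"section (c)" line, and drops the Chinese one, which is the kind of miss I mean 
above.</FONT></DIV>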

<DIV dir=ltr style="MARGIN-RIGHT: 0px"><FONT face=Arial 

size=2></FONT> </DIV>

<DIV dir=ltr style="MARGIN-RIGHT: 0px"><FONT face=Arial size=2>PS: I find it 

very difficult to respond to rich-text/html messages as they seriously mess up 

the indentation. Sorry, therefore, if this message is unclear, as the original message 

and reply are mixed up.</FONT></DIV></BODY></HTML>