[whatwg] The blockquote element spec vs common quoting practices

Sun Jul 17 11:36:36 PDT 2011

17.07.2011 18:07, Nils Dagsson Moskopp wrote:

>>> But browsers need to be told that that number close to the quotation
>>> is an ISBN.
>>
>> The string “ISBN” is sufficient evidence of that.
>
> Someone would need to standardize “ISBN sniffing behaviour” for UAs
> then. Could you make a proposal?

I think it would be rather trivial. The string “ISBN” followed by 
something that matches the syntax of ISBN numbers, perhaps allowing some 
variation in punctuation, could be treated as an implicit link to a 
resource _if_ you have some mechanism(s) for mapping ISBN numbers to URLs.
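
As a rough illustration of how simple such sniffing could be, here is a
minimal sketch in Python. It is not a proposal for actual UA behaviour:
the pattern, the tolerated punctuation, and the function names are all
my own assumptions. It looks for the string "ISBN" followed by digits
with optional hyphens or spaces, and keeps only candidates whose check
digit is correct.

```python
import re

# Assumed pattern: the literal "ISBN", optionally followed by "-10" or "-13"
# and a colon, then 10 to 17 characters of digits, hyphens and spaces,
# where the last character may be X (the ISBN-10 check character).
ISBN_PATTERN = re.compile(
    r"\bISBN(?:-1[03])?:?\s*([0-9][0-9\- ]{8,15}[0-9Xx])\b"
)


def is_valid_isbn10(digits):
    """Check the ISBN-10 check digit (weights 10..1, sum divisible by 11)."""
    if len(digits) != 10 or not digits[:9].isdigit():
        return False
    if not (digits[9].isdigit() or digits[9] == "X"):
        return False
    total = sum(
        (10 - i) * (10 if ch == "X" else int(ch))
        for i, ch in enumerate(digits)
    )
    return total % 11 == 0


def is_valid_isbn13(digits):
    """Check the ISBN-13 check digit (alternating weights 1 and 3, mod 10)."""
    if len(digits) != 13 or not digits.isdigit():
        return False
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(digits))
    return total % 10 == 0


def find_isbns(text):
    """Return normalized ISBNs (no punctuation) found after the string 'ISBN'."""
    found = []
    for match in ISBN_PATTERN.finditer(text):
        digits = re.sub(r"[\- ]", "", match.group(1)).upper()
        if is_valid_isbn10(digits) or is_valid_isbn13(digits):
            found.append(digits)
    return found
```

The check digit test is what keeps an arbitrary number that happens to
follow the letters "ISBN" from being treated as a link.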

The key issue is whether browser vendors have interest in it and which
mechanism(s) would be used. After all, an ISBN could be used in a
multitude of ways, like querying an online bookshop, querying an online
bibliographic system, or querying a site of books in digital format
online. Which one should be used? Would it be useful? To be really
useful, it should be handled so that the browser checks what it can get
using the ISBN and then makes that information available to the user
(how to get bibliographic info, how to read reviews, how to buy the
book, how to borrow it in a library, how to download or read the book
via the net for free or for a fee).
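
To make the "which mechanism" question concrete, here is a small Python
sketch that maps one ISBN to several candidate links, one per kind of
service. The service labels and the example.org URL templates are
entirely hypothetical; choosing real ones is exactly the open question.
The sketch only builds the URLs; it does not check what each service
can actually deliver, which a browser would need to do.

```python
from urllib.parse import quote

# Hypothetical services: both the labels and the URL templates are
# placeholders, not real endpoints.
SERVICES = {
    "bibliographic info": "https://catalog.example.org/isbn/{isbn}",
    "reviews": "https://reviews.example.org/book?isbn={isbn}",
    "buy": "https://shop.example.org/isbn/{isbn}",
    "borrow": "https://library.example.org/search?isbn={isbn}",
    "read online": "https://ebooks.example.org/isbn/{isbn}",
}


def candidate_links(isbn, services=SERVICES):
    """Return (label, url) pairs for an ISBN given without punctuation."""
    safe = quote(isbn, safe="")
    return [
        (label, template.format(isbn=safe))
        for label, template in services.items()
    ]


for label, url in candidate_links("9780306406157"):
    print(label + ": " + url)
```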

> Are any reasons for not doing anything with that information known?
> Probably a more basic issue: Is the cite attribute actually used?

I don’t think it’s much used in the wild, except on pages by 
organizations that define HTML specs. What might be the motivation for 
browsers to do something special with it? Surely you could make things 
so that by clicking on a blockquote, the user accesses the resource 
pointed to by the cite attribute. Browsers could do that, and so could 
authors. But would users actually start clicking on quotations to see 
their sources? Surely they would far more probably click on the title of
a work in visible credits, if present and if it is a link, so how would
the cite attribute help?
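
For what it's worth, getting at the attribute is not the hard part.
Here is a minimal Python sketch, using the standard html.parser module,
that collects the cite values of blockquote elements so that a tool
could, say, turn them into visible links. The class and function names
are my own.

```python
from html.parser import HTMLParser


class CiteCollector(HTMLParser):
    """Collect the cite attribute of every blockquote start tag."""

    def __init__(self):
        super().__init__()
        self.cites = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag == "blockquote":
            cite = dict(attrs).get("cite")
            if cite:
                self.cites.append(cite)


def blockquote_sources(html):
    """Return the cite URLs of all blockquotes in an HTML string."""
    collector = CiteCollector()
    collector.feed(html)
    return collector.cites
```

Whether anyone would use links generated this way is another matter.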

>>> <Cite>  contains a human-readable name of a work. That'll
>>> rarely be machine-readable.
>>
>> HTML documents are always machine-readable. (Well, you _might_ just
>> write HTML on a paper with a pen…)
>
> This is a category error. “Machine-readable” in this context does not
> mean “digital information”.

No, it’s not a category thing. It’s about the relativity of being 
“machine-readable.” You are probably thinking of data in a specific 
format designed to be easily parseable and useable by computer software, 
such as a URL, an ISO 8601 date notation, or an XML tag. But browsers 
already do many kinds of heuristics, parsing data that doesn’t really 
match the specs.

A title of a work is easily useable by software: put it inside quotation 
marks and throw it at Google, and the odds are that you get some useful 
links related to it, if there’s info on the work (and perhaps the work 
itself) on the web at all. Well, assuming that the title is relatively 
unique.
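
As a sketch of what "throw it at Google" amounts to in code, the
following Python function wraps a title in quotation marks and builds a
search URL from it. The function name is made up, and the base URL is
just a default parameter; any search engine with a similar query
interface would do.

```python
from urllib.parse import urlencode


def title_search_url(title, base="https://www.google.com/search"):
    """Build a phrase-search URL for the title of a work."""
    # Drop embedded double quotes so the phrase stays a single quoted unit.
    phrase = '"' + title.replace('"', "") + '"'
    return base + "?" + urlencode({"q": phrase})


print(title_search_url("Moby Dick"))
# prints https://www.google.com/search?q=%22Moby+Dick%22
```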

Titles of works are often more useful in the long run than URLs. URLs 
change far too often when sites are revamped or for other reasons.

I think a good start would be to add an optional (but usually 
recommended) <credits> or <source> element for use inside <blockquote>.

-- 
Yucca, http://www.cs.tut.fi/~jkorpela/