[whatwg] <blockquote cite> and <q cite>

Wed Jan 3 07:24:01 PST 2007

On Jan 3, 2007, at 16:28, Benjamin Hawkes-Lewis wrote:

> Henri Sivonen wrote:
>> First, we should consider how people writing for traditional print
>> media would express quotations and sources.
>
> I agree we should consider this, but why should we consider
> this /first/?

Because it establishes the baseline that is compatible with the way  
most people think of writing and upon which markup would be expected  
to improve.

If improvements above that baseline turn out to be very expensive in  
proportion to the benefit, we should just stick to the baseline.

>> (They'd use typographic conventions and words.)
>
> It's print media. What else could they use?

Nothing. Yet, the quotation and source get communicated to human  
readers.

>> Then we should consider if this is enough for
>> the Web or whether there could *realistically* be cases where
>> consuming software could serve users notably better for non-niche use
>> cases if there was more data available (i.e. big wins--not just
>> chasing diminishing returns).
>
> Blogs, comment threads, forums, academic writing, books, journalism,
> and emails are not "niche use cases". In all of these cases, there is
> a clear advantage to making it easy for authors to create accurate
> quotations where the reader can easily get information about the
> source and jump to the source.

You can accomplish that in a way that authors understand by using the
conventions that you'd use in print media, plus making the piece of
text that names the source an ordinary HTML link (plain <a href...).
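
As an illustration of the approach described above, here is a minimal
sketch that renders a quotation using print-style quotation marks
followed by a plain link naming the source. The function name
quote_with_link and its parameters are hypothetical and chosen only for
this example; nothing here is part of any spec.

```python
from html import escape


def quote_with_link(quotation: str, source_name: str, source_url: str) -> str:
    """Render a print-style quotation followed by a plain link to the source.

    Hypothetical helper for illustration: typographic quotation marks carry
    the "this is a quotation" signal, and an ordinary <a href> names the source.
    """
    return '\u201c{}\u201d (<a href="{}">{}</a>)'.format(
        escape(quotation),     # quoted words, escaped for use as HTML text
        escape(source_url),    # URL, escaped for use inside the href attribute
        escape(source_name),   # human-readable name of the source
    )


if __name__ == "__main__":
    print(quote_with_link("Some quoted words.", "Example Blog", "http://example.org/"))
```

Everything the reader needs is in the visible text, and the only markup
involved is a link that authors already know how to write.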

>> If it turns out that having additional data would be a big win, we
>> should consider the cost and incentives
>> of providing that additional data and whether authors can
>> realistically provide the additional data (i.e. do they even know
>> it).
>
> With print-style quotations, they need to know a lot of "additional
> data" about the quoted work, and then they need to consult their
> manual of style to work out how on earth to cite it. With the sort of
> machine-processable cited quotations I am advocating, they need to
> know far less about either the quoted work or style conventions.

Assuming that putting an ISBN URI in an attribute somewhere solves
anything on its own is an illusion. If the source is hidden metadata,
it is mostly useless, because the reader doesn't read it. If the UA
is expected to render the information about the source somehow, the
problem of presenting sources is just moved to a different place.
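
To make the hidden-metadata point concrete, here is a rough sketch that
extracts what a reader would see as text from a fragment, alongside any
cite attribute values. It uses Python's standard html.parser module. The
class and function names, the sample markup, and the placeholder ISBN are
assumptions made for this example, and the sketch simply treats every
text node as visible.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes and cite attribute values separately (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.text_parts = []   # character data that ends up in the rendered text
        self.cite_values = []  # values of cite="" attributes, which are not rendered

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "cite" and value is not None:
                self.cite_values.append(value)

    def handle_data(self, data):
        self.text_parts.append(data)


def visible_text_and_cites(markup):
    """Return (whitespace-normalized text, list of cite attribute values)."""
    parser = VisibleText()
    parser.feed(markup)
    parser.close()
    text = " ".join("".join(parser.text_parts).split())
    return text, parser.cite_values


if __name__ == "__main__":
    # Placeholder ISBN, not a real book.
    hidden = '<blockquote cite="urn:isbn:0000000000"><p>Some quoted words.</p></blockquote>'
    linked = '<p>\u201cSome quoted words.\u201d (<a href="http://example.org/">Example Source</a>)</p>'
    print(visible_text_and_cites(hidden))
    print(visible_text_and_cites(linked))
```

In the first fragment the source exists only as an attribute value and
never reaches the text; in the second, the name of the source is part of
what the reader reads.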

> For example, found some text you want to quote in a web page? Select
> it, click "Copy as quotation". Go somewhere else, and click "Insert
> as quotation." Or, for example, found some text you want to quote in
> a book? Go to the insertion point, click "Insert quotation", fill in
> an ISBN (or author, title, date, or select from a list of remembered
> works), fill in a start and end page, fill in the quotation text, and
> you're done.

I'll be more convinced about "tools will save us" once I first see  
that working on a smaller scale than the Web. Let's say TeXlipse with  
the described UI generating BibTeX entries and inserting the proper  
\cite{} on the LaTeX side.
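
For reference, the output side of such a tool is the easy part. A
minimal sketch, assuming the UI has already collected the fields: the
function names, the choice of the @book entry type, and the sample key
are hypothetical, and this is not a description of anything TeXlipse
actually does.

```python
def bibtex_entry(key, author, title, publisher, year):
    """Format a BibTeX @book entry from fields a dialog might collect."""
    fields = [
        ("author", author),
        ("title", title),
        ("publisher", publisher),
        ("year", str(year)),
    ]
    body = ",\n".join("  {} = {{{}}}".format(name, value) for name, value in fields)
    return "@book{{{},\n{}\n}}".format(key, body)


def cite_command(key, pages=None):
    """Format the matching LaTeX citation, optionally with a page range."""
    if pages:
        return "\\cite[pp.~{}]{{{}}}".format(pages, key)
    return "\\cite{{{}}}".format(key)


if __name__ == "__main__":
    # Placeholder data, not a real publication.
    print(bibtex_entry("doe2007", "Jane Doe", "An Example Book", "Example Press", 2007))
    print(cite_command("doe2007", pages="12--14"))
```

The formatting is trivial; the open question is whether authors would
actually fill in the fields.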

>> If this analysis suggests that authors would be able and
>> incentivized to provide the additional data, only then should we
>> design markup for it.
>
> If they're citing materials at all, then they already are
> providing /more/ additional data than my vision of how this should
> work would require.

How much data is provided depends on the type of writing. If someone  
quotes someone else's blog, quotation marks and a plain <a href link  
back are enough. For an academic paper, a professor or a peer  
reviewer is going to want more data, but still it isn't realistic to  
assume that non-technical bloggers would be bothered to provide much  
more than the link to the source.

>>> Requiring ordinary end-users to do /any/ of the following
>>> tasks by hand seems unrealistic:
>>
>> Indeed.
>
> Indeed, so why are you suggesting we require them to do task 4
> (which is one of the hardest)?

If you want to designate the source, you need to know it and express  
it in a human-readable way. I am not suggesting any particular  
formatting. Just doing something that readers will understand. Doing  
anything less than that would amount to not designating the source.  
(Obviously.)

>> Or, authors could simply not mark up the sources of quotations
>> unambiguously, leaving it to readers to cope with the relationship of
>> quotations and sources the same way readers of paper publications
>> do.
>
> What possible advantage would that provide?

Not having to bear the cost of producing semantic quotation markup.

Metadata and semantic markup are *not* without a cost.

>> If the spec is too "out there", it gets ignored.
>
> Out where?

Out of the feature range that UA implementors might want to implement  
or what normal people want to support on the authoring side.

>> Most notably, links are used on the Web to achieve a clear behavioral
>> goal in real software.
>
> The behavioral goal is every bit as clear here: to make it easy to
> quote stuff from somewhere else in such a way that people can:
>
> a) get information about the quotation's source
>
> b) go to the quotation's source
>
> This couldn't be further from semantics for the sake of semantics.
> It's as fundamental as <input type="text">.

“Quotation” (<a href='...'>Source</a>)

Punctuation and plain links go a long way for human readers. And I am
unconvinced that authors would be willing to spoon-feed data mining
tools, considering that the beneficiaries of such spoon-feeding are
neither the authors themselves nor even their direct human audience.

-- 
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/