[whatwg] sic element

Jukka K. Korpela jkorpela at cs.tut.fi
Sat Aug 6 22:42:31 PDT 2011


Bjartur Thorlacius wrote:

> On Tue, 2 Aug 2011 at 09:04, Henri Sivonen wrote:
[...]
>> From time to time, people want to take printed matter and
>> publish it on the Web. In practice, the formats available are PDF and
>> HTML. HTML works more nicely in browsers and for practical purposes
>> works generally better when the person taking printed matter to the
>> Web decides that the exact line breaks and the exact font aren't of
>> importance. They may still consider it of importance to preserve
>> bold, italic and underline
[...]
> So you're arguing that a subset of HTML should be favored over
> presentational markup languages for marking up digital retypes of
> printed matter, with <b>, <i>, <u>, <font>, <small> and <big> being
> redefined to their HTML 3 typographical meanings.

I can't speak for Henri, but I agree with his point quoted above. There are, 
of course, other options as well, such as images and word processor file 
formats, but they have the same problem as PDF: they preserve too much of 
the formatting.

Please note that this isn't about favoring HTML over presentational markup 
languages; none of the alternatives mentioned is a markup language at all. 
HTML has always been a presentational markup language, too, and HTML as 
officially defined (HTML 4.01, XHTML 1.0) still has presentational features, 
so the question is whether they should be taken away, not about "redefining" 
them. It is the WHATWG & HTML5 work that is proposing a redefinition. (I say 
"proposing", since from the viewpoint of implementor, author, and user 
communities as well as the W3C, they are proposals, not a standard. In many 
parts of HTML, the proposal has been or is being widely accepted in 
practice, but I see few signs of such things happening with the new 
meanings for <b> and friends.)

> And perhaps
> <blockquote> standardized to mean indent.

I wouldn't object to that, but _that_ would mean a change to the tradition 
of HTML specifications, and although <blockquote> mostly means "indent", it 
fairly often means a block quotation. Moreover, the situations where an HTML 
author needs to say "this text is indented in the printed original" without 
presenting any fixed interpretation of the intended meaning of indentation 
appear to be rather rare, as compared with situations where one needs to say 
e.g. "this text appears in italics in the printed original".

> If you simply retype print without any interpretation of the
> typography used, a valid speech rendering would e.g. cue bold text
> with "bold" and "unbold" marks to convey the meaning: this text was
> bold.

It could, and that would actually reflect the author's intentions: he wishes 
to convey the idea of bolding, leaving it to the reader to infer or guess 
the meaning of bolding. (At the extreme, you might have a page that 
discusses a printed document in general and the use of bolding in it in 
particular, and then it is surely relevant to indicate the bolding - as 
"pure bolding".) In practice, speech rendering doesn't behave that way, but 
even if it did, it would constitute an argument in favor of the typographic 
markup, not against it.
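
To make the hypothetical "bold"/"unbold" cueing concrete, here is a minimal 
Python sketch of a text renderer that announces <b> boundaries instead of 
interpreting them. The class name BoldCueRenderer, the function 
render_with_cues and the bracketed cue strings are assumptions made up for 
illustration; no actual speech browser is claimed to behave this way.

```python
from html.parser import HTMLParser


class BoldCueRenderer(HTMLParser):
    """Hypothetical renderer that reports <b> boundaries as spoken cues."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Announce the start of bold text without guessing what it means.
        if tag == "b":
            self.parts.append("[bold]")

    def handle_endtag(self, tag):
        if tag == "b":
            self.parts.append("[unbold]")

    def handle_data(self, data):
        self.parts.append(data)


def render_with_cues(html):
    """Return the text of `html` with cue words around bold runs."""
    renderer = BoldCueRenderer()
    renderer.feed(html)
    renderer.close()  # flush any text buffered after the last tag
    # Join the pieces and normalize runs of whitespace to single spaces.
    return " ".join(" ".join(renderer.parts).split())


print(render_with_cues("The word <b>never</b> was bold."))
```

The renderer conveys only the fact of bolding and leaves its interpretation 
to the listener, which is what a plain retyping of print would require.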

> If all you want is to suggest original typographic rendering, then
> (save for Excerpt/Blockquote, Nofill/Pre and Lang/@lang) CSS does the
> job, better - and is vastly more powerful.

This isn't about suggesting; it is about reproducing aspects of printed 
material that may be essential. It is comparable to making a distinction 
between lowercase and uppercase, which may be purely presentational or may 
carry essential information. The case distinction can be made by the simple 
choice of letters at the character level, or it may be delegated to CSS if 
it is regarded as purely presentational. For bolding etc., the 
character-level alternative does not exist or is highly impractical (and 
e.g. mathematical italic letters are, in addition to being present in only a 
few fonts, intended for mathematical use rather than for common use of 
italics). So all I'm asking is to preserve the existing features of HTML or, 
more exactly, to preserve them without declaring them obsolete.
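
As a rough illustration of why the character-level route works for case but 
not for italics, the following Python sketch maps basic Latin letters to the 
Unicode mathematical italic letters. The function name to_math_italic is 
hypothetical, and the sketch is meant to show the limitations, not to 
recommend the technique.

```python
def to_math_italic(text):
    """Map A-Z and a-z to Unicode mathematical italic letters.

    Anything else (digits, punctuation, accented letters) is passed
    through unchanged, because no italic counterpart is handled here.
    """
    out = []
    for ch in text:
        if ch == "h":
            # The slot U+1D455 is reserved; the italic h is encoded
            # separately as U+210E PLANCK CONSTANT.
            out.append("\u210e")
        elif "A" <= ch <= "Z":
            out.append(chr(0x1D434 + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(0x1D44E + ord(ch) - ord("a")))
        else:
            out.append(ch)
    return "".join(out)


# Case is a plain character-level choice that works for any letter.
print("title".upper())

# Italics at the character level covers only unaccented basic Latin
# letters in this sketch.
print(to_math_italic("Titre cité"))
```

In this sketch the accented letter stays upright and the result depends on 
font support for the mathematical characters, so markup such as <i> remains 
the practical way to record that text was italic in the original.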

Yucca 



