[whatwg] Deprecating <small> , <b> ?

Tab Atkins Jr. jackalmage at gmail.com
Tue Nov 25 14:02:02 PST 2008


On Tue, Nov 25, 2008 at 3:08 PM, Calogero Alex Baldacchino <
alex.baldacchino at email.it> wrote:

> Tab Atkins Jr. wrote:
>
>>
>>
>> On Tue, Nov 25, 2008 at 10:24 AM, Calogero Alex Baldacchino <
>> alex.baldacchino at email.it <mailto:alex.baldacchino at email.it>> wrote:
>>
>>
>>    Of course that's possible, but, as you noticed too, only by
>>    redefining the <small> semantics, which is not the best choice per
>>    se. That's both because the original semantics of the <small> tag
>>    were aimed at styling and nothing else (the HTML 4 document type
>>    definitions declared it as a member of the fontstyle entity,
>>    while, for instance, <strong> and <em> were part of the phrase
>>    entity), and because the term 'small', at first glance, suggests
>>    a typographical function, regardless of any other related concept
>>    which might be specific to English (or any other) culture but
>>    might not be as immediate for non-English developers around the
>>    world. As a consequence, since the average developer could just
>>    rely on the old semantics, being intuitively familiar with it,
>>    the redefinition meets a first objection: think of a word written
>>    with alternating <b> and <small> letters, or just of a paragraph
>>    whose first letter is highlighted with a <b>. Applying the new
>>    semantics here would obviously be non-trivial (e.g. assistive
>>    software for blind users would be fouled by this and give
>>    unpredictable results). Although such a use would be a misuse of
>>    the <b> and <small> markup, it would still be possible, meaning
>>    not prohibited, and so creating a new element with proper
>>    semantics could be a better choice.
>>
>> No matter *what* we do, if there *is* a default style for an element, it
>> will be misused by people.  This is a fact of life.  Defining a new element
>> which is identical to <small> in every way except that it hasn't been
>> misused *yet* is thus a mug's game, because it *will* be misused in the same
>> way as <small>, and then we just have two identical elements for no reason.
>>
>
> I'll start with an example. Some time ago I played around with Opera
> Voice. It seemed to be capable of interpreting visual style sheets, and
> specifically font styles, so that bold or italic text (so specified in the
> style sheet, not the markup) was spoken differently from 'normal' text, but
> a paragraph's first letter differing from the rest of the word (which is a
> fairly common typographical choice), as far as I remember, caused the whole
> word to be skipped.


Do you mean that if you had markup like "<p><b>W</b>hen I was young...</p>",
it would be read out as "I was young..."?  If so, that's clearly a bug in
the reader, and has nothing to do with semantics or the lack of it.  There
is *no* legitimate interpretation of that markup that would lead one to
discard the first word.


> This suggests to me that if we really want 'cross-presentation' semantics,
> we have to keep as far as we can from anything having *mainly* typographical
> semantics (as <small> and <b> have had from birth). Every language is
> somewhat prone to side effects caused by misuse (e.g. it is possible to cause
> a big mess in software written in a language that allows passing a pointer
> to a function - there are tons of examples of language design issues - yet
> that can be a desirable capability), but appropriate choices for both
> semantics and syntax may help to reduce the likelihood of misuse.
>
> I think that both <b> and <small> will very likely carry on their old
> semantics, and so be more prone to misuse with respect to their new ones,
> since a lot of developers are, and will remain, more familiar with their
> original semantics, which are also suggested by their names ('b'
> standing for 'bold' and 'small'... for something small on the screen or on
> paper). A new element, instead, would require the developer to make some
> effort at least to learn about its existence, so he would read that the
> element's primary use is to indicate a different importance of a piece of
> text, so that a non-visual user agent can present it in an appropriate
> manner, and a visual or print user agent can render it in different ways.


Well, the new semantics are purposely very close to the old 'semantics'.
Bold text *is* "text purposely offset from the surrounding prose".  Some
legacy uses of <b> are more correctly done with other existing elements,
like <strong> or <h1>, but at least it's *close*.

And again, the type of author who *is* marking up random things with <b> for
purely stylistic reasons isn't the sort who is going around reading
standards documents, or likely even caring in the slightest.  If they *did*
discover a new element that has the correct semantics (like <standout> or
something), they'll either ignore it (if it's basically identical to <b>) or
use it nonsemantically as well (if it offers some exciting new default
styling).

Basically, there is a subset of authors who are morons, and they'll screw up
anything we do.  Most of us aren't like that, but trying to design around
that subset is a game you can't win.  Their pages will be FUBAR no matter
what we do, until browsers' rendering engines are literally hooked up to a
sentient semantic parser.


> Ah, the default style could be slightly or very different from the <small>
> one, e.g. the text could be surrounded by parentheses or hyphens, regardless
> of the font size (and the new elements could be designed to accept only
> non-empty strings consisting of more than one non-spacing character).


We could, but is there any reason to have it do that?  Making the text small
is a good visual representation of the "small print" or "aside" semantics.


>> Yes, bad markup will foul up semantic agents.  But people will *always*
>> write bad markup.  At least with the semantic redefinition we get to declare
>> lots of usages that *are* appropriate to be conforming without any effort on
>> the author's part.
>>
>> And really, the type of people who would write a word with alternating
>> letters wrapped in <b> and <small> tags are hardly the kind to even *care*
>> about semantics.
>>
>
>
> Let me reverse this approach: what should an assistive user agent do with
> such a <b>M</b><small>E</small><b>S</b><small>S</small>? I think that
> dealing with that word as normal text would be a more graceful degradation
> than discarding it, and if we clearly state that <b> and <small> have only
> typographical semantics, while different elements are provided to
> differentiate the degree of emphasis of a phrase, an assistive user agent
> could support a better behaviour, while any author disregarding semantics
> would not cause any trouble (the <b> and <small> wrapped alternating
> characters example may be unrealistic, but a paragraph could actually start
> with a bold and bigger first letter using <b> and <font> instead of style
> sheets).


With luck, it will safely barf and speak out the text normally.  After all,
I don't think there *is* a way to aurally distinguish individual letters
within a single word, especially not without at least a child's level of
speech comprehension (and of course aural browsers are nowhere near that).
Even if that abortion of markup *did* have a legitimate semantic purpose, I
simply don't think there's a good way to represent that aurally, and so it
should be ignored.
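
As a rough illustration of what "speak out the text normally" could amount
to, here is a minimal Python sketch that drops every tag and keeps only the
character data, so letters from adjacent inline elements join back into one
word.  The names PlainTextExtractor and spoken_text are made up for this
example, and it says nothing about how any real aural browser works.

```python
from html.parser import HTMLParser


class PlainTextExtractor(HTMLParser):
    """Collects character data and ignores every tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Tags such as <b> and <small> never reach this method; only the
        # text between them does, so adjacent letters end up joined.
        self.chunks.append(data)


def spoken_text(markup):
    """Return the plain text a reader could fall back to speaking."""
    parser = PlainTextExtractor()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


print(spoken_text("<b>M</b><small>E</small><b>S</b><small>S</small>"))  # MESS
print(spoken_text("<p><b>W</b>hen I was young...</p>"))  # When I was young...
```

Under this fallback neither the alternating-letter word nor the bold first
letter loses any text, whatever semantics <b> and <small> end up with.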

And again, if a new element has a default style (visually or aurally), it
will be abused.  If we said today that <b> was completely stylistic and is
deprecated, but <offset> was introduced with the meaning of text that is
somehow offset from the surrounding text and has a default rendering of
bolding its contents, you'll see
<offset>M</offset><legalese>E</legalese><offset>S</offset><legalese>S</legalese>es
still getting created on the wild web, whether by authors who'd heard that
<b> was being replaced by <offset> (and <small> by <legalese>) but didn't
get the memo talking about *why* the new elements were introduced, or by
non-coders using an automated authoring tool providing those tags in an
attempt to help its users be semantic.  If they want small letters, they'll
get small letters, and if there's a button on the panel that makes letters
small, they'll do that rather than poking around in CSS files, semantics be
damned.  If you think this is baseless speculation, witness the horrible
misuses of <strong> and <em> currently on the web to mark up headings and
quotes and such, just because those are the buttons on the authoring tool
that make text bold and italic.

In other words, it's futile to expect that any new element will be free from
the abuses that <b> and friends have received.  Proposing a new element
nearly identical to them really needs a stronger justification than "This
time, it'll only be used semantically!".

>>    But, you're right, we have to deal with backward compatibility,
>>    and redefining the <small> and <b> semantics can be a good
>>    compromise, since a new element would face some heavy concerns,
>>    mainly related to rendering and to the state of the art
>>    implementations in non-visual user agents (and the alike).
>>
>>    However, I think that a solution, at least partial, can be found
>>    for the rendering concern (and I'd push for this being done
>>    anyway, since there are several new elements defined for HTML 5).
>>    Most user agents are capable of interpreting a DTD to some extent, so
>>    it could be worth the effort to define an HTML 5 specific DTD in
>>    addition to the parsing rules - which aim to overcome all
>>    problems arising from previous DTD-only HTML specifications - so
>>    that a non-HTML5-fully-compliant browser can somehow interpret any
>>    new elements. The HTML 5 doctype declaration could accept a DTD just
>>    for backward compatibility purposes, and any fully compliant user
>>    agent would just ignore such a DTD. More specifically, such a DTD
>>    could define default values for some attributes, such as the style
>>    attribute (to have any new element properly rendered - some
>>    assistive technologies are capable of interpreting style sheets too),
>>    and, anyway, there should be a way, in SGML, to create an alias
>>    for an element (e.g., a new element - let's call it <incidental> -
>>    could be aliased to <small> for better compatibility).
>>
>>
>> HTML5 is no longer an SGML language.
>>
>
>
> I know, and agree with the basic reasons; however I think that deriving an
> SGML version (i.e. by adding new entities and elements, as needed, to an
> HTML 4 DTD) should not be very difficult, and could be worth the effort
> (e.g. to gracefully degrade the presentation of a menu element intended as a
> context menu, whose content should not be shown until a right click happens
> - if the UA cannot handle it, not showing it at all could be a reasonable
> behaviour). The derived SGML version should be aimed just at older
> browsers, while "newer", HTML 5-aware ones should just ignore any DTD
> reference. I'd consider this option, at least in passing - I suspect that
> the complete break with the earlier SGML specifications might carry
> an undesirable side effect: on one side it solves the problems arising
> from partial SGML support/bad implementation and from browser-specific
> quirks, but on the other side no mechanism is provided to give
> somehow-SGML-based user agents any awareness of the newly
> defined elements.


That seems like a lot of work for little to no reward.  I don't believe that
existing browsers actually *read* the DTDs, but rather simply use them as a
switch for their rendering engines.

Really, the existing browsers aren't SGML based at all; instead, they're
simply html or xml based.
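
To make the "switch" idea concrete, here is a deliberately simplified Python
sketch: it picks a rendering mode from the mere presence of a leading doctype
and never fetches or parses the DTD that the doctype names.  The function
name rendering_mode and the two mode labels are assumptions for illustration;
real browsers apply a considerably longer list of rules.

```python
import re

# Matches a leading doctype declaration.  The DTD it may name is only part
# of the matched text; nothing here downloads or reads it.
DOCTYPE_RE = re.compile(r"\s*<!DOCTYPE\s+html\b[^>]*>", re.IGNORECASE)


def rendering_mode(document):
    """Choose a mode from the doctype's presence alone (hypothetical rule)."""
    return "standards" if DOCTYPE_RE.match(document) else "quirks"


print(rendering_mode("<!DOCTYPE html><p>hello</p>"))
print(rendering_mode("<p>hello</p>"))
```

With a switch like this, adding new element definitions to a DTD would change
nothing about how the page is rendered.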


>>    Let's come to the non-typographical interpretation a present-day UA
>>    may be capable of, as in your example about lynx. This can be a
>>    very good reason to deem <small> a very good choice. But, are we
>>    sure that *every* existing user agent can do that? If the answer
>>    is yes, we can stop here: <small> is a perfect choice. Better:
>>    <small> is all we need, so let's stop bothering each other about
>>    this matter. But if the answer is no, we have to face a number of
>>    user agents needing an update to understand the new semantics for
>>    the <small> tag, and so, if the new semantics can be assumed to be
>>    *surely* reliable only with new/updated UAs (that is, with
>>    those fully compatible with the HTML 5 specification), that's
>>    rather like starting from scratch, and consequently there
>>    is room for a new, more appropriate element.
>>
>>
>> I don't understand.  If some obscure UA can't extract an appropriate
>> meaning from <small> and come up with a device-appropriate rendering, why
>> does that mean we should have a new element?
>>
>
>
> Smylers himself stated that if we had to create HTML from scratch 'small'
> might not be the best name for an element with the semantics he was
> suggesting, but it is a good choice because we are dealing with an evolving
> language and its backward compatibility issues. He also said 'small' is good
> because most non-visual, non-printing user agents, such as textual ones (like
> lynx), are capable of interpreting <small>/<b> in a suitable manner. From this
> point of view, I think that the 'goodness' of <small> might depend on the
> real number of UAs capable of making use of it without any trouble in most
> situations (specifically, the real number of textual/assistive UAs giving
> <small> the same semantics as the HTML 5 spec would redefine); if we could
> find a big enough number of situations (i.e. <small> use cases) where the
> behaviours vary a lot both from UA to UA and, for each UA, from the wanted
> behaviour for the 'new' semantics, we could also conclude that such
> semantics need to be added to every existing UA to be really reliable, so we
> should not discard the idea of a new element for such semantics on the
> grounds of backward compatibility or "state of the art" implementations.
> I understand that such a range of cases might be non-trivial to estimate, but
> for this very reason, taking a conservative position with respect to a worst
> case scenario, I wouldn't disregard the opportunity to create a new element
> with the right semantics, instead of changing/adding semantics to an
> existing one.


I would be interested in finding out what sort of treatment various
"alternative" UAs give to <small> and pals.


>>    Apart from considering that <b> isn't a good choice in such a case
>>    (<strong> or <em> are far better, since they were born with the
>>    proper semantics), personally I hope that in the near future every
>>    user agent (or at least any intended for human interaction) will be
>>    style-sheet compatible (meaning at least capable of deriving
>>    importance and order from style sheets), so that anything related to
>>    presentation, in any presentation medium, would be separable from
>>    content.
>>
>>
>> No, Smylers' example was referencing things that specifically should *not*
>> be marked up with <strong> or <em>.  They're not being emphasized nor are
>> they of greater importance than the rest of the text - they are merely
>> purposely offset from the surrounding text for some reason (besides emphasis
>> or importance).
>>
>
>
> Here it's me who doesn't understand. I think that any reason to offset some
> text from the surrounding text can be reduced to the different degree of
> 'importance' the author gives it, in the same sense as Smylers used in his
> mails (that is, not the importance of the content, but the relevance it gets
> as a focus of attention - he gave the example of the English "small print"
> idiom, and in another mail clarified that "It's less important in the sense
> that it isn't the point of what the author wants users to have conveyed to
> them; it's less important to the message. (Of course, to users any caveats in
> the small print may be very important indeed!)"). From this point of view,
> unless we aimed to use <b> as an intermediate degree of relevance
> between 'normal text' and 'em/strong' (but aren't these enough to attract a
> reader's attention?), redefining its semantics might be redundant and of
> little use. (In my crazy mind, this applies to headings too, since a 'good'
> heading focuses attention on the core subject of the section that follows, so
> it has to be marked as an important piece of text.) Furthermore, I meant
> that <strong> and <em> would have been a better choice than <b> in Smylers'
> examples because their *original semantics* are very close to that
> of "more relevant text/text needing greater attention", while the
> *original semantics* of <b> are very different and need to be redefined for
> this purpose (but we still have possible alternatives to this).


Agree to disagree, I guess.  I don't find "We hope you'll find <b>Product
A</b> to be the best laundry detergent you've ever used!" to be denoting
emphasis or importance, really.  <i> has more obvious non-emphatic uses,
such as marking up foreign-language words, Linnaean classifications, and
such.


> Anyway, I'm not against a possible redefinition of <b> and <small>
> semantics, but just aiming to explore every alternative in depth (such as
> introducing new elements) while the specifications are in their draft state.
> I'm just trying to give an alternative point of view with some valid
> arguments, if I can find some, nothing more (and I hope I'm not giving a
> different impression). Best regards.


No problem.  Just because I disagree doesn't mean we can't argue.  ^_^

~TJ