[whatwg] Authoring Re: several messages about HTML5

Charles McCathieNevile chaals at opera.com
Wed Feb 21 01:47:50 PST 2007

On Wed, 21 Feb 2007 03:40:09 +0100, Lachlan Hunt <lachlan.hunt at lachy.id.au> wrote:

> Vlad Alexander (xhtml.com) wrote:
>> Thank you Ian. Just one follow-up question. You wrote:
>>> ...We could require editors to do this, but since nobody knows how
>>> to do it, it would be a stupid requirement. ...
>> Is it due to a flaw in HTML that it is difficult to build authoring
>> tools, such as WYSIWYG editors, that generate markup rich in
>> semantics, embody best-practices and can be easily used by
>> non-technical people? Since much of the content on the Web is created
>> using such authoring tools, can we ever achieve a semantically rich
>> and accessible Web?
> It's not so much a flaw in HTML's design, as it is the refusal of
> popular WYSIWYG editor vendors to replace common presentational UIs,
> such as font styles and colours, with much more useful semantic UIs.  I
> don't believe it's particularly difficult to achieve.  It just requires
> thinking outside the box a little and not simply copying what typical
> word processing software has done in the past.

(Summary: What Hixie said. But I show my reasoning ;) ).

Hmmm. It is more complex than that, but not much. It is very easy to use Word to create good clean structured content, which can be straightforwardly transformed to good clean semantic HTML (or PDF for that matter). And likewise, it isn't that hard with other tools. Thinking outside the box a little has been a solved problem since before Word was a WYSIWYG tool. Deployment, and getting the right features used, is a different story.
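
To illustrate the kind of transformation meant here, the following is a minimal sketch in Python, not a description of how Word or any real exporter works. It assumes a document has already been reduced to (style name, text) pairs; the style names, the STYLE_TO_TAG table and the to_html function are all hypothetical.

```python
from html import escape

# Hypothetical mapping from named paragraph styles to HTML elements.
# Real word processors and exporters use their own style names.
STYLE_TO_TAG = {
    "Title": "h1",
    "Heading 1": "h2",
    "Heading 2": "h3",
    "Normal": "p",
}


def to_html(paragraphs):
    """Turn (style_name, text) pairs into HTML, one element per line."""
    lines = []
    for style, text in paragraphs:
        tag = STYLE_TO_TAG.get(style, "p")  # unknown styles fall back to <p>
        lines.append(f"<{tag}>{escape(text)}</{tag}>")
    return "\n".join(lines)


doc = [
    ("Title", "Authoring tools"),
    ("Heading 1", "Workflow"),
    ("Normal", "Styles carry structure & meaning."),
]
print(to_html(doc))
```

The point of the sketch is that the mapping only works when the author applied a named style such as "Heading 1". Text that was merely made bold and larger carries no style name to map from.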

The trick is in the workflow, or ecosystem, or whatever you call it. People started out in the 80s learning to press the BOLD button, or the larger font button, to make headings. This was easy - it matched what they had done with a typewriter or more commonly with a pen at school, and the set of semantics available matched the small number of possibilities (CAPS, underlining, bold or italic with a fancy golfball typewriter). As people got printers and desktop publishing, a few people made the crazy multi-font unreadable pages that were all the rage in the mid-80s (just as multi-coloured headings like the Google logo were all the rage in the mid-90s - but seem to have all but vanished now except for that example). Those quickly died out in the serious world, where people used the increased variety in typesetting to create documents whose visual formatting made their semantics very clear. Even today, calls for academic papers will require very strict formatting for photo-ready copy, and there must be millions of documents adhering to the most popular of these formats.

What is lacking, all down the line, is the understanding that the Web is not just a simple visual representation. Most people still don't think of looking at the Web on their phone as "real" (at least in the small group of western countries that dominate development of Web standards). Nor do most people know a lot about accessibility. Most people commissioning content want it to look good on their machine, and assume it looks good to everyone else too (after all, how many people are prepared to admit they have horrible design sense and make things that most people find ugly?).

Which reinforces the initial patterns we have for learning how to use tools. In most places children still learn to write with a pencil, which shapes their idea of conveying semantics. Then they are taught to use a keyboard, and to convey semantics with it, by people who learned a long time ago, when visual formatting was the only meaningful way. And then they get tools for creating the Web. Surprise surprise, in general the people who produce content (and are experts in content and couldn't care less about the underlying structure of the Web) keep doing what they have always done, and look for tools that are easy to use.

We are trying to change the way the world thinks about semantics. Little by little, it happens. Little by little people are realising that accessibility, or being friendly to mobile devices, or being able to repurpose the things they write, or something else, means that noting the underlying semantics is valuable. In a century we have made a little progress in changing things. People use underlining much less than they used to as a highlight, since they have learned that it means a link.

This change of approach to expression takes a long time to filter through to the world. One of the crucial things is working very hard on authoring tools, since only a tiny minority really writes code by hand. The tools themselves are often hand-crafted to produce code, but even that is not a given (and IMHO regarding that as a valid constraint is missing the most important point of the last decade of markup language history). The changes will come (and are coming, bit by bit). But looking for a rapid sea-change is somewhat naive. The old ways will last for decades, come what may. If we spend too much time working out how to make them seem reasonable and valid, we can prolong their life by generations.

The question is how to really promote the things that we want to see - the production of semantically rich and therefore more useful markup. If we are not working with authoring tool developers, we are probably not doing ourselves any favours. If we are not looking at the whole thread that leads to how people learn to express themselves, we're probably going to find that we can never make much progress. (And when we do, we start to deal with things like the semantics of graphics, or of sound, or motion. All of these are critical to people's expression now, in some cases more so than the semantics associated with text documents. And then it becomes interesting...)



 Charles McCathieNevile, Opera Software: Standards Group
 hablo español  -  je parle français  -  jeg lærer norsk
chaals at opera.com          Try Opera 9.1     http://opera.com
