[whatwg] Spellchecking proposal #2
lachlan.hunt at lachy.id.au
Sat Jun 24 07:02:45 PDT 2006
L. David Baron wrote:
> The problem is that heuristics are only heuristics when they operate
> on input written without knowledge of the heuristics. When the input
> was written with knowledge of the heuristics, they become de facto
> standards.
> Authors will learn what triggers spellchecking (or not) in Mozilla,
> and write whatever markup, however inappropriate, gives the choice of
> spellchecking that they want. Then other browsers will be forced to
> copy whatever Mozilla did.
Theoretically, if the heuristics are written well enough, such that
authors providing accurate information end up with the best usability by
default, that shouldn't happen. If the heuristics are so bad that
authors are left with little choice but to lie to improve the usability,
then, yes, we'd end up with exactly that problem.
However, in reality, I'd have to admit that such good heuristics are
going to take a long time to research and develop well; and, especially
in the early stages, probably won't be accurate enough for authors to
rely on all the time.
> So if we're going to end up with a standard anyway, why not admit it
> and figure out what it should be rather than ending up there
Yes, I'd rather come up with a less harmful solution now, regardless of
semantic purity, than repeat the mistakes of the past and end up with
a more harmful de facto standard.
The main problem with providing an explicit spell checking switch to the
author is the potential for abuse. History has shown that authors will
attempt to disable anything they don't like for any reason whatsoever,
regardless of the usability benefits such features provide for users.
We've seen that already with all of the following:
* IE's smart tags: <meta name="MSSmartTagsPreventParsing" content="True">
* Google AutoLink (some scripts were developed to work around this)
* IE's image toolbar: <meta http-equiv="imagetoolbar" content="no"> and
* AutoComplete (autocomplete="off")
* Showing link URLs in status bar (using window.status)
* Removing browser chrome (in popups)
* View Source (includes attempts to obfuscate source code with JS,
disabling context menus, etc.)
* Disabling printing (Some JS, works in IE only)
* Disabling Save As... (some JS, works in IE only)
* Disabling caching
* And anything else they can get their grubby little hands on!
I could easily imagine authors wanting to disable spell checking simply
because the squiggly red underlines clash with their site's colour scheme.
However, the proposed spellcheck attribute has one major advantage over
all of those: it's being designed to allow the user to easily override
it if they want to. I'd expect the result of that to be that authors
won't bother doing so, unless spell checking really isn't suitable for
the expected input, and it's an edge case where browser heuristics
typically guess wrongly.
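To make that precedence concrete, here is a minimal sketch in Python. It assumes the user's explicit choice always wins, then the author's spellcheck attribute, then whatever the browser's heuristics guess. The function name, the parameter names and the "true"/"false" attribute values are illustrative assumptions, not part of any agreed proposal.

```python
from typing import Optional


def effective_spellcheck(user_override: Optional[bool],
                         author_attr: Optional[str],
                         heuristic_guess: bool) -> bool:
    """Decide whether a field should be spell checked.

    Hypothetical precedence: user override, then the author's
    spellcheck attribute, then the browser's heuristic guess.
    """
    # The user's explicit choice beats everything else.
    if user_override is not None:
        return user_override

    # Next, honour the author's hint if it is a value we recognise
    # ("true"/"false" are assumed values for this sketch).
    if author_attr is not None:
        value = author_attr.strip().lower()
        if value == "true":
            return True
        if value == "false":
            return False
        # Unrecognised values fall through to the heuristic.

    # No usable hint from anyone: trust the heuristic.
    return heuristic_guess
```

With this ordering, an author who sets the attribute to "false" only changes the default; a user who turns checking back on still gets it.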
I'd like to see some research done to find out exactly what kinds of
input authors use <input type="text">, <textarea> and contenteditable
for, beyond those already mentioned earlier in the thread. I'd also
like to see research into the <label>s, name="", id="" and other
identifying information, commonly given to such fields, which can be
used for developing heuristics.
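As a rough illustration of the kind of heuristic such research could feed, the sketch below guesses from a field's name, id and label text whether spell checking is likely to be wanted. The hint word lists are invented for this example; real lists would have to come from the research described above, and the function names are hypothetical.

```python
import re

# Hypothetical hint words, made up for illustration only.
NO_SPELLCHECK_HINTS = {"username", "login", "email", "password",
                       "url", "postcode", "phone"}
SPELLCHECK_HINTS = {"comment", "comments", "message", "body",
                    "description", "subject"}


def tokens(identifier):
    """Split an identifier or label into lowercase words.

    Handles camelCase as well as separators such as "_", "-" and spaces.
    """
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", identifier)
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t]


def guess_spellcheck(name="", id_="", label="", default=True):
    """Guess whether a text field should be spell checked."""
    words = set(tokens(name)) | set(tokens(id_)) | set(tokens(label))
    # Fields that look like identifiers or addresses: do not check.
    if words & NO_SPELLCHECK_HINTS:
        return False
    # Fields that look like free-form prose: check.
    if words & SPELLCHECK_HINTS:
        return True
    # Nothing recognisable: fall back to the caller's default.
    return default
```

A lookup this crude will misfire on plenty of real pages, which is the point made earlier in this message: early heuristics won't be accurate enough for authors to rely on all the time.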
Although accept="" is unlikely to be commonly used for textual input
these days, it would be useful to see research into the kind of
text-based content commonly entered (for which MIME types exist) that
browsers could use to improve their spell checking logic (e.g. ignoring
elements and attributes in textareas accepting text/html or XML).
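For the text/html case, one possible approach is to run the field's contents through an HTML parser and hand only the character data to the spell checker. The sketch below uses Python's standard html.parser module; the class and function names are hypothetical, and skipping the contents of script and style elements is an extra assumption of this sketch.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect character data, ignoring tags, attributes and comments."""

    # Elements whose contents are not prose (an assumption of this sketch).
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        # Attribute values in attrs are deliberately never collected.
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def checkable_words(markup):
    """Return the words in markup that a spell checker should look at."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    # Joining with a space keeps adjacent blocks apart, at the cost of
    # splitting a word that has inline markup in the middle of it.
    return " ".join(parser.chunks).split()
```

Given the input `<p class="intro">Helo <a href="http://example.com/foo">wrld</a></p>`, only "Helo" and "wrld" reach the spell checker; the element names, the class value and the URL do not.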