[whatwg] Considering a lang- attribute prefix for machine translation and intelligibility
Charles Pritchard
chuck at jumis.com
Wed May 2 12:01:14 PDT 2012
On 5/2/12 11:46 AM, Benjamin Hawkes-Lewis wrote:
> On Wed, May 2, 2012 at 6:59 PM, Charles Pritchard<chuck at jumis.com> wrote:
>>> If you do expect that, have you evaluated the existing mechanisms for
>>> embedding custom data in the page and found them wanting? If so, how?
>> 1. New features won't fix Google Translate bugs with existing
>> features, and it's more efficient for Google to fix Translate than for
>> the community to design, specify, and implement new features.
New features do allow services to coalesce around standards. That's what
the standards are here for.
HTML5 just added a translate attribute.
Span does not in and of itself signify any semantic meaning. Doesn't
that mean that Google Translate is operating correctly?
>> 2, 3, and 4: Given an appropriate vocabulary, existing mechanisms can
>> encode unambiguous meanings, information about how text should be
>> spoken, and phrase and sentence boundaries. Unicode describes
>> character boundaries.
Boris brought up that the concept of letter could use some attention:
http://lists.w3.org/Archives/Public/www-style/2011Nov/0055.html
Yes, we have existing XML mechanisms for how text should be spoken.
What existing mechanism do we have for disambiguation?
>>
>> 5. Tab isn't talking about "data-" here, but about all the various
>> mechanisms available to provide custom data for services to consume
>> (e.g. microdata, microformats, RDFa).
Tab asked directly why data- does not work.
Yes, we have a lot of microformats, it's true. And RDFa.
They don't seem to be gaining traction for these issues, and language
translation seems like a high-level issue appropriate for HTML. Again,
look at the translate and lang attributes; those are baked into HTML.
I am approaching the "lang-" proposal as language agnostic, much as
"aria-" is language agnostic.
This seems to be where we are currently:
<img lang="es" translate="no" alt="No" />
With alt having ARIA counterparts.
I'm suggesting a "lang-" with counterparts to translate, language code,
and a vastly enhanced vocabulary, much as ARIA vastly enhanced the UI
vocabulary. I think it could help in the long run.
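As a rough sketch only: the first element below uses the lang and
translate attributes that exist today. The second shows what prefixed
counterparts might look like. The names lang-code and lang-translate
are hypothetical, made up here for illustration; nothing defines them.

```html
<!-- Today: language code and translate are separate, unprefixed attributes. -->
<span lang="es" translate="no">No</span>

<!-- Hypothetical "lang-" counterparts. These attribute names are
     illustrative assumptions and are not defined in any specification. -->
<span lang-code="es" lang-translate="no">No</span>
```

The enhanced vocabulary would sit alongside these under the same
prefix, much as aria- groups its attributes.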
-Charles