[whatwg] Web API for speech recognition and synthesis
bringert at google.com
Fri Dec 11 06:05:00 PST 2009
Thanks for the discussion - cool to see more interest today also
recognition and synthesis. It adds a navigator.speech object with
void listen(ListenCallback callback, ListenOptions options);
void speak(DOMString text, SpeakCallback callback, SpeakOptions options);
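To make the shape of these two calls concrete, here is a rough TypeScript sketch. The message names ListenCallback, ListenOptions, SpeakCallback and SpeakOptions but does not define them, so the callback arguments and the `language` option below are assumptions made for illustration. `FakeSpeech` and `echo` are hypothetical helpers, not part of the proposal; the fake stands in for navigator.speech so the sketch runs without a browser or plugin.

```typescript
// Assumed shapes: the message does not define these four types.
interface ListenOptions { language?: string; }
interface SpeakOptions { language?: string; }
type ListenCallback = (utterance: string) => void;
type SpeakCallback = () => void;

// The two method signatures quoted in the message, with DOMString mapped to string.
interface Speech {
  listen(callback: ListenCallback, options: ListenOptions): void;
  speak(text: string, callback: SpeakCallback, options: SpeakOptions): void;
}

// Hypothetical in-memory stand-in for navigator.speech.
class FakeSpeech implements Speech {
  readonly spoken: string[] = [];
  private readonly cannedUtterance: string;

  constructor(cannedUtterance: string) {
    this.cannedUtterance = cannedUtterance;
  }

  // Pretends the user said the canned utterance and reports it immediately.
  listen(callback: ListenCallback, _options: ListenOptions): void {
    callback(this.cannedUtterance);
  }

  // Records the text instead of synthesizing audio, then signals completion.
  speak(text: string, callback: SpeakCallback, _options: SpeakOptions): void {
    this.spoken.push(text);
    callback();
  }
}

// Listen once, then speak the recognized text back to the user.
function echo(speech: Speech, onDone: () => void): void {
  speech.listen((utterance) => {
    speech.speak(`You said: ${utterance}`, onDone, { language: "en" });
  }, { language: "en" });
}

const fake = new FakeSpeech("large pizza");
const events: string[] = [];
echo(fake, () => { events.push("done"); });
```

In a page, the same `echo` function would presumably be handed the real navigator.speech object instead of the fake, provided its callbacks match the shapes assumed here.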
The implementation uses an NPAPI plugin for the Android browser that
wraps the existing Android speech APIs. The code is available at
There are some simple demo apps in
- English to Spanish speech-to-speech translation
- Google search by speaking a query
- The obligatory pizza ordering system
- A phone number dialer
On Fri, Dec 4, 2009 at 2:51 PM, Olli Pettay <Olli.Pettay at helsinki.fi> wrote:
> Indeed the API should be something significantly simpler than X+V.
> Microsoft has (had?) support for SALT. That API is pretty simple and
> provides speech recognition and TTS.
> The API could probably be even simpler than SALT.
> IIRC, there was an extension for Firefox to support SALT (well, there was
> also an extension to support X+V).
> If the platform/OS provides ASR and TTS, adding a JS API for it should
> be pretty simple. X+V tries to handle some logic using VoiceXML FIA, but
> I think it would be more web-like to give a pure JS API (similar to SALT).
> Integrating visual and voice input could be done in scripts. I'd assume
> there would be some script libraries to handle multimodal input integration
> - especially if there will be touch and gesture events too, etc. (Classic
> multimodal map applications will become possible on the web.)
> But this all is something which should possibly be designed in or with the W3C
> multimodal working group. I know their current architecture is way more
> complex, but X+V, SALT and even Multimodal-CSS have been discussed in that
> working group.
> On 12/3/09 2:50 AM, Dave Burke wrote:
>> We're envisaging a simpler programmatic API that looks familiar to the
>> modern Web developer but one which avoids the legacy of dialog system
>> On Wed, Dec 2, 2009 at 7:25 PM, João Eiras <joaoe at opera.com> wrote:
>> On Wed, 02 Dec 2009 12:32:07 +0100, Bjorn Bringert <bringert at google.com> wrote:
>> We've been watching our colleagues build native apps that use
>> APIs that let us do the same in web apps. We are thinking about
>> creating a lightweight and implementation-independent API that lets
>> web apps use speech services. Is anyone else interested in that?
>> Bjorn Bringert, David Singleton, Gummi Hafsteinsson
>> This exists already, but only Opera supports it, although there are
>> problems with the library we use for speech recognition.
>> Would be nice to revive that specification and get vendor buy-in.
>> João Eiras
>> Core Developer, Opera Software ASA, http://www.opera.com/