[whatwg] Web API for speech recognition and synthesis

Tue Dec 15 16:43:46 PST 2009

Currently the W3C Device API WG is working on a Capture API which will include microphone capture and audio streaming capabilities. The current draft is at: http://dev.w3.org/2009/dap/camera/

It is pretty rough and still a work in progress; for instance, streaming is not there yet.

Thanks
Dzung Tran

On Sun, Dec 13, 2009 at 6:46 PM, Ian McGraw <imcgraw at mit.edu> wrote:
> I'm new to this list, but as a speech-scientist and web developer, I wanted
> to add my 2 cents. Personally, I believe the future of speech recognition
> is in the cloud.
> Here are two services which provide JavaScript APIs for speech recognition
> (and TTS) today:
> http://wami.csail.mit.edu/
> http://www.research.att.com/projects/SpeechMashup/index.html
> Both of these are research systems, and as such they are really just
> proofs of concept.
> That said, Wami's JSONP-like implementation allows Quizlet.com to use speech
> recognition today on a relatively large scale, with just a few lines of
> JavaScript code:
> http://quizlet.com/voicetest/415/?scatter
> Since there are a lot of Google folks on this list, I recommend you talk to
> Alex Gruenstein (in your speech group) who was one of the lead developers of
> WAMI while at MIT.
> The major limitation we found when building the system was that we had to
> develop a new audio controller for every client (Java for the desktop,
> custom browsers for iPhone and Android). It would have been much simpler if
> browsers came with standard microphone capture and audio streaming
> capabilities.
> -Ian
>
