<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<style><!-- .EmailQuote { margin-left: 1pt; padding-left: 4pt; border-left: #800000 2px solid; } --></style>
</head>
<body>
<font face="Consolas, monospace" size="2">
<div>Currently the W3C Device API WG is working on a Capture API which will include microphone capture and audio streaming capabilities. The current draft is at: <a href="http://dev.w3.org/2009/dap/camera/"><font color="#0000FF"><u>http://dev.w3.org/2009/dap/camera/</u></font></a></div>
<div> </div>
<div>It is pretty rough and still a work in progress; for instance, streaming is not there yet. </div>
<div> </div>
<div>Thanks</div>
<div>Dzung Tran</div>
<div> </div>
<div>On Sun, Dec 13, 2009 at 6:46 PM, Ian McGraw &lt;<a href="mailto:imcgraw@mit.edu"><font color="#0000FF"><u>imcgraw@mit.edu</u></font></a>&gt; wrote:</div>
<div>> I'm new to this list, but as a speech-scientist and web developer, I wanted</div>
<div>> to add my 2 cents. Personally, I believe the future of speech recognition</div>
<div>> is in the cloud.</div>
<div>> Here are two services which provide JavaScript APIs for speech recognition</div>
<div>> (and TTS) today:</div>
<div>> <a href="http://wami.csail.mit.edu/"><font color="#0000FF"><u>http://wami.csail.mit.edu/</u></font></a></div>
<div>> <a href="http://www.research.att.com/projects/SpeechMashup/index.html"><font color="#0000FF"><u>http://www.research.att.com/projects/SpeechMashup/index.html</u></font></a></div>
<div>> Both of these are research systems, and as such they are really just</div>
<div>> proofs of concept.</div>
<div>> That said, Wami's JSONP-like implementation allows Quizlet.com to use speech</div>
<div>> recognition today on a relatively large scale, with just a few lines of</div>
<div>> JavaScript code:</div>
<div>> <a href="http://quizlet.com/voicetest/415/?scatter"><font color="#0000FF"><u>http://quizlet.com/voicetest/415/?scatter</u></font></a></div>
<div>> Since there are a lot of Google folks on this list, I recommend you talk to</div>
<div>> Alex Gruenstein (in your speech group) who was one of the lead developers of</div>
<div>> WAMI while at MIT.</div>
<div>> The major limitation we found when building the system was that we had to</div>
<div>> develop a new audio controller for every client (Java for the desktop,</div>
<div>> custom browsers for iPhone and Android). It would have been much simpler if</div>
<div>> browsers came with standard microphone capture and audio streaming</div>
<div>> capabilities.</div>
<div>> -Ian</div>
<div>></div>
</font>
</body>
</html>