[whatwg] Web API for speech recognition and synthesis

Diogo Resende dresende at thinkdigital.pt
Wed Dec 2 11:40:11 PST 2009


If you're able to read from the mic, you don't need to upload. You could
save it locally (for example, for voice memos). The read+upload was just
2 steps I suggested instead of direct streaming. Speech recognition could
be done separately. One app could use the mic to capture a voice note.
Another could use speech recognition without the mic (on a saved file?).
Divide and conquer :)
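
A minimal sketch of that split, assuming purely hypothetical names
(AudioClip, AudioSource, Recognizer, SavedFileSource, StubRecognizer,
saveMemo, transcribe). None of them correspond to an existing browser
API; the point is only that capture and recognition do not need to know
about each other:

```typescript
// Hypothetical types for illustration only; not part of any real browser API.
interface AudioClip {
  samples: Float32Array;
  sampleRate: number;
}

// Anything that can produce audio: a microphone, a saved file, etc.
interface AudioSource {
  read(): AudioClip;
}

// Anything that can turn audio into text: an OS service, a server, etc.
interface Recognizer {
  recognize(clip: AudioClip): string;
}

// A source backed by previously saved audio, so no microphone is involved.
class SavedFileSource implements AudioSource {
  private clip: AudioClip;
  constructor(clip: AudioClip) {
    this.clip = clip;
  }
  read(): AudioClip {
    return this.clip;
  }
}

// Placeholder recognizer that only reports the clip duration.
class StubRecognizer implements Recognizer {
  recognize(clip: AudioClip): string {
    const seconds = clip.samples.length / clip.sampleRate;
    return `<${seconds.toFixed(1)}s of audio>`;
  }
}

// Use case 1: keep a voice memo locally, with no recognition and no upload.
function saveMemo(source: AudioSource, store: Map<string, AudioClip>, name: string): void {
  store.set(name, source.read());
}

// Use case 2: run recognition on any source, whether mic or saved file.
function transcribe(source: AudioSource, recognizer: Recognizer): string {
  return recognizer.recognize(source.read());
}
```

A microphone-backed AudioSource would plug into both functions in the
same way as the saved-file one.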

-- 
Diogo Resende <dresende at thinkdigital.pt>
ThinkDigital

On Wed, 2009-12-02 at 19:17 +0000, Bjorn Bringert wrote:
> I agree that being able to capture and upload audio to a server would
> be useful for a lot of applications, and it could be used to do speech
> recognition. However, for a web app developer who just wants to
> develop an application that uses speech input and/or output, it
> doesn't seem very convenient, since it requires server-side
> infrastructure that is very costly to develop and run. A
> speech-specific API in the browser gives browser implementors the
> option to use on-device speech services provided by the OS, or
> server-side speech synthesis/recognition.
> 
> /Bjorn
> 
> On Wed, Dec 2, 2009 at 6:23 PM, Diogo Resende <dresende at thinkdigital.pt> wrote:
> > I misunderstood too. It would be great to have the ability to access
> > the microphone and record+upload or stream sound to the web server.
> >
> > --
> > D.
> >
> >
> > On Wed, 2009-12-02 at 10:04 -0800, Jonas Sicking wrote:
> >> On Wed, Dec 2, 2009 at 9:17 AM, Bjorn Bringert <bringert at google.com> wrote:
> >> > I think that it would be best to extend the browser with a JavaScript
> >> > speech API intended for use by web apps. That is, only web apps that
> >> > use the speech API would have speech support. But it should be
> >> > possible to use such an API to write browser extensions (using
> >> > Greasemonkey, Chrome extensions etc) that allow speech control of the
> >> > browser and speech synthesis of web page contents. Doing it the other
> >> > way around seems like it would reduce the flexibility for web app
> >> > developers.
> >>
> >> Hmm.. I guess I misunderstood your original proposal.
> >>
> >> Do you want the browser to expose an API that converts speech to text?
> >> Or do you want the browser to expose access to the microphone so that
> >> you can do speech to text conversion in JavaScript?
> >>
> >> If the former, could you describe your use cases in more detail?
> >>
> >> / Jonas
> >
> 
> 
> 