[whatwg] Web API for speech recognition and synthesis

Thu Dec 3 11:31:53 PST 2009

I was not thinking of raw access to the mic. I was just thinking of a
two-step method, so that you could do only the first step if that is all
you need :)

I was thinking of something like:

        1. Call a sound API and ask permission to record (maybe
        something like the geolocation prompt in Firefox [1]).

        2. Pass the recording to speech recognition (speech-to-text),
        or save it, or stream it, or whatever.

This way one could record audio and do something else with it, like save
or stream it. Anyone who wants to transcribe it into text just does the
next step.
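
To make the two steps concrete, here is a rough sketch in TypeScript. Every
name in it (AudioClip, Recorder, recordThen, the fake recorders, and the
placeholder transcript) is made up for illustration; none of it is a real
browser API, and a real design would likely be asynchronous.

```typescript
// Hypothetical sketch only: no name below belongs to an existing API.

// Result of step 1: a recorded clip. The encoding travels with the data so
// that whoever consumes the clip knows how to interpret it.
interface AudioClip {
  encoding: string; // made-up label, e.g. "audio/x-hypothetical"
  samples: number[];
}

// Step 1: something that asks the user for permission and records.
// Returns null when the user refuses.
interface Recorder {
  record(): AudioClip | null;
}

// Step 2 (optional): anything that takes a clip and does something with it.
type ClipConsumer<T> = (clip: AudioClip) => T;

// Run step 1, then hand the clip to whatever step 2 the caller picked.
function recordThen<T>(recorder: Recorder, consume: ClipConsumer<T>): T | null {
  const clip = recorder.record();
  return clip === null ? null : consume(clip);
}

// Stand-ins so the sketch runs without a microphone.
const fakeRecorder: Recorder = {
  record: () => ({ encoding: "audio/x-hypothetical", samples: [1, 2, 3] }),
};
const denyingRecorder: Recorder = { record: () => null };

// Two interchangeable second steps: "save" (here it only reports how many
// samples it would store) and "recognize" (returns a placeholder transcript).
const save: ClipConsumer<number> = (clip) => clip.samples.length;
const recognize: ClipConsumer<string> = (_clip) => "placeholder transcript";
```

With this shape, recordThen(fakeRecorder, save) and
recordThen(fakeRecorder, recognize) share the same first step, and a refused
permission prompt (denyingRecorder) yields null for either one.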

[1]: http://www.mozilla.com/en-US/firefox/geolocation/

-- 
Diogo Resende <dresende at thinkdigital.pt>
ThinkDigital

On Thu, 2009-12-03 at 12:30 -0500, Fergus Henderson wrote:
> On Thu, Dec 3, 2009 at 7:32 AM, Diogo Resende
> <dresende at thinkdigital.pt> wrote:
>         I agree 100%. Still, I think the access to the mic and the
>         speech recognition could be separated.
> 
> While it would be possible to separate access to the microphone and
> speech recognition, combining them allows the API to abstract away
> details of the implementation that would otherwise have to be exposed,
> in particular the audio encoding(s) used, and whether the audio is
> streamed to the recognizer or sent in a single chunk.  If we don't
> provide general access to the microphone, the speech recognition API
> can be simpler, implementors will have more flexibility, and
> implementations can be simpler and smaller because they won't have to
> deal with conversions between different audio encodings.
> 
> So I'm in favour of not separating out access to the microphone, at
> least in v1 of the API.
> 
> -- 
> Fergus Henderson <fergus at google.com>