[asterisk-users] Simple, fast single-word offline free speech recognition in Asterisk (or as an AGI)?

Jonathan H lardconcepts at gmail.com
Mon Nov 25 23:31:01 CST 2019


I'm after fast, native recognition of the numbers 1 to 20, yes, no, menu
and help.

At the moment, I use Google Speech Recognition which uses no local
processing power, and is very accurate, allowing me to run on a very low
end VPS.

However, because each request is billed at a minimum of 15 seconds, short
utterances such as numbers and "yes" or "no" soon eat up the 60-minute free
allowance.
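
To put a number on that, here is a rough back-of-the-envelope sketch. It
assumes every recognition request is charged as one 15-second block however
short the utterance is; the constant and function names are only illustrative.

```python
# Rough estimate of how many short recognitions fit in the free allowance.
# Assumption: every request is billed as one full 15-second block.
FREE_ALLOWANCE_SECONDS = 60 * 60      # the 60-minute free allowance
BILLED_SECONDS_PER_REQUEST = 15       # minimum charged per request


def free_requests(allowance_seconds: int = FREE_ALLOWANCE_SECONDS,
                  billed_seconds: int = BILLED_SECONDS_PER_REQUEST) -> int:
    """Return how many whole requests fit inside the allowance."""
    return allowance_seconds // billed_seconds


if __name__ == "__main__":
    print(free_requests())  # 240
```

On that assumption the allowance covers about 240 single-word recognitions,
which a menu that asks several questions per call uses up quickly.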

I was hoping I could use local recognition, with a fallback to Google speech
recognition whenever the local result is uncertain.
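
The local-first idea could be sketched roughly as below. This is only an
illustration of the control flow, not a working recogniser: `local` and
`cloud` are hypothetical callables that take raw audio and return a
(word, confidence) pair, the 0.8 threshold is an arbitrary guess, and the
vocabulary assumes numbers come back as digit strings.

```python
from typing import Callable, Optional, Tuple

# Hypothetical recogniser interface: audio bytes in, (word, confidence) out.
Recogniser = Callable[[bytes], Tuple[Optional[str], float]]

# The small vocabulary wanted: 1 to 20, yes, no, menu and help.
# Assumption: numbers are returned as digit strings such as "7".
VOCABULARY = {str(n) for n in range(1, 21)} | {"yes", "no", "menu", "help"}


def recognise(audio: bytes,
              local: Recogniser,
              cloud: Recogniser,
              threshold: float = 0.8) -> Tuple[Optional[str], str]:
    """Try the local recogniser first; use the cloud one only if unsure.

    Returns (word, source) where source is "local" or "cloud".
    """
    word, confidence = local(audio)
    if word is not None:
        word = word.strip().lower()
    # Accept the local answer only if it is a known word and confident enough.
    if word in VOCABULARY and confidence >= threshold:
        return word, "local"
    # Otherwise pay for one cloud request.
    cloud_word, _ = cloud(audio)
    if cloud_word is not None:
        cloud_word = cloud_word.strip().lower()
    return cloud_word, "cloud"
```

An AGI script could call `recognise()` once per prompt, so that only the
uncertain utterances count against the Google allowance.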

Any ideas? Thanks

Yes, I know I posted something similar back in January, but there was no
response then and I was hoping things might have changed :)

On Wed, 16 Jan 2019 at 17:42, Jonathan H <lardconcepts at gmail.com> wrote:

> When I last looked into this a couple of years ago, simple one-word speech
> recognition was rather complex and slow.
>
> At the moment, I use Google Speech Recognition which uses no local
> processing power, and is very accurate and fast, allowing me to run on a
> very low end VPS.
>
> However, because each request is billed at a minimum of 15 seconds, short
> utterances such as numbers and "yes" or "no" soon eat up the 60-minute free
> allowance.
>
> Have things changed much in the last couple of years? I see a couple of
> new "standalone" projects even from the likes of Facebook and Mozilla, but
> they require a degree in C++ and, apparently, about 24 hours to build a
> voice model on a high-end box with the latest graphics cards (for the
> number crunching). Also, unless I'm reading it wrong, each second of speech
> takes 4 seconds to recognise on a low end machine with these standalone
> offerings and similar ones.
>
> https://github.com/facebookresearch/wav2letter
> https://voice.mozilla.org/en
>
> In fact, come to think of it, I really only need offline fast recognition
> of numbers 1 to 20, yes, no, menu and help.
> For voicemail transcription I'm happy to stick with Google's paid service
> as it's remarkably accurate with phone quality speech (beats Microsoft and
> Amazon Transcribe hands down from what I can tell).
>
> Oh, and UniMRCP seems rather complex and the licensing doesn't suit - 99%
> of the time I have one channel (caller) but it can jump to 10 - I don't
> want to have to buy a 10 channel license for that 1 hour a month!
>
> Any ideas? Thanks
>