From nshmyrev at gmail.com  Sun May 24 05:41:14 2020
From: nshmyrev at gmail.com (Nickolay Shmyrev)
Date: Sun, 24 May 2020 13:41:14 +0300
Subject: [asterisk-speech-rec] Module for Vosk speech recognition in
	Asterisk available
Message-ID: <CD95F3FF-C183-46E7-A531-5FE861F868D5@gmail.com>

Hey, it is the right time to revive this list.

We have just released a module for speech recognition in Asterisk using Vosk server:

https://github.com/alphacep/vosk-asterisk

[Vosk server](https://github.com/alphacep/vosk-server) is an open source speech recognition server that supports several protocols (WebSocket, gRPC). You can run it from a Docker image and transcribe speech in English, Chinese, or Russian. For example, to start the English model:

docker run -d -p 2700:2700 alphacep/kaldi-en:latest

Other models, such as Spanish, are available on request. Other nice things about Vosk:

• Implements very accurate speech recognition with modern neural networks, much more accurate than pocketsphinx or other public ASR toolkits (those are usually trained on wideband audio and do not work well for telephony).

• Provides a streaming API for the best user experience: you can process partial results and give users instant answers.

• Allows quick reconfiguration of vocabulary and grammars for the best accuracy.

• Supports speaker identification in addition to plain speech recognition.

Unlike UniMRCP, Vosk server needs little configuration and works over a simple WebSocket protocol.
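To give an idea of what talking to the server over WebSocket can look like, here is a minimal Python sketch. It is not taken from the module; it assumes the message format used in the vosk-server example clients (an optional JSON `config` message, binary PCM chunks, one JSON reply per chunk with either a `partial` or a `text` field, and a closing `{"eof" : 1}` message). The chunk size, the 8 kHz sample rate, the URL, and the helper names `split_audio`, `parse_message` and `transcribe` are assumptions made for illustration.

```python
import json

CHUNK_BYTES = 8000  # assumed chunk size: 0.5 s of 8 kHz 16-bit mono audio


def split_audio(pcm: bytes, chunk_bytes: int = CHUNK_BYTES):
    """Yield consecutive chunks of raw PCM audio."""
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


def parse_message(raw: str):
    """Classify a server reply as ("partial", text), ("final", text) or ("other", "")."""
    msg = json.loads(raw)
    if "partial" in msg:
        return "partial", msg["partial"]
    if "text" in msg:
        return "final", msg["text"]
    return "other", ""


async def transcribe(pcm: bytes, url: str = "ws://localhost:2700",
                     sample_rate: int = 8000):
    """Stream audio to the server and return the list of final texts."""
    import websockets  # third-party package: pip install websockets

    finals = []
    async with websockets.connect(url) as ws:
        # Assumed config message; tells the server the audio sample rate.
        await ws.send(json.dumps({"config": {"sample_rate": sample_rate}}))
        for chunk in split_audio(pcm):
            await ws.send(chunk)
            kind, text = parse_message(await ws.recv())
            if kind == "partial":
                print("partial:", text)  # available before the utterance ends
            elif kind == "final":
                finals.append(text)
        # Signal end of stream and collect the last result.
        await ws.send('{"eof" : 1}')
        kind, text = parse_message(await ws.recv())
        if kind == "final":
            finals.append(text)
    return finals
```

With the Docker container from above running, something like `asyncio.run(transcribe(pcm))` on raw 16-bit mono PCM bytes should return the recognized text, provided the assumptions about the message format hold for your server version.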

It is also possible to forward audio through AMI/ARI/AGI and process it in a separate web application, but in the long term you would have to reimplement all of Asterisk on top of Stasis yourself, so we don't consider that a viable way to implement a voice interface.

In the long term, the best way to handle user input with a natural user experience is to process it asynchronously. Asynchronous processing requires something event-based and more complicated than the current Asterisk speech API, so we might implement more complex speech modules in the future.
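As a rough illustration of what "event-based" means here, the following Python sketch reacts to recognition events as they arrive instead of blocking until a final result. It is purely hypothetical and unrelated to the Asterisk code: the `recognizer` coroutine is a stand-in that replays canned events, and the event names (`partial`, `final`, `end`) and the "operator" keyword rule are invented for the example.

```python
import asyncio


async def recognizer(events: asyncio.Queue, messages):
    """Stand-in for a speech backend: pushes (kind, text) events onto the queue."""
    for kind, text in messages:
        await events.put((kind, text))
        await asyncio.sleep(0)  # yield control, as a real network read would
    await events.put(("end", ""))


async def dialog(events: asyncio.Queue, log: list):
    """React to each event as it arrives instead of waiting for the final result."""
    while True:
        kind, text = await events.get()
        if kind == "end":
            break
        if kind == "partial" and "operator" in text:
            # Act on a partial hypothesis before the caller stops talking.
            log.append("early action: transfer to operator")
        elif kind == "final":
            log.append("final: " + text)


async def run_demo(messages):
    """Run the producer and the consumer concurrently and return the action log."""
    events = asyncio.Queue()
    log = []
    await asyncio.gather(recognizer(events, messages), dialog(events, log))
    return log
```

The point of the design is that the dialog logic consumes a stream of events, so it can act on a partial hypothesis, which a single blocking call that only returns a final result cannot express.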

The module integrates Vosk with the Asterisk Speech API, so dialplan integration is really easy:

[internal]
exten = 1,1,Answer
same = n,Wait(1)
same = n,SpeechCreate
same = n,SpeechBackground(hello)
same = n,Verbose(0,Result was ${SPEECH_TEXT(0)})

We also have a module for FreeSWITCH.

Comments, opinions, and test reports are welcome.