[Asterisk-Dev] Re: Asterisk and Sphinx integration

Tue Mar 22 13:10:52 MST 2005

Hello John,

I'll see if I can create a few diffs and post them on the site you suggested.

The integration wasn't entirely straightforward: Sphinx's real-time decoder expects 16 kHz, 16-bit PCM, whereas Asterisk delivers its audio feed through the EAGI interface at 8 kHz. To double the sampling rate, I simply inserted a crude average between successive samples.
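
As a rough illustration of that sample doubling, here is a minimal sketch. It assumes signed 16-bit PCM held in memory; the function name upsample_2x is hypothetical and is not taken from the actual patch. Note that this is plain linear interpolation with no low-pass filtering.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Double the sampling rate of 16-bit PCM by inserting, after each input
 * sample, the average of that sample and the next one.
 * `out` must have room for 2 * n samples. Returns the number written.
 * upsample_2x is a hypothetical name, not part of Sphinx or Asterisk.
 */
size_t upsample_2x(const int16_t *in, size_t n, int16_t *out)
{
    for (size_t i = 0; i < n; i++) {
        /* The last sample has no successor, so it is simply repeated. */
        int32_t next = (i + 1 < n) ? in[i + 1] : in[i];
        out[2 * i] = in[i];
        out[2 * i + 1] = (int16_t)((in[i] + next) / 2);
    }
    return 2 * n;
}
```

When the audio arrives in chunks, the last sample of one chunk would ideally be averaged with the first sample of the next; this sketch ignores that boundary and repeats the final sample instead.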

I modified the livepretend program (for which I'll provide the patches) included in the Sphinx distribution to create a daemon that listens on a specific port and creates a thread for each connection to deal with the recognition process.
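
The general shape of such a daemon (one listening socket, one thread per connection) might look like the sketch below. This is not the modified livepretend code: the names make_listener, handle_client and serve are hypothetical, and the decoding steps are left as placeholder comments because they depend on the Sphinx API.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a TCP socket listening on all interfaces; returns -1 on error. */
int make_listener(uint16_t port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    int yes = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Per-connection worker. The recognition steps are placeholders. */
static void *handle_client(void *arg)
{
    int client = *(int *)arg;
    free(arg);

    int16_t pcm[1024];
    ssize_t got;
    while ((got = read(client, pcm, sizeof pcm)) > 0) {
        /* Placeholder: upsample, detect silence, feed the decoder. */
    }
    /* Placeholder: write the hypothesis text back before closing. */
    close(client);
    return NULL;
}

/* Accept connections forever, starting one detached thread per client. */
void serve(int listener)
{
    for (;;) {
        int *client = malloc(sizeof *client);
        if (!client)
            continue;
        *client = accept(listener, NULL, NULL);
        if (*client < 0) {
            free(client);
            continue;
        }
        pthread_t tid;
        if (pthread_create(&tid, NULL, handle_client, client) != 0) {
            close(*client);
            free(client);
            continue;
        }
        pthread_detach(tid);
    }
}
```

The client descriptor is passed on the heap so that each thread owns its own copy; passing the address of a loop variable would race with the next accept().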

I had to add silence detection code to start and stop the recognition process and have the Sphinx software produce a hypothesis of the speech. I then send the hypothesis result to Asterisk and close the connection.
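
For readers unfamiliar with this kind of endpointing, a very simple energy-based detector could be sketched as follows. This is an assumption about the general approach, not the code from the patch; the struct, the function names and the threshold values are all illustrative.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* All names and thresholds here are illustrative assumptions. */
enum vad_event { VAD_NONE, VAD_SPEECH_START, VAD_SPEECH_END };

struct vad {
    int32_t threshold;      /* mean |sample| above this counts as speech */
    int hangover_frames;    /* silent frames required to end an utterance */
    int in_speech;          /* nonzero while inside an utterance */
    int silent_run;         /* consecutive silent frames seen so far */
};

/* Mean absolute amplitude of one frame of 16-bit PCM. */
static int32_t frame_level(const int16_t *frame, size_t n)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += labs((long)frame[i]);
    return n ? (int32_t)(sum / (int64_t)n) : 0;
}

/* Feed one frame; report when an utterance starts or ends. */
enum vad_event vad_feed(struct vad *v, const int16_t *frame, size_t n)
{
    int loud = frame_level(frame, n) > v->threshold;

    if (!v->in_speech) {
        if (loud) {
            v->in_speech = 1;
            v->silent_run = 0;
            return VAD_SPEECH_START;
        }
        return VAD_NONE;
    }
    if (loud) {
        v->silent_run = 0;
        return VAD_NONE;
    }
    if (++v->silent_run >= v->hangover_frames) {
        v->in_speech = 0;
        v->silent_run = 0;
        return VAD_SPEECH_END;
    }
    return VAD_NONE;
}
```

On VAD_SPEECH_START the caller would begin passing frames to the decoder, and on VAD_SPEECH_END it would ask for the hypothesis. The hangover count keeps short pauses between words from ending the utterance early.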

I only had to modify the "eagi-sphinx-test.c" sample program provided in the Asterisk distribution to change the IP address and port to the one used by the Sphinx daemon, described above.

I had amazing results with synthesized speech (not surprising, I suppose), using the Cepstral voice for Linux (William). I use this in conjunction with Asterisk to create customized voice prompts on the fly; I wrote a little wrapper that lets me call it from within a dialplan. I created some test samples and had Sphinx recognize them. I repeatedly got near-perfect recognition.

The live tests by dialing into Asterisk weren't as successful. It looks like some amplitude normalization / Automatic Gain Control code may need to be added. I'll play with that over the next little while to see if I can improve things.
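
One simple form of amplitude normalization is to scale each buffered utterance so that its peak reaches a fixed target. The sketch below shows that idea only; it is not a true AGC (which would adjust the gain smoothly over time), and the name normalize_peak and its parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Scale a buffer so its peak reaches target_peak, limiting the gain to
 * max_gain so that near-silent buffers are not blown up into noise.
 * Returns the gain actually applied.
 * normalize_peak is a hypothetical helper, not part of Asterisk or Sphinx.
 */
double normalize_peak(int16_t *buf, size_t n, int16_t target_peak, double max_gain)
{
    int32_t peak = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t a = buf[i] < 0 ? -(int32_t)buf[i] : buf[i];
        if (a > peak)
            peak = a;
    }
    if (peak == 0)
        return 1.0;   /* all zeros: nothing to scale */

    double gain = (double)target_peak / (double)peak;
    if (gain > max_gain)
        gain = max_gain;

    for (size_t i = 0; i < n; i++) {
        double s = buf[i] * gain;
        /* Clamp to the 16-bit range before converting back. */
        if (s > 32767.0) s = 32767.0;
        if (s < -32768.0) s = -32768.0;
        buf[i] = (int16_t)s;
    }
    return gain;
}
```

For scale, a signal that is 5 dB down needs a linear gain of about 1.78 to recover, so a modest max_gain would already cover that case.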

I experimented with IBM's ViaVoice for Linux a few years ago (yes, I know they no longer sell or support it) and got better results than I am getting with Sphinx at the moment. If I can dig up a copy of ViaVoice, I'll try integrating it too and see how well it performs.

Regards,

Stephan.

At 8:35 AM -0500 on 3/22/05, Stephan A. Edelman wrote:

>Hello all,
>
>I've integrated Asterisk (CVS) and Sphinx 3.5 with a bit of hacking.
>I have it working correctly but only for a locally connected POTS phone
>on the FXS port of a TDM4400P (with 2 FXS / 2 FXO modules).
>
>When I call into Asterisk over the PSTN, the recognition rate is very
>poor or non-existent. I've determined that this is because the
>amplitude of the audio data fed to Sphinx is very low (at least 5dB
>down compared to when a POTS call is made).
>
>Are there any AGC adjustments or otherwise manual volume adjustments
>that can be made on the hardware?
>
>My other question concerns the Sphinx recognition database. It appears
>it only recognizes, ABC...Z (spelled out), numbers, months, YES, NO,
>etc.
>
>Does anyone have a more elaborate database that includes CUSTOMER
>SERVICE, SUPPORT, SALES, etc.
>
>Any pointers would be greatly appreciated.
>
>Regards,
>
>Stephan.
>

Stephan -

While this is not a direct answer to your question, could you post your methods and/or patches onto the voip-info.com wiki? I, and many other people, have been struggling to get Sphinx installed correctly at all, and I'm sure that once we get it working, more dictionaries would appear simply by virtue of more people working on the project.

JT
