[asterisk-speech-rec] Sphinx and AGI integration; Digium vs VoIP gate

praveen kumar pbx.kumar at gmail.com
Mon Dec 21 03:42:55 CST 2009


Sphinx was never written with telephony applications in mind. Last
time I checked, they had WSJ models for 8 kHz audio, but they did a
pretty bad job.

Each port from Nuance costs $500-2000 + maintenance depending on the
grammar size.

So, expect speech to be expensive.
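
On your second question (letting the caller pick DTMF or ASR per
session), one way is a small AGI script that reads one keypress and
stores the choice in a channel variable for the dialplan to branch on.
A minimal sketch, not tested against a live Asterisk. The menu
(1 = DTMF, 2 = ASR), the variable name INPUT_MODE and the function names
are my own assumptions. ANSWER, WAIT FOR DIGIT and SET VARIABLE are
standard AGI commands. Playing a prompt first is left out.

```python
def read_agi_env(stream):
    """Read the 'agi_key: value' lines Asterisk sends when the script starts."""
    env = {}
    while True:
        line = stream.readline().strip()
        if not line:  # a blank line (or end of input) ends the header block
            break
        key, _, value = line.partition(":")
        env[key.strip()] = value.strip()
    return env


def parse_result(response):
    """Return the integer after 'result=' in a '200 result=N' reply, else None."""
    if not response.startswith("200"):
        return None
    for token in response.split():
        if token.startswith("result="):
            try:
                return int(token[len("result="):])
            except ValueError:
                return None
    return None


def agi_command(command, inp, out):
    """Send one AGI command and return its parsed result code."""
    out.write(command + "\n")
    out.flush()
    return parse_result(inp.readline().strip())


def choose_mode(digit_code):
    """Map the ASCII code returned by WAIT FOR DIGIT to an input mode.

    Hypothetical menu: '2' selects ASR; anything else, including a
    timeout (result 0), falls back to DTMF.
    """
    return "asr" if digit_code == ord("2") else "dtmf"


def run_session(inp, out):
    """Answer the call, read one keypress, store the chosen mode."""
    read_agi_env(inp)
    agi_command("ANSWER", inp, out)
    code = agi_command("WAIT FOR DIGIT 5000", inp, out)  # wait up to 5 s
    mode = choose_mode(code)
    # INPUT_MODE is an assumed variable name; the dialplan would test it.
    agi_command('SET VARIABLE INPUT_MODE "%s"' % mode, inp, out)
    return mode
```

From the AGI script itself you would call run_session(sys.stdin,
sys.stdout), since Asterisk talks to the script over those two streams.
The dialplan can then jump to a DTMF or an ASR context depending on
INPUT_MODE.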

Good luck.

On Sun, Dec 20, 2009 at 6:05 AM, johny jj2 <johnyjj2 at gmail.com> wrote:
> Thank you for your answer!
>
> I watched the Lumenvox video tutorials and contacted them.
> Unfortunately their services are much too expensive. I guess the same
> can be said about Invox.
>
> In other words, I need to build this system on my own. Could you
> please answer my original questions?
>
> Let me summarize those:
> 1. How do I connect Asterisk with Automatic Speech Recognition? I
> created formal grammars, the algorithm in the Java source code of an
> application for CMU Sphinx4 ASR, an acoustic model and so on. I found
> two ways of integrating Sphinx with Asterisk:
> http://www.voip-info.org/wiki/view/Sphinx and
> http://scribblej.com/svn/ . But later I found that most of these
> things are done in the dialplan. Do I need Sphinx4 at all?
> Or do I only need the acoustic model for my language created with
> SphinxTrain?
> 2. I'd like the user to be able to choose whether he/she wants to use
> DTMF or ASR in a given session. I thought it should work like this:
> a) the user chooses with DTMF what he/she wants to use, b) based on
> this decision the call switches to the DTMF main algorithm or the ASR
> main algorithm. How do I do this?
>
> Regards!
>
> 2009/12/17 praveen kumar <pbx.kumar at gmail.com>:
>> Hi -
>>
>> You have more than one possibility
>>
>> - you can use Lumenvox and buy their licenses and write a program to do that
>>
>> - The other option I can recommend is to try www.invox.com
>> (Intelligent Voice) and build your phone system there. Since it
>> integrates with REST, HTML - you can simply collect information from
>> caller using dtmf and/or speech and then post these params to the REST
>> page which will output the sum. You can collect the output and use TTS
>> to play it back. It should be 5-10 mins to build this system. The
>> system is hosted - so you don't have to worry about renting servers
>> etc. You pay per call or per min depending on the plan.
>>
>>
>> Thanks.
>>
>> On Thu, Dec 17, 2009 at 12:55 PM, johny jj2 <johnyjj2 at gmail.com> wrote:
>>> Hello!
>>>
>>> I would be very grateful if you can answer my questions, at least with
>>> one sentence :-). Or simply answer e.g. "1-3, 2-1, 3-2" (it is my
>>> choice at this moment) and give short explanation :-).
>>>
>>> I'm familiar with using SphinxTrain and Sphinx4. I'd like to create
>>> such an IVR-ASR system that:
>>> a. the user calls a special number
>>> b. he or she speaks twelve digits
>>> c. the server recognizes the digits, calculates a control sum and
>>> informs the user of this sum
>>> d. the second and third steps are repeated until the user says 'finish'
>>>
>>> There are some things which I should consider:
>>>
>>> ---------------------------------------------------------------------------------------
>>> 1. HOW TO ENABLE ACCESS TO ASTERISK FROM MOBILE PHONE (choice of
>>> hardware and services)
>>> keywords: server, Digium card, SIP/ITSP provider, PSTN/DID number
>>> ---------------------------------------------------------------------------------------
>>>
>>> I've got a server with internet access. Unfortunately this server
>>> runs Windows (but I'm trying my best to convince its admin to switch
>>> to Linux and I may succeed). What should I buy for this server? I
>>> thought about:
>>>
>>> 1-1. http://www.planet.com.tw/en/product/product_ov.php?id=4160 (price
>>> about 230 euro)
>>> 1-2. Digium card (I don't know approximate prices)
>>> 1-3. buying service from SIP provider (what may be the prices of such
>>> a service?)
>>> 1-4. or should I rent server?
>>>
>>> Ad. 1-3:
>>>
>>> I asked companies in my country and only two providers answered me.
>>>
>>> The first one (HaloNet) told me that in order to configure Asterisk
>>> for HaloNet I need: 1. an account (https://www.halonet.pl/rejestracja),
>>> 2. the password to the account, 3. the name for the SIP server
>>> (sip.halonet.pl). Additionally, to test incoming calls, I need a PSTN
>>> number. They told me to register for the service and then send them
>>> an e-mail requesting a test number. They also provided an example
>>> configuration for Asterisk. How do I create or obtain my name for the
>>> SIP server?
>>>
>>> The second one (Ipfon) told me to 1. create an account
>>> (https://rejestrator.ipfon.pl/index.php?version=ipfon_starter&scenario=telefon),
>>> 2. configure a trunk for Asterisk
>>> (http://forum.ipfon.pl/index.php?topic=64).
>>>
>>> I also asked on the Ekiga mailing list (it is not from my country;
>>> http://mail.gnome.org/archives/ekiga-list/2009-December/msg00046.html).
>>> They said they cannot provide what I need. They talked about ITSP
>>> (not SIP) providers and a DID (not PSTN) number. I thought I had
>>> understood that I need a PSTN number from a SIP provider. They said I
>>> need a DID number from an ITSP provider and I'm really confused. So
>>> what exactly do I need?
>>>
>>> In the end I guess it would work like this: user -> mobile phone ->
>>> call -> servers of providers -> network cloud -> my server ->
>>> Asterisk. Am I right?
>>>
>>> Ad 1-4:
>>>
>>> At first I thought about using the server that they can provide me.
>>> Access to a physical, proprietary device would be necessary for 1-1
>>> and 1-2. However, for 1-3 I can consider both options (having my own
>>> server or renting a server from somebody else). It is common to buy
>>> space on a server to host a web page. Are there similar services for
>>> what I'd like to do? In other words, I need a Linux server with
>>> Asterisk and probably Sphinx. The disadvantage of my server is that
>>> it runs Windows and perhaps I will have to use Asterisk on Windows
>>> (however, that is not certain; there is a possibility that I will be
>>> able to convince the administrator to switch to Linux).
>>>
>>> --------------------------------------------------------------------------------
>>> 2. HOW TO ENABLE SPEECH RECOGNITION ON SERVER WITH ASTERISK (choice of software)
>>> keywords: AGI scripts, Sphinx4, ScribbleJ plugin, PocketSphinx
>>> --------------------------------------------------------------------------------
>>>
>>> 2-1. I found this: http://www.voip-info.org/wiki/view/Sphinx . It is
>>> an AGI script to be called from Asterisk. Am I right that all I need
>>> is Asterisk and Sphinx4?
>>> 2-2. I found this: http://scribblej.com/svn/ . What advantage does it
>>> have if it looks like the same can be done much more easily with
>>> 2-1? For this solution it would look like: Asterisk <-> ScribbleJ
>>> plugin <-> Sphinx4 (if it is possible to integrate it with Sphinx4; it
>>> was tested only with PocketSphinx).
>>> 2-3. Are there any other possible ways?
>>>
>>> ------------------------------------------------------------------------------------
>>> 3. WHERE TO SPECIFY ALGORITHM? (Asterisk + Sphinx or Asterisk +
>>> AGI/AEL/LUA scripts)
>>> ------------------------------------------------------------------------------------
>>>
>>> I am also curious how to specify the algorithm of the dialogue.
>>>
>>> 3-1. Formal grammars and source code for Sphinx4 application
>>>
>>> At first I thought about writing an application for Sphinx4. The
>>> application is written in Java and is normally executed as "java -mx256m
>>> -jar bin/ApplicationName.jar". I create: a) an acoustic model (my
>>> language is not English and a model cannot be downloaded from
>>> VoxForge, so I had to create it myself with SphinxTrain), b) a
>>> language model (created with lmtoolkit online), c) formal grammars
>>> (crucial for the algorithm), d) a list of words and a list of
>>> phonemes, e) the main application (Java source code). I create (a)
>>> from (b) and (d) and then I use (c) and (a) for (e).
>>>
>>> 3-2. Dialplan with AEL/LUA script
>>>
>>> But later I talked a little on #asterisk at Freenode. (I installed
>>> Pidgin in order to contact ScribbleJ, the author of the plugin, but I
>>> couldn't reach him after all.) They told me "Implement the logic in
>>> the dialplan. Or if you choose to use an embedded language like AEL or
>>> LUA". So don't I need the Java source code for Sphinx4 at all? Do I
>>> need to have Sphinx4 installed at all :-)? Could you give me a link to
>>> a tutorial on creating these dialplans? Do I still need the formal
>>> grammars from 3-1?
>>>
>>> Thanks very much in advance for your help :-)!
>>> Greetings!
>
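
For step (c) of the original message (recognize twelve digits, compute a
control sum, read it back), the arithmetic itself is small once the
recognizer has produced a word string. The thread does not say which
checksum is meant, so the sketch below ASSUMES an EAN-13 style check
digit (weights 1 and 3 alternating over the twelve digits). The English
digit words and both function names are also just placeholders; a real
system would use the vocabulary of its own acoustic model.

```python
# Placeholder vocabulary: spoken word -> digit character (assumed English).
WORD_TO_DIGIT = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def words_to_digits(hypothesis):
    """Turn a recognizer hypothesis such as 'four zero ...' into '40...'."""
    try:
        return "".join(WORD_TO_DIGIT[word] for word in hypothesis.lower().split())
    except KeyError as unknown:
        raise ValueError("not a digit word: %s" % unknown)


def control_digit(digits):
    """EAN-13 style check digit for a string of exactly twelve digits.

    This particular checksum is an assumption, not something stated in
    the thread.
    """
    if len(digits) != 12 or any(c not in "0123456789" for c in digits):
        raise ValueError("expected exactly twelve digits")
    total = 0
    for index, char in enumerate(digits):
        weight = 1 if index % 2 == 0 else 3  # 1, 3, 1, 3, ... from the left
        total += weight * int(char)
    return (10 - total % 10) % 10


spoken = "four zero zero six three eight one three three three nine three"
print(control_digit(words_to_digits(spoken)))  # prints 1
```

If the recognizer returns fewer or more than twelve words, control_digit
raises ValueError, which is a natural point to ask the caller to repeat
the number.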