[asterisk-dev] Speech Recognition

Fri Oct 13 08:22:44 MST 2006

Hello,

I've been playing with the Lumenvox ASR engine (http://www.lumenvox.com)
and it works quite well with Asterisk.

However, I have noticed several problems with the Speech API:

(1)     After loading another grammar, the API still appears to be
processing voice frames from a previous recognition cycle. This is
noticeable because SpeechBackground() goes silent immediately when
called, without playing the sound file, even after SpeechStart() has
been called again. I was able to correct this problem by adding
ast_clear_flag(speech, AST_SPEECH_QUIET); to the ast_speech_start()
function in res/res_speech.c, as follows:

void ast_speech_start(struct ast_speech *speech)
{
        /* Clear any flags that may affect things */
        ast_clear_flag(speech, AST_SPEECH_SPOKE);
        ast_clear_flag(speech, AST_SPEECH_QUIET);

        /* If results are on the structure, free them since we are starting again */
        if (speech->results != NULL) {
                ast_speech_results_free(speech->results);
                speech->results = NULL;
        }

        /* If the engine needs to start stuff up, do it */
        if (speech->engine->start != NULL) {
                speech->engine->start(speech);
        }

        return;
}

Can someone confirm that this is a reasonable fix and introduce it into
the development tree?

(2)     SpeechCreate() does not seem to report the correct status
(${SPEECH(status)}) when using the Lumenvox engine and there are no more
available licenses. This causes a subsequent call to
SpeechActivateGrammar() to drop the call. In fact, it doesn't make sense
to me at all to have any of the Speech...() functions return -1 and
cause the call to be hung up. This doesn't allow for any DTMF fallback
schemes. 

Can we have the Speech...() functions in apps/app_speech_utils.c set the
${SPEECH(status)} variable appropriately instead? Any comments?
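
To illustrate the behaviour I have in mind, here is a rough standalone
sketch. It is not actual Asterisk code: struct demo_channel, the demo_*
functions and the "OK"/"ERROR" status strings are all hypothetical
placeholders, and the real status values may well differ. The point is
only that a failure is recorded for the dialplan rather than returned
as -1.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a channel: it only holds the status value. */
struct demo_channel {
        char speech_status[16];
};

/* Hypothetical engine hook: 0 on success, -1 when no license is free. */
static int demo_engine_create(int licenses_free)
{
        return licenses_free > 0 ? 0 : -1;
}

/*
 * Hypothetical application function. It always returns 0 so that the
 * dialplan keeps running; success or failure is reported through the
 * status field only (placeholder values "OK" and "ERROR").
 */
static int demo_speech_create(struct demo_channel *chan, int licenses_free)
{
        const char *status;

        assert(chan != NULL);
        status = (demo_engine_create(licenses_free) == 0) ? "OK" : "ERROR";
        snprintf(chan->speech_status, sizeof(chan->speech_status), "%s", status);
        return 0;
}

/* Helper a caller could use to decide whether to fall back to DTMF. */
static int demo_speech_ready(const struct demo_channel *chan)
{
        assert(chan != NULL);
        return strcmp(chan->speech_status, "OK") == 0;
}
```

With something along these lines, the dialplan could check the status
after SpeechCreate() and branch to a DTMF menu when no license is
available, instead of the call being dropped.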

(3)     SpeechBackground(Sound File|timeout) should treat a zero timeout
as meaning "timeout immediately after playing the sound file". This
allows you to call SpeechBackground() back-to-back without any delay.
Presently, a zero timeout means that it waits indefinitely for the
user's voice response. If no timeout is specified as a parameter then it
should behave with an indefinite timeout. 
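
The distinction I am proposing could be sketched roughly as follows.
This is a standalone illustration, not the existing
SpeechBackground() code: DEMO_TIMEOUT_NONE, demo_parse_timeout() and
demo_timed_out() are made-up names, and the timeout is kept in
whatever unit the argument is given in.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical marker meaning "no timeout given, wait indefinitely". */
#define DEMO_TIMEOUT_NONE (-1L)

/*
 * Parse the optional timeout argument.
 * A missing, empty, non-numeric or negative argument means "wait
 * indefinitely"; an explicit "0" is kept as 0.
 */
static long demo_parse_timeout(const char *arg)
{
        char *end = NULL;
        long value;

        if (arg == NULL || *arg == '\0')
                return DEMO_TIMEOUT_NONE;

        value = strtol(arg, &end, 10);
        if (end == arg || value < 0)
                return DEMO_TIMEOUT_NONE;

        return value;
}

/*
 * Decide whether the wait is over. "elapsed" is the time since the
 * sound file finished playing, in the same unit as the timeout.
 */
static int demo_timed_out(long timeout, long elapsed)
{
        assert(elapsed >= 0);

        if (timeout == DEMO_TIMEOUT_NONE)
                return 0;       /* never times out */

        return elapsed >= timeout;      /* timeout 0 expires at once */
}
```

Under this scheme an explicit zero timeout expires as soon as the
sound file has finished, while an omitted timeout never expires.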

Any comments?

Regards,

Stephan. 
--

Stephan A. Edelman, B.Eng.

NewAce Corporation

Toll Free: 1-877-463-9223 x221

International: +1 519 336 4837 x221 (Outside US & Canada)

Fax: +1 519 336 4046

Cell: +1 519 346 1581
