[asterisk-speech-rec] << Dual Mode IVR/ASR >>

jeff quade jjq90 at hotmail.com
Fri Mar 2 09:57:09 MST 2007


Perhaps someone can save me a bit of time with a conceptual issue:

Let's say I want to give a user the option to use either:

1) DTMF Mode (keypad) command entry
-- Loading no grammar
-- Essentially prompting with dialplan app: background(audioPrompt)
-- Interpreting the response within the **dialplan logic**

2) ASR Mode (voice) command entry
-- Loading a VOICE grammar
-- Essentially prompting with dialplan app: speechBackground(audioPrompt)
-- Interpreting the response within the **SpeechEngine**

-or-

3) DUAL Mode (in situations where there is a low SNR)
-- Simultaneously Loading a DTMF grammar and a VOICE grammar
-- Essentially prompting with dialplan app: speechBackground(audioPrompt)
-- Interpreting the response within the **SpeechEngine**
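
To make the three options concrete, here is a rough Python sketch that just tabulates them. The names (Mode, ModeSetup, MODE_SETUP, needs_speech_engine) are my own illustration and not part of Asterisk; the only rule encoded is the one implied above, that a mode involves the SpeechEngine whenever it loads a grammar.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    DTMF = "dtmf"
    ASR = "asr"
    DUAL = "dual"


@dataclass(frozen=True)
class ModeSetup:
    grammars: tuple    # grammars loaded before prompting
    prompt_app: str    # dialplan application used to play the prompt
    interpreter: str   # where the caller's response is interpreted


# Hypothetical summary table; the entries mirror options 1-3 above.
MODE_SETUP = {
    Mode.DTMF: ModeSetup((), "background", "dialplan"),
    Mode.ASR: ModeSetup(("voice",), "speechBackground", "SpeechEngine"),
    Mode.DUAL: ModeSetup(("dtmf", "voice"), "speechBackground", "SpeechEngine"),
}


def needs_speech_engine(mode: Mode) -> bool:
    """Assumption: a mode involves the SpeechEngine iff it loads a grammar."""
    return bool(MODE_SETUP[mode].grammars)
```

Under that assumption only option 1 stays entirely inside the dialplan.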

I see from previous threads that there is some confusion about what is
returned from the SpeechEngine in scenario #3 above, and how and why.

Am I right in assuming that if BOTH grammars are loaded in situation #3:

1) when DTMF is received by speechBackground()
-- The Asterisk “connector” drops out of speech mode and simply accepts DTMF
-- The accepted DTMF is passed as a string of characters to the
SpeechEngine
-- The SpeechEngine simply uses the DTMF grammar for Semantic
Interpretation
-- The semantic result is passed back in ${SPEECH_TEXT(0)}

2) when VOICE is received by speechBackground()
-- The Asterisk connector relays AUDIO data to the SpeechEngine
-- The SpeechEngine uses the VOICE grammar for Semantic Interpretation
-- The semantic result is passed back in ${SPEECH_TEXT(0)}
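
Here is the flow I am assuming, as a toy Python model. This is only my guess at the behaviour, not a description of what Asterisk or any particular engine actually does. The two grammars are made-up lookup tables, and the voice path takes already-recognized text as a stand-in for real recognition. The point is that both paths would end with the same kind of semantic value in ${SPEECH_TEXT(0)}.

```python
# Made-up grammars: each maps a raw input to a semantic tag.
DTMF_GRAMMAR = {"1": "sales", "2": "support"}
VOICE_GRAMMAR = {"sales": "sales", "support": "support", "help": "support"}


def assumed_speech_text_0(kind, payload):
    """Return the value the assumed flow would place in SPEECH_TEXT(0).

    kind    -- "dtmf" or "voice"
    payload -- digit string for DTMF; recognized utterance text for voice
               (a stand-in for the audio relayed to the engine)
    Returns None when the input does not match the relevant grammar.
    """
    if kind == "dtmf":
        # Case 1: digits go to the engine as a string and are interpreted
        # with the DTMF grammar.
        return DTMF_GRAMMAR.get(payload)
    if kind == "voice":
        # Case 2: the utterance is interpreted with the VOICE grammar.
        return VOICE_GRAMMAR.get(payload.strip().lower())
    raise ValueError(f"unknown input kind: {kind!r}")
```

If this is right, the dialplan only ever branches on the semantic tag and never needs to know whether the caller pressed 1 or said "sales".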

If the above is correct (and I'm looking for confirmation), here's the
issue:

Theoretically, you want to keep the SpeechEngine out of anything that can be
done within the dialplan alone, thereby freeing a speech port for more
“important” connections.

So as I see it, by offering a DUAL mode command system you gain the benefit
of:

1) Keeping all the semantic logic in one place (not scattered throughout the
dialplan)
-- i.e. within the loadable grammars

At the expense of:

1) Tying up a speech port for trivial command interaction
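
To put a rough number on that cost, here is a back-of-the-envelope Python sketch. The call mix (8 trivial command calls, 2 calls that really need voice) is invented purely for illustration, and it assumes one port is held by every call that prompts through speechBackground(), which may not match how a given engine licenses ports.

```python
# Assumption: any call in ASR or DUAL mode holds one speech port;
# a call handled with plain dialplan DTMF holds none.
USES_SPEECH_PORT = {"dtmf": False, "asr": True, "dual": True}


def ports_in_use(active_modes):
    """Count the speech ports held by a list of active call modes."""
    return sum(1 for mode in active_modes if USES_SPEECH_PORT[mode])


# Invented call mix: 8 trivial command calls plus 2 genuine voice calls.
trivial_in_dialplan = ["dtmf"] * 8 + ["asr"] * 2
trivial_in_dual_mode = ["dual"] * 8 + ["asr"] * 2

dialplan_ports = ports_in_use(trivial_in_dialplan)
dual_mode_ports = ports_in_use(trivial_in_dual_mode)
```

With that mix, keeping the trivial commands in the dialplan needs 2 ports, while running them in DUAL mode needs 10.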

Does that sound about right?

Cheers-
JJQ
