[asterisk-app-dev] Asterisk and UniMRCP Licensing
Joshua Colp
jcolp at digium.com
Fri Sep 5 12:29:56 CDT 2014
Ben Klang wrote:
<snip>
>
> Is it really required to use res_speech? If so, can we change the
> interfaces that ARI presents?
>
> Over the last few years we’ve evaluated res_speech vs. the various
> UniMRCP applications (SynthAndRecog primarily). We’ve always come to the
> conclusion that the res_speech API either couldn’t give us what we
> needed, or was not as performant. SynthAndRecog isn’t perfect, but it
> does a couple of crucial things, perhaps most importantly the
> combined lifecycle of TTS + ASR so that you can “barge” into a TTS
> playback before it is finished.
The res_speech module and its API are a very thin wrapper over common
speech recognition concepts. It does some helpful things, like handling
transcoding and maintaining a state machine, but otherwise it relies on
the underlying speech technology to do everything. It doesn't provide
anything to the dialplan, nor does it even know about channels.
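To make that concrete, here is a rough sketch of driving the res_speech
C API directly (see include/asterisk/speech.h). The exact signatures
vary between Asterisk versions, so treat this as illustrative rather
than build-ready:

    #include "asterisk.h"
    #include "asterisk/speech.h"
    #include "asterisk/format_cap.h"

    static void recognize(struct ast_format_cap *caps,
                          void *audio, int audio_len)
    {
        struct ast_speech *speech;
        struct ast_speech_result *result;

        /* Create a speech object against a named engine. Note that
         * no channel is involved; the caller owns the media path. */
        speech = ast_speech_new("unimrcp", caps);
        if (!speech) {
            return;
        }

        ast_speech_start(speech);

        /* The caller feeds raw audio; res_speech just brokers it
         * through to the engine and tracks the recognition state. */
        ast_speech_write(speech, audio, audio_len);

        /* Once the engine flags completion, pull the results. */
        if (speech->state == AST_SPEECH_STATE_DONE) {
            result = ast_speech_results_get(speech);
            /* ... inspect result->text and result->score ... */
            ast_speech_results_free(result);
        }

        ast_speech_destroy(speech);
    }

Everything channel-related - getting the audio in, deciding when to
stop - is on the caller's side, which is exactly why the module itself
stays so thin.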
What you probably found limiting was the interface provided to the
dialplan/AGI for speech recognition, where the dialplan applications
drive the interaction. Those applications wouldn't be used from ARI;
we're free to make the interface there whatever we want.
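For reference, the existing dialplan interface looks roughly like this
(the application and function names are real; the grammar, path, and
prompt are placeholders):

    [speech-demo]
    exten => s,1,SpeechCreate()                          ; allocate a speech object on the channel
     same => n,SpeechLoadGrammar(yesno,/tmp/yesno.gram)  ; placeholder grammar
     same => n,SpeechActivateGrammar(yesno)
     same => n,SpeechBackground(please-say-yes-or-no,10) ; prompt plays while recognition runs
     same => n,Verbose(1,Heard "${SPEECH_TEXT(0)}" with score ${SPEECH_SCORE(0)})
     same => n,SpeechDestroy()

The applications block the channel while they run, which is the
pattern that doesn't carry over to ARI.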
During lunch, though, I gave this some more thought, and I think that
speech recognition should always be a passive action on a channel (or,
heck, a bridge). It would sit in the media path, feeding audio to the
speech recognition engine and raising events, but it would never block.
This would allow it to cooperate easily with everything else in ARI
without requiring a developer to create and manage a Snoop channel. It
also keeps the "well, if they start speaking, what do I do" logic out
of Asterisk - it gives that power to the developer.
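To sketch what that might look like from the developer's side (the
endpoint and event below are hypothetical, invented purely to
illustrate the shape of a passive interface):

    POST /ari/channels/{channelId}/speech?engine=unimrcp

    ... media keeps flowing; the application is not blocked ...

    Event on the ARI WebSocket:
    {
      "type": "SpeechResult",
      "channel": { "id": "1409930996.1" },
      "results": [
        { "text": "yes", "score": 97, "grammar": "yesno" }
      ]
    }

The application then decides on its own whether to stop a playback,
start another one, or ignore the result entirely.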
Thoughts?
--
Joshua Colp
Digium, Inc. | Senior Software Developer
445 Jan Davis Drive NW - Huntsville, AL 35806 - US
Check us out at: www.digium.com & www.asterisk.org