<div dir="ltr"><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Dec 11, 2013 at 5:22 PM, Ben Langfeld <span dir="ltr">&lt;<a href="mailto:ben@langfeld.me" target="_blank">ben@langfeld.me</a>&gt;</span> wrote:<br>
<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">The vast majority of IVR platforms (mainly VoiceXML) permit handling DTMF in a consistent manner to speech recognition, that is by way of a DTMF grammar. Asterisk, to my knowledge, does not currently include an SRGS-based DTMF recognizer.<div>
<br></div><div>FreeSWITCH recently got one as part of mod_rayo, and Chris Rienzo has stated that he believes making a majority of this into a library general enough to be used in Asterisk also would be plausible, leaving limited integration effort in Asterisk.</div>
</div></blockquote><div><br></div><div>Today, Asterisk provides a module, res_speech, which acts as a generic speech recognition engine interface. Asterisk uses this module to give its various user interfaces - notably AGI and the dialplan - a mechanism for manipulating any speech recognition engine that registers itself with the generic engine interface. The overall module stack usually looks something like this (warning, bad ASCII art):<br>
<br></div><div><pre style="font-family:courier new,monospace">
 _____________        __________
| app_speech  |      | res_agi  |
|_____________|      |__________|
       |__________________|
            _____|______
           |            |
           | res_speech |
           |____________|
                 |
                 |
      ___________|_________
     |                     |
     |                     |
 _____________________    ________________
| res_speech_lumenvox |  | res_cepstral   |
|_____________________|  |________________|
</pre></div><div>
<ul><li>app_speech provides the dialplan applications and uses res_speech to send commands to, and otherwise interface with, a speech recognition engine</li><li>res_agi does the same thing, only for the AGI interface</li><li>res_speech registers engine bridges and passes commands down to the speech engine bridges. This is the API that other things in Asterisk use to manipulate a speech recognition engine.</li>
<li>res_speech_lumenvox/res_cepstral are speech engine bridges that register themselves with res_speech and interface to those speech recognition engines. They do the actual work: telling the speech recognition engine when to load the appropriate grammar, handling the start of audio being fed to the engine, etc.<br>
</li></ul></div><div>The res_speech module does have the capability to indicate DTMF to the engine bridges. Currently, this only happens from the SpeechBackground dialplan application. If a user presses a DTMF key, that DTMF is relayed directly to the engine interface for processing. It's a relatively simple call (ast_speech_dtmf) which passes a DTMF frame down to the engine. Whether or not the bridges actually do anything with it is up to them.<br>
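<br>To make the layering a bit more concrete, here is a rough Python sketch of the register-and-relay pattern described above. It is only a model of the idea, not Asterisk code: SpeechRegistry, DtmfAwareBridge, AudioOnlyBridge, register, relay_dtmf and on_dtmf are all hypothetical names made up for illustration, and the real res_speech API is in C and looks different.<br>
<pre>
```python
# Hypothetical model of the registration/relay pattern. None of these
# names are Asterisk APIs; they only illustrate the layering.


class SpeechRegistry:
    """Stands in for the generic layer: keeps engine bridges by name."""

    def __init__(self):
        self._bridges = {}

    def register(self, name, bridge):
        # Each engine bridge registers itself once under a unique name.
        if name in self._bridges:
            raise ValueError("engine already registered: " + name)
        self._bridges[name] = bridge

    def relay_dtmf(self, name, digit):
        """Pass a DTMF digit down to the named bridge.

        Returns True if the bridge has a DTMF handler and was called,
        False if the bridge has none (the digit is simply dropped).
        """
        bridge = self._bridges[name]
        handler = getattr(bridge, "on_dtmf", None)
        if handler is None:
            return False
        handler(digit)
        return True


class DtmfAwareBridge:
    """A bridge that chooses to do something with DTMF: it collects digits."""

    def __init__(self):
        self.digits = []

    def on_dtmf(self, digit):
        self.digits.append(digit)


class AudioOnlyBridge:
    """A bridge that provides no DTMF handler at all."""


registry = SpeechRegistry()
dtmf_bridge = DtmfAwareBridge()
registry.register("dtmf_aware", dtmf_bridge)
registry.register("audio_only", AudioOnlyBridge())
```
</pre>
In this sketch, relaying a digit to "dtmf_aware" appends it to that bridge's list, while relaying the same digit to "audio_only" is a no-op. That mirrors the point above: the generic layer only forwards the key press, and each bridge decides what, if anything, to do with it.<br>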
<br></div><div>What speech recognition engine does mod_rayo/Chris interface to? I think Ben and/or Chris mentioned it at Adhearsion Conf - but this may be as straightforward as writing a bridge to that particular engine and passing the DTMF through to it, as well as deciding whether (and how) there's a better way to interface with a speech engine through AMI/ARI.</div>
<div><br> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">
<div><br></div><div>It is interesting to me in order to simplify the implementation of Adhearsion atop Asterisk, since right now we have this in Ruby based on AMI DTMF events. Is there any appetite among the Asterisk core team to investigate the addition of this to Asterisk core or as a module?</div>
</div></blockquote><div><br></div><div>Yes!<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">
<div><br></div><div>If this gets agreement in principle, I'd love to talk more thoroughly about the kind of API that would be most useful for us, and how we can move forward with specification and implementation.</div>
</div>
<br clear="all"></blockquote><div><br></div><div>And I agree in principle :-)<br></div><div> </div></div>Matt<br><br></div><div class="gmail_extra">-- <br><div dir="ltr"><div>Matthew Jordan<br></div><div>Digium, Inc. | Engineering Manager</div>
<div>445 Jan Davis Drive NW - Huntsville, AL 35806 - USA</div><div>Check us out at: <a href="http://digium.com" target="_blank">http://digium.com</a> & <a href="http://asterisk.org" target="_blank">http://asterisk.org</a></div>
</div>
</div></div>