<html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">Excuse my somewhat tardy reply to this thread, but since you brought up AMD:<br><div apple-content-edited="true">
<div style="color: rgb(0, 0, 0); font-family: Helvetica; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; border-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; border-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; border-spacing: 0px; 
-webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; border-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><div><br></div></div></div></span></div></span></div></span></div></span></div></div><div><div>On Jun 16, 2014, at 11:47 AM, Ben Langfeld <<a href="mailto:ben@langfeld.me">ben@langfeld.me</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><div class="h5"><div class="gmail_extra"><div class="gmail_quote">On Sun, Jun 15, 2014 at 9:24 PM, Krandon<span class="Apple-converted-space"> </span><span dir="ltr"><<a href="mailto:krandon.bruse@gmail.com" target="_blank">krandon.bruse@gmail.com</a>></span><span class="Apple-converted-space"> </span>wrote:<br><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div><span style="font-size: 14px;">Hello Asterisk friends,</span></div><div><span style="font-size: 14px;"><br></span></div><div><span style="font-size: 
14px;">I am currently interfacing with Asterisk through ARI and loving the experience so far. I have successfully originated calls and dumped them into my Stasis app. I am trying to figure out what the best way is to send a channel into an Application. The current architecture for /channels/{id}/play works well for the majority of my app, but I am running into a block figuring out how to interact with Asterisk dialplan applications.</span></div><div><span style="font-size: 14px;"><br></span></div><div><span style="font-size: 14px;">To give an example - I submit an originate to go to SIP/vendor/phoneNumber - with the other leg going to App: myStasisApp, {"soundFile":"blah"}. That works fine (with the proper quote escaping). Now my Stasis app has received the channelID to which we can do a lot of neat stuff. Say I play a sound to the user but then want to call the app WaitForSilence. What's the best way to do this? I may be misinterpreting the intended use of both Stasis and ARI - but I am curious to see what your thoughts are.</span></div><div><span style="font-size: 14px;"><br></span></div><div><span style="font-size: 14px;">Also, for the stasis app to get a list of arguments, I am passing it through as JSON. 
So far that is working fine - but I wanted to see if there was a better way to get a list/array of app args to Stasis.</span></div><div><span style="font-size: 14px;"><br></span></div><div><span style="font-size: 14px;">Forgive me if there is an easy solution - through digging and poking the last few days, I have not been able to find the intended use case or even a use case.</span></div><div><span style="font-size: 14px;"><br></span></div><br></blockquote></div><br></div></div></div><div class="gmail_extra" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">Well, the solution for this just got added into the Asterisk 12 branch, and so it hasn't made it into a release yet. It should be coming soon in Asterisk 12.4.0.<br><br></div><div class="gmail_extra" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">The TALK_DETECT [1] function enables AMI/ARI events [2] [3] [4] [5] on a channel, such that a connected ARI application receives notifications over the WebSocket when a person starts/stops talking. This lets you asynchronously 'know' when both talking/silence has occurred - obviating the need for the WaitForSilence/WaitForNoise dialplan applications. 
Plus, because it is asynchronous, if you decide you don't *want* to wait for silence, you don't have to!<br><br></div><div class="gmail_extra" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">With a bit of manipulation, you could also construct AMD from this - but I'll admit that's a bit more challenging. I'd be interested in people's experiences with attempting to do that, and in whether an asynchronous "IS_HUMAN" detection function is needed.<br></div></blockquote><div><br></div><div>We are currently building an application that needs asynchronous AMD. Specifically, we are implementing LumenVox’s CPA product[1] and the use case is this:</div><div><br></div><div>* Reminder call is placed to recipient</div><div>* Recipient answers (don’t yet know if it is a human or a machine)</div><div>* Outgoing message begins to play</div><div>* If a human is detected, stop playback and connect to an agent</div><div>* If a machine is detected, keep playing back until…</div><div>* If a beep is detected, stop and restart playback</div><div><br></div><div>The only way to achieve this is to run an async speech recognizer while simultaneously playing output, which isn’t possible with Dialplan today, and would require a specialized app even if it were implemented that way. Instead, we are hoping to have a lower-level primitive to do signal detection and playback asynchronously.</div><div><br></div><div>In an ideal world, ARI would provide primitives for playback (file or TTS) and input (DTMF or ASR). 
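</div><div><br></div><div>As a rough sketch of the call flow listed above, the decision logic could look something like the following Python. This is not tied to any real ARI or LumenVox API: the Signal names, the ReminderCall class and the action strings are all made up for illustration, and it assumes detection results arrive as events while playback is still running.</div>
<pre>
```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Signal(Enum):
    """Hypothetical detection results delivered while audio is playing."""
    HUMAN = "human"
    MACHINE = "machine"
    BEEP = "beep"


@dataclass
class ReminderCall:
    """Tracks one outbound reminder call while its message is playing."""
    actions: List[str] = field(default_factory=list)
    playing: bool = False
    done: bool = False

    def answered(self) -> None:
        # Playback starts before we know who (or what) picked up.
        self.playing = True
        self.actions.append("start_playback")

    def on_signal(self, signal: Signal) -> None:
        # Detection results arrive asynchronously, during playback.
        if self.done or not self.playing:
            return
        if signal is Signal.HUMAN:
            # A person answered: stop the message and hand off to an agent.
            self.playing = False
            self.done = True
            self.actions.extend(["stop_playback", "connect_agent"])
        elif signal is Signal.BEEP:
            # Voicemail beep: restart so the whole message gets recorded.
            self.actions.extend(["stop_playback", "start_playback"])
        # Signal.MACHINE: nothing to do, keep playing and wait for a beep.
```
</pre>
<div>The point of the sketch is that on_signal() has to be callable while playback is in progress, which is exactly what a blocking dialplan application cannot offer.</div><div><br></div><div>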
Some more background from discussion related to our project, courtesy Ben Langfeld:</div><div><br></div><div><p style="box-sizing: border-box; margin: 15px 0px; color: rgb(51, 51, 51); font-family: Helvetica, arial, freesans, clean, sans-serif; font-size: 14px; line-height: 23.799999237060547px; background-color: rgb(255, 255, 255); position: static; z-index: auto;">The asynchronous example is more complex. While Adhearsion sees both the input and output components as being asynchronous, this is a fake facility provided by Punchblock to make Asterisk look like an async server when it is not. Both components are implemented atop synchronous Asterisk dialplan applications:</p><p style="box-sizing: border-box; margin: 15px 0px; color: rgb(51, 51, 51); font-family: Helvetica, arial, freesans, clean, sans-serif; font-size: 14px; line-height: 23.799999237060547px; background-color: rgb(255, 255, 255);">For output: <code style="box-sizing: border-box; font-family: Consolas, 'Liberation Mono', Menlo, Courier, monospace; font-size: 12px; margin: 0px; border: 1px solid rgb(221, 221, 221); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; border-bottom-left-radius: 3px; padding: 0px; background-color: rgb(248, 248, 248);">Playback()</code> or <code style="box-sizing: border-box; font-family: Consolas, 'Liberation Mono', Menlo, Courier, monospace; font-size: 12px; margin: 0px; border: 1px solid rgb(221, 221, 221); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; border-bottom-left-radius: 3px; padding: 0px; background-color: rgb(248, 248, 248);">MRCPSynth()</code><br style="box-sizing: border-box;">For input: <code style="box-sizing: border-box; font-family: Consolas, 'Liberation Mono', Menlo, Courier, monospace; font-size: 12px; margin: 0px; border: 1px solid rgb(221, 221, 221); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; 
border-bottom-left-radius: 3px; padding: 0px; background-color: rgb(248, 248, 248);">MRCPRecog()</code></p><p style="box-sizing: border-box; margin: 15px 0px; color: rgb(51, 51, 51); font-family: Helvetica, arial, freesans, clean, sans-serif; font-size: 14px; line-height: 23.799999237060547px; background-color: rgb(255, 255, 255);">This means that given the simplest approach to implementation discussed above, the output would be executed, followed by the input being queued and executed once the output had completed. If we were to swap the two, not only would we now have a coordination problem where we have to queue cancellation of the output to paper over the race condition introduced by potentially being asked to stop it before we have a handle on it, we would have the same blocking problem with <code style="box-sizing: border-box; font-family: Consolas, 'Liberation Mono', Menlo, Courier, monospace; font-size: 12px; margin: 0px; border: 1px solid rgb(221, 221, 221); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; border-bottom-left-radius: 3px; padding: 0px; background-color: rgb(248, 248, 248);">MRCPRecog()</code>.</p><p style="box-sizing: border-box; margin: 15px 0px; color: rgb(51, 51, 51); font-family: Helvetica, arial, freesans, clean, sans-serif; font-size: 14px; line-height: 23.799999237060547px; background-color: rgb(255, 255, 255);">So that rules out combining one of the UniMRCP dialplan applications with the <code style="box-sizing: border-box; font-family: Consolas, 'Liberation Mono', Menlo, Courier, monospace; font-size: 12px; margin: 0px; border: 1px solid rgb(221, 221, 221); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; border-bottom-left-radius: 3px; padding: 0px; background-color: rgb(248, 248, 248);">Playback()</code> application in this fashion. 
There are two other remaining solutions that come to mind:</p><ol class="task-list" style="box-sizing: border-box; padding: 0px 0px 0px 30px; margin: 15px 0px; color: rgb(51, 51, 51); font-family: Helvetica, arial, freesans, clean, sans-serif; font-size: 14px; line-height: 23.799999237060547px; background-color: rgb(255, 255, 255);"><li style="box-sizing: border-box;"><p style="box-sizing: border-box; margin: 15px 0px;"><strong style="box-sizing: border-box;">A prompt command</strong> to combine the output and input into a single dialplan application invocation (<code style="box-sizing: border-box; font-family: Consolas, 'Liberation Mono', Menlo, Courier, monospace; font-size: 12px; margin: 0px; border: 1px solid rgb(221, 221, 221); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; border-bottom-left-radius: 3px; padding: 0px; background-color: rgb(248, 248, 248);">MRCPRecog()</code> for native file playback, <code style="box-sizing: border-box; font-family: Consolas, 'Liberation Mono', Menlo, Courier, monospace; font-size: 12px; margin: 0px; border: 1px solid rgb(221, 221, 221); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; border-bottom-left-radius: 3px; padding: 0px; background-color: rgb(248, 248, 248);">SynthAndRecog()</code> for TTS). This avoids the problem of multiple dialplan applications blocking one another, but introduces a fresh one: these applications terminate output as soon as recognition completes (or earlier if barge-in is enabled). 
There is no opportunity to inject logic to filter the recognition result prior to terminating the output, nor do I think this would make sense.</p></li><li style="box-sizing: border-box;"><p style="box-sizing: border-box; margin: 15px 0px;"><strong style="box-sizing: border-box;">The Asterisk Speech API</strong> (<code style="box-sizing: border-box; font-family: Consolas, 'Liberation Mono', Menlo, Courier, monospace; font-size: 12px; margin: 0px; border: 1px solid rgb(221, 221, 221); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; border-bottom-left-radius: 3px; padding: 0px; background-color: rgb(248, 248, 248);">SpeechLoadGrammar()</code>, <code style="box-sizing: border-box; font-family: Consolas, 'Liberation Mono', Menlo, Courier, monospace; font-size: 12px; margin: 0px; border: 1px solid rgb(221, 221, 221); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; border-bottom-left-radius: 3px; padding: 0px; background-color: rgb(248, 248, 248);">SpeechActivateGrammar()</code>, <code style="box-sizing: border-box; font-family: Consolas, 'Liberation Mono', Menlo, Courier, monospace; font-size: 12px; margin: 0px; border: 1px solid rgb(221, 221, 221); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; border-bottom-left-radius: 3px; padding: 0px; background-color: rgb(248, 248, 248);">SpeechStart()</code>,<code style="box-sizing: border-box; font-family: Consolas, 'Liberation Mono', Menlo, Courier, monospace; font-size: 12px; margin: 0px; border: 1px solid rgb(221, 221, 221); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; border-bottom-left-radius: 3px; padding: 0px; background-color: rgb(248, 248, 248);">SpeechBackground()</code>, etc). 
If <code style="box-sizing: border-box; font-family: Consolas, 'Liberation Mono', Menlo, Courier, monospace; font-size: 12px; margin: 0px; border: 1px solid rgb(221, 221, 221); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; border-bottom-left-radius: 3px; padding: 0px; background-color: rgb(248, 248, 248);">SpeechBackground()</code> were truly asynchronous, this would be the obvious solution, but unfortunately it is not. <code style="box-sizing: border-box; font-family: Consolas, 'Liberation Mono', Menlo, Courier, monospace; font-size: 12px; margin: 0px; border: 1px solid rgb(221, 221, 221); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; border-bottom-left-radius: 3px; padding: 0px; background-color: rgb(248, 248, 248);">SpeechBackground()</code> actually sits in a loop, directing audio frames to the recognizer while simultaneously rendering frames of audio (the first option is a file path). The app does not return until recognition has completed, so it cannot be combined with <code style="box-sizing: border-box; font-family: Consolas, 'Liberation Mono', Menlo, Courier, monospace; font-size: 12px; margin: 0px; border: 1px solid rgb(221, 221, 221); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; border-bottom-left-radius: 3px; padding: 0px; background-color: rgb(248, 248, 248);">Playback()</code>. Upon recognition completion, the output will be terminated, regardless of the recognition result, so this suffers the same problem as Rayo Prompt. 
It is also not possible to use any other output renderer, such as a TTS engine via MRCP.</p></li></ol><div><blockquote style="box-sizing: border-box; margin: 15px 0px; border-left-width: 4px; border-left-style: solid; border-left-color: rgb(221, 221, 221); padding: 0px 15px; color: rgb(119, 119, 119); font-family: Helvetica, arial, freesans, clean, sans-serif; font-size: 14px; line-height: 23.799999237060547px; background-color: rgb(255, 255, 255);"><ul class="task-list" style="box-sizing: border-box; padding: 0px 0px 0px 30px; margin: 0px;"><li style="box-sizing: border-box;">Can we implement Asterisk/LumenVox CPA in a way that is compatible with the adhearsion-cpa controller methods API?</li></ul></blockquote><p style="box-sizing: border-box; margin: 15px 0px; color: rgb(51, 51, 51); font-family: Helvetica, arial, freesans, clean, sans-serif; font-size: 14px; line-height: 23.799999237060547px; background-color: rgb(255, 255, 255); position: static; z-index: auto;">The problems stated above leave us with only one option: extra capability must be introduced to Asterisk, either to handle simultaneous dialplan applications or to provide a true async version of <code style="box-sizing: border-box; font-family: Consolas, 'Liberation Mono', Menlo, Courier, monospace; font-size: 12px; margin: 0px; border: 1px solid rgb(221, 221, 221); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; border-bottom-left-radius: 3px; padding: 0px; background-color: rgb(248, 248, 248);">SpeechBackground()</code>. The viability of this is something that must be discussed with the Asterisk project / Digium. 
Note that FreeSWITCH already has this capability, but would also need less invasive changes to cope with LumenVox CPA as stated above; a far more approachable task.</p><p style="box-sizing: border-box; margin: 15px 0px; color: rgb(51, 51, 51); font-family: Helvetica, arial, freesans, clean, sans-serif; font-size: 14px; line-height: 23.799999237060547px; background-color: rgb(255, 255, 255); position: static; z-index: auto;">In short, the adhearsion-cpa API can be honoured for the synchronous detection case trivially. It cannot be honoured for the async case, nor can any equivalent alternative be introduced, without changes to Asterisk.</p></div></div><div><br></div><div>[1]: <a href="http://www.lumenvox.com/products/speech_engine/cpa.aspx">http://www.lumenvox.com/products/speech_engine/cpa.aspx</a></div><div><br></div><div>/BAK/</div><div><br></div><div><div><div style="orphans: 2; text-align: -webkit-auto; widows: 2; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><span class="Apple-style-span" style="border-collapse: separate; border-spacing: 0px;"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><span class="Apple-style-span" style="border-collapse: separate; text-align: -webkit-auto; border-spacing: 0px;"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><span class="Apple-style-span" style="border-collapse: separate; text-align: -webkit-auto; border-spacing: 0px;"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><span class="Apple-style-span" style="border-collapse: separate; border-spacing: 0px;"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><div><div>-- </div><div>Ben Klang</div><div>Principal/Technology Strategist, Mojo Lingo</div><div><a 
href="mailto:bklang@mojolingo.com">bklang@mojolingo.com</a></div><div>+1.404.475.4841</div><div><br></div><div>Mojo Lingo -- <i>Voice applications that work like magic</i></div><div><a href="http://mojolingo.com/">http://mojolingo.com</a></div></div><div>Twitter: @MojoLingo</div><div><br></div></div></span></div></span></div></span></div></span></div></div></div><br><blockquote type="cite"><div class="gmail_extra" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br>[1]<span class="Apple-converted-space"> </span><a href="https://wiki.asterisk.org/wiki/display/AST/Asterisk+12+Function_TALK_DETECT" target="_blank">https://wiki.asterisk.org/wiki/display/AST/Asterisk+12+Function_TALK_DETECT</a><br>[2]<span class="Apple-converted-space"> </span><a href="https://wiki.asterisk.org/wiki/display/AST/Asterisk+12+ManagerEvent_ChannelTalkingStart" target="_blank">https://wiki.asterisk.org/wiki/display/AST/Asterisk+12+ManagerEvent_ChannelTalkingStart</a><br>[3]<span class="Apple-converted-space"> </span><a href="https://wiki.asterisk.org/wiki/display/AST/Asterisk+12+ManagerEvent_ChannelTalkingStop" target="_blank">https://wiki.asterisk.org/wiki/display/AST/Asterisk+12+ManagerEvent_ChannelTalkingStop</a><br>[4]<a href="https://wiki.asterisk.org/wiki/display/AST/Asterisk+12+REST+Data+Models#Asterisk12RESTDataModels-ChannelTalkingStarted" target="_blank">https://wiki.asterisk.org/wiki/display/AST/Asterisk+12+REST+Data+Models#Asterisk12RESTDataModels-ChannelTalkingStarted</a><br>[5]<a href="https://wiki.asterisk.org/wiki/display/AST/Asterisk+12+REST+Data+Models#Asterisk12RESTDataModels-ChannelTalkingFinished" 
target="_blank">https://wiki.asterisk.org/wiki/display/AST/Asterisk+12+REST+Data+Models#Asterisk12RESTDataModels-ChannelTalkingFinished</a><br><br></div><div class="gmail_extra" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">Matt<span class="HOEnZb"><font color="#888888"><br><br></font></span></div><span class="HOEnZb" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><font color="#888888">-- </font></span></blockquote></div><br></body></html>