[asterisk-dev] Methodologies for validating dialplan
Nikša Baldun
it at voxdiversa.hr
Tue Jan 4 16:13:13 CST 2022
Yeah, sorry, it's just that the experience is fresh in my mind, and I
wanted to help anyone avoid the pain I've been through. Whatever
anyone's preference is, I don't think the difference in performance can be
disputed. It's not really C vs Python, it's the dialplan interpreter vs
Python. Asterisk generates a costly and unavoidable Newexten event for
every executed line of dialplan. If there are a lot of them, which
obviously there will be in a large dialplan, performance suffers greatly.
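
For anyone unfamiliar with Fast AGI, here is a minimal sketch of the
server side in Python. This is not my production code: the helper names
(read_agi_env, agi_command, CallHandler, serve), the DIAL_TARGET variable
and the PJSIP routing rule are all made up for illustration. Only the
protocol framing is real: Asterisk sends "agi_key: value" lines ending
with a blank line, then answers each command with one response line.

```python
import io
import socketserver


def read_agi_env(rfile):
    """Read 'agi_key: value' header lines up to the terminating blank line."""
    env = {}
    while True:
        raw = rfile.readline()
        line = raw.decode("utf-8", "replace").strip()
        if not raw or not line:
            break
        key, _, value = line.partition(":")
        env[key.strip()] = value.strip()
    return env


def agi_command(rfile, wfile, command):
    """Send one AGI command and return the raw response line from Asterisk."""
    wfile.write((command + "\n").encode("utf-8"))
    wfile.flush()
    return rfile.readline().decode("utf-8", "replace").strip()


class CallHandler(socketserver.StreamRequestHandler):
    """One instance handles one AGI request (hypothetical call logic)."""

    def handle(self):
        env = read_agi_env(self.rfile)
        # Hypothetical routing rule: dial the called extension over PJSIP.
        target = "PJSIP/" + env.get("agi_extension", "")
        agi_command(self.rfile, self.wfile,
                    'SET VARIABLE DIAL_TARGET "%s"' % target)


def serve(host="127.0.0.1", port=4573):
    """Run the server; 4573 is the default Fast AGI port."""
    with socketserver.ThreadingTCPServer((host, port), CallHandler) as srv:
        srv.serve_forever()
```

The dialplan then shrinks to a single call such as
AGI(agi://127.0.0.1:4573/route), where the "route" path is arbitrary in
this sketch, followed by a Dial using the variable that was set.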
On 04. 01. 2022. 22:15, asterisk at phreaknet.org wrote:
> Thanks for the advice - however, I personally like the dialplan and
> don't intend to stop using it. The dialplan/AGI/AEL/Lua (and now ARI)
> "config war" goes back a long time now, and I don't think it'll ever
> get resolved. That said, the vast majority of people *do* use
> extensions.conf dialplan, and I like it fine as a general approach.
> That's just my opinion, though.
>
> There *are* obvious limitations to the dialplan, which is what this
> helps to address - not to make the workflow perfect, but better. I'm
> not sure using AGI would really get around the underlying problem
> here... and C performs a lot better than Python does.
>
> On 1/4/2022 4:01 PM, Nikša Baldun wrote:
>> Hi,
>>
>> I apologize for not commenting on the actual issue. However, after
>> having the experience of writing a complex dialplan, I feel strongly
>> compelled to say that it shouldn't be done at all. Any non-trivial
>> call flow should be written in Fast AGI. I can't see any upside to
>> using extensions.conf or AEL. Using a real programming language is
>> considerably easier, faster and more powerful; all the necessary
>> tools already exist and, most importantly, execution is significantly
>> faster. In my case, after rewriting my dialplan in Python, call
>> preparation time fell from 2.5 seconds to a mere 50 milliseconds.
>>
>> On 04. 01. 2022. 20:53, asterisk at phreaknet.org wrote:
>>> Hi, folks,
>>>
>>> Hope everyone's year is off to a good start. It was suggested on
>>> one of my code reviews to post here for discussion, so here it is:
>>>
>>> The PBX core, when it parses the dialplan on reload, catches a small
>>> number of syntax errors, such as forgetting a trailing ) or priority
>>> number, things like that.
>>>
>>> However, there are a lot of dialplan problems that represent
>>> potentially valid syntax that will cause an error at runtime, such
>>> as branching to somewhere that doesn't exist. The dialplan will
>>> reload with no errors, since there isn't a syntax issue, but at
>>> runtime, the call will fail (and most likely crash). I have found over
>>> the years that many of these were simple typos or issues that
>>> were easily fixed but took a lot of time to find using only the
>>> "test, test, test" approach. Another common grievance I hear from time
>>> to time about the dialplan is that most issues are caught at runtime,
>>> not at "compile time" (i.e. dialplan reload).
>>>
>>> One thing I've done to catch typos and syntax errors is run some
>>> scripts that try to validate my dialplan for me by using a number of
>>> regex-based scripts which scan the dialplan. Among other things,
>>> this finds branches to places that don't exist, unused/dead code in
>>> the dialplan that isn't referenced anywhere, attempts to play audio
>>> files that don't exist, etc. In doing so, we can catch an even
>>> greater percentage of these kinds of issues in advance, rather than
>>> sitting around and waiting for a fallthrough at runtime, then
>>> remedying it after it has already caused a problem.
>>>
>>> It works *okay* - this has helped A LOT in finding these problems
>>> before they are encountered at runtime, and finding problems I
>>> didn't even know existed - but it is *very* slow, taking about 30
>>> seconds to run on my dialplan (a few tens of thousands of lines).
>>>
>>> To try to improve on this, I wrote a patch that adds the CLI
>>> commands 'dialplan analyze fallthrough' and 'dialplan analyze
>>> audio'. It scans the dialplan using Asterisk APIs and finds
>>> Goto/GotoIf/Gosub/GosubIf application calls that try to access a
>>> nonexistent location in the dialplan, and
>>> Playback/ControlPlayback/Read calls that try to play a file that
>>> doesn't exist. Instead of taking half a minute, it's essentially
>>> instantaneous. You can take a look at the patch/apply it from here:
>>> https://gerrit.asterisk.org/c/asterisk/+/17719
>>>
>>> There are obvious limitations to doing this; if variables are used
>>> in these calls, then it's very difficult - maybe impossible - to
>>> determine if something will fail just by crawling the config, so at
>>> the moment I ignore calls that contain variables in the relevant
>>> area. As such, there will be false negatives, but the goal is to not
>>> have false positives, and hopefully expose maybe the majority of
>>> issues that could be caught in advance in this manner.
>>>
>>> Right now, the patch adds some commands to the PBX core, which Josh
>>> suggested might not be the best way to do this additional level of
>>> verifying the dialplan and trying to preemptively find issues with
>>> it. For one, it relies on knowing the usage of different
>>> applications, not all of which are PBX builtins. It might be safe to
>>> say that the way to parse "Goto" or "Playback" in this case will not
>>> change. A suggestion was to expose a way for modules to define how
>>> they could be verified.
>>>
>>> I don't have any specific thoughts at the moment about how to
>>> proceed, but I'm interested to hear any thoughts on what kind of
>>> architecture or approach here might make sense. Something to
>>> consider is that these validations may touch multiple different
>>> modules, maybe multiple times for the same module - and somehow this
>>> needs to be exposed to the PBX core for processing. For instance,
>>> the fallthrough check looks at Goto and Gosub, which are in
>>> completely different modules. Additionally, this is focused on the
>>> dialplan, meaning that running the rules in the module itself
>>> probably doesn't make any sense (but defining them there somehow
>>> might). However, ultimately there is an opportunity to find a lot
>>> of these issues in advance and improve the user experience,
>>> reduce frustration, etc.
>>>
>>> Thanks!
>>>
>>>
>>
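
For reference, the regex-based fallthrough check described in the
original post could look roughly like the sketch below. It is not the
poster's script and not the code in the Gerrit patch. The function name
find_dead_branches is hypothetical, and the sketch assumes a lot: it
only handles the three-argument Goto/Gosub(context,exten,priority) form,
ignores priorities and labels, skips GotoIf/GosubIf, and treats every ;
as a comment start. As in the post, any branch containing a variable is
ignored, and contexts with includes, switches or pattern extensions are
only checked for existence, so it should err toward false negatives.

```python
import re

CONTEXT_RE = re.compile(r"^\[([^\]]+)\]")
LINE_RE = re.compile(r"^(exten|same)\s*=>?\s*(.*)$", re.IGNORECASE)
# Only the literal three-argument form: Goto(context,exten,priority...)
BRANCH_RE = re.compile(
    r"^(?:Goto|Gosub)\(([^,()]+),([^,()]+),([^,()]+)", re.IGNORECASE)


def find_dead_branches(dialplan_text):
    """Return (line number, message) for branches to missing locations."""
    contexts = {}   # context name -> set of literal extensions
    opaque = set()  # contexts whose extensions cannot be listed statically
    branches = []   # (line number, target context, target extension)
    current = None
    for lineno, raw in enumerate(dialplan_text.splitlines(), start=1):
        line = raw.split(";", 1)[0].strip()  # crude comment stripping
        if not line:
            continue
        m = CONTEXT_RE.match(line)
        if m:
            current = m.group(1)
            contexts.setdefault(current, set())
            continue
        if current is None:
            continue
        if line.lower().startswith(("include", "switch")):
            opaque.add(current)
            continue
        m = LINE_RE.match(line)
        if not m:
            continue
        rest = m.group(2)
        if m.group(1).lower() == "exten":
            exten, _, rest = rest.partition(",")
            exten = exten.strip()
            if exten.startswith("_"):
                opaque.add(current)  # pattern match, give up on this context
            else:
                contexts[current].add(exten)
        _, _, app = rest.partition(",")  # drop the priority
        b = BRANCH_RE.match(app.strip())
        if b and "$" not in b.group(0):  # skip targets built from variables
            branches.append((lineno, b.group(1).strip(), b.group(2).strip()))
    problems = []
    for lineno, ctx, exten in branches:
        if ctx not in contexts:
            problems.append((lineno, "no such context: " + ctx))
        elif ctx not in opaque and exten not in contexts[ctx]:
            problems.append((lineno, "no extension %s in [%s]" % (exten, ctx)))
    return problems
```

Running it over the text of extensions.conf returns a list of line
numbers and messages; an empty list means nothing was found, not that
the dialplan is correct.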