[Asterisk-Dev] Like a Heartbeat for *

Sun Sep 14 17:04:43 MST 2003

>Hello,
>
>>  two small scripts to verify that * runs ok (doing ps -ax, querying * thru
>>  manager 5038, etc). Now, I think that it's time to build a unique heartbeat
>>  application, isn't it?
>
>I think making * reliable is an important point. But there are 
>different sorts of things to probe in *:
>-channels are the most important one I think (probe that * is 
>running and is able to accept calls through a sip/iax or h323 
>channel, for example)
>-another important thing is to be sure that all applications are 
>working (the other day I had the whole system down due to too many 
>open file handles, so none of the AGI scripts were working)
>-another one is to probe whether the failure is caused by one of the 
>providers (a link like an E1 or the internet provider is down...).
>
>For the moment I don't know if the system has to be made from the 
>inside of *, or if it should be an outside program, but some probes 
>can only be made from the inside, like checking the apps.
>
>Miro

I disagree that any testing needs to be done from within Asterisk. 
This is an opinion, but here's why I say that:

A) If your application is primarily the connection of calls between 
point A and point B, then you could create an automatic dialing 
program that checks for a tone between points A and B, as if it 
were a "normal" dialing customer.  Measure call completion via some 
out-of-band IP-based tool - perhaps fire off a UDP packet at the 
other end, or send a file, or something.  You might even have a 
remote phone ringer hooked up to a switch to a d/a converter - make 
it as fancy or as simple as you want.  If a regularly scheduled call 
fails, set off an alarm.
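
To make the out-of-band part of A concrete, here is a rough Python 
sketch of the "fire off a UDP packet at the other end" idea.  It is 
only an illustration: the function names, the payload format and the 
timeout values are all made up for this example and are not part of 
Asterisk.  The far end would call send_call_marker() when the test 
call arrives; the monitor waits for that marker and alarms if it 
never shows up.

```python
import socket
import time


def send_call_marker(host, port, payload):
    """Far end: report over UDP that the test call arrived (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(payload, (host, port))


def wait_for_call_marker(sock, expected, timeout):
    """Monitor: return True if a datagram equal to `expected` arrives on the
    already-bound UDP socket `sock` within `timeout` seconds, else False.
    Datagrams with any other content are ignored."""
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        sock.settimeout(remaining)
        try:
            data, _addr = sock.recvfrom(1024)
        except socket.timeout:
            return False
        if data == expected:
            return True
```

The monitor would bind a UDP socket, place the scheduled test call 
by whatever means you like, then call wait_for_call_marker() with a 
marker unique to that call.  A False return is the "set off an 
alarm" case.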

B) If your application is primarily a database or scripted 
application that is "internal" to Asterisk, then create triggers 
inside of your scripts/dialplan that communicate with your monitoring 
system at certain points.  Then, create a "dummy" user like in method 
A that calls in to your production system and performs certain tasks 
at a scheduled time.  If the triggers are not being fired while your 
dummy user is going through the system, then you have a problem. 
Set off an alarm.
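
The monitoring side of B can be sketched in a few lines of Python. 
This assumes each trigger reports a checkpoint name to the monitor; 
how the dialplan or script delivers those names (UDP packet, HTTP 
request, database row) is left open here, and the checkpoint names 
and function names below are invented for the example.

```python
def missing_triggers(expected, fired):
    """Return the expected checkpoints that never fired during one test
    call, in the order they were expected."""
    seen = set(fired)
    return [name for name in expected if name not in seen]


def check_test_call(expected, fired):
    """Summarise one dummy-user run: "OK" if every expected trigger fired,
    otherwise an alarm message naming the missing checkpoints."""
    missing = missing_triggers(expected, fired)
    if missing:
        return "ALARM: no trigger from " + ", ".join(missing)
    return "OK"
```

After each scheduled dummy call, the monitor would pass in the list 
of checkpoints it expected and the list it actually received.  The 
first missing name also tells you roughly where in the script the 
call died.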

The nice thing about A and B is that you can use Asterisk to perform 
a large portion of the testing, since Asterisk can mimic a standard 
user fairly accurately.  The only thing it can't do is detect certain 
responses on the RTP channel to verify that a call sounds "good". 
This would require some sort of MOS toolkit at each endpoint. 
(Alternately: RTCP anyone?  Anyone?  Anyone?  Bueller?)

While I understand that a fully integrated "monitoring" system 
inside of Asterisk sounds good, I very rarely find self-monitoring 
applications that don't melt down in a way that defeats their own 
self-monitoring.  External measurement from the user's perspective 
is always a better way to monitor.

Now, if you'd asked about "measurement" of values within Asterisk, 
that is a different story...

JT