[Asterisk-Users] Check and restart script..

Garry Adkins gpa2 at netacs.net
Thu Sep 25 19:53:29 MST 2003


> I still feel that some kind of heartbeat check on asterisk itself would be
> best...  Does anything like that exist now?  Something that an external
> process can check with asterisk and verify that all FXS/FXO/IAX2?/SIP/H323
> processes are running normally?

I agree...  As a unix support person at work, I find that I have to write
these types of watchdogs often...  Sometimes an application will partially
fail, or fail but not exit, ending up as some kind of zombie.  (I've tried
checking ps -auxw, but it's not smart enough to see that a program has
hung...  and meanwhile your load average is now about 80...)

I generally do two things: try to implement a "heartbeat" of sorts, and
also watch for the load average running very high.  The latter just pages
me, as the program can still be running, just weirdly slow.
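
For the load-average half, here is a minimal sketch in Python (my choice of
language, nothing in this thread uses it).  The function name load_is_high
and the threshold of 20 are made up for illustration; os.getloadavg() is
the standard-library call and is only available on unix-like systems.
What you do when it returns True (page someone, send mail) is left out.

```python
import os


def load_is_high(threshold=20.0, loadavg=None):
    """Return True when the 5-minute load average exceeds `threshold`.

    `loadavg` may be passed in as a (1min, 5min, 15min) tuple for testing;
    otherwise the current values are read from the system.
    """
    one, five, fifteen = os.getloadavg() if loadavg is None else loadavg
    # The 5-minute figure smooths out short spikes that are not worth a page.
    return five > threshold


if __name__ == "__main__":
    if load_is_high():
        print("load average is high, someone should take a look")
```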

Maybe this is a dumb idea, but here's what sprang to mind....

1) Generate a call by dropping a file in the outbound directory.
2) It calls a special extension that is only available in its own context
(i.e. no one else can call it).
3) The extension calls an AGI script which resets a watchdog timer (maybe
touches a file).
4) If the call doesn't come in within a certain amount of time (i.e.
timestamp > 5 mins old), the watchdog restarts asterisk, killing it with a
vengeance if necessary (-9).
5) Pause for 2 or 3x the loop time (this keeps it from getting restarted
while it's still coming back up, if it takes a while).
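
A rough Python sketch of the two watchdog-side pieces above: dropping the
call file and checking how old the timestamp is.  Everything named here is
an assumption for illustration: the spool path, the heartbeat file path,
and the "watchdog" context with its "heartbeat" and "touch" extensions,
which would have to exist in your own dialplan.  The AGI script that
touches the file is not shown.

```python
import os
import tempfile
import time

# Assumed locations; adjust for your install.
SPOOL_DIR = "/var/spool/asterisk/outgoing"
HEARTBEAT_FILE = "/var/run/asterisk-heartbeat"

# Hypothetical call file: the channel, context and extension are placeholders.
CALL_FILE = (
    "Channel: Local/heartbeat@watchdog\n"
    "Context: watchdog\n"
    "Extension: touch\n"
    "Priority: 1\n"
    "MaxRetries: 0\n"
    "WaitTime: 30\n"
)


def drop_call_file(spool_dir=SPOOL_DIR, body=CALL_FILE):
    """Write the call file next to the spool, then move it in with a rename."""
    # Stage in the parent directory so the rename stays on one filesystem
    # and asterisk never sees a half-written file.
    staging = os.path.dirname(os.path.abspath(spool_dir))
    fd, tmp_path = tempfile.mkstemp(suffix=".call", dir=staging)
    with os.fdopen(fd, "w") as f:
        f.write(body)
    final_path = os.path.join(spool_dir, os.path.basename(tmp_path))
    os.rename(tmp_path, final_path)
    return final_path


def heartbeat_is_stale(path=HEARTBEAT_FILE, max_age=300, now=None):
    """True if the file is missing or was last touched over max_age seconds ago."""
    now = time.time() if now is None else now
    try:
        return now - os.path.getmtime(path) > max_age
    except OSError:
        return True
```

Note that mkstemp creates the file readable only by its owner, so this
assumes the watchdog runs as the same user as asterisk.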

While thinking about it, you could have a single program (run from init
perhaps) that:
1) Watches for the timestamp being older than x minutes
2) Generates a file to drop in the outbound directory every few minutes
3) Reads a config file on each loop so you can change the settings (i.e.
turn it off, lengthen the time of each loop)
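
The single-program version might look something like this in Python.  It
is only a sketch: the key=value config format, the setting names (enabled,
interval, max_age) and their defaults are all invented here, and the
actual "generate a call" and "restart asterisk" actions are passed in as
functions rather than spelled out.  After a restart it waits 3x the loop
time and resets the timestamp, so asterisk gets a full window to check in
again.

```python
import os
import time

# Invented setting names and defaults; values are seconds where applicable.
DEFAULTS = {"enabled": "yes", "interval": "120", "max_age": "300"}


def load_settings(path):
    """Parse key=value lines ('#' starts a comment); fall back to DEFAULTS."""
    settings = dict(DEFAULTS)
    try:
        with open(path) as f:
            for line in f:
                line = line.split("#", 1)[0].strip()
                if "=" in line:
                    key, value = line.split("=", 1)
                    settings[key.strip()] = value.strip()
    except OSError:
        pass  # no config file: run with the defaults
    return settings


def touch(path):
    """Create the file if needed and set its mtime to now."""
    with open(path, "a"):
        os.utime(path, None)


def run(config_path, heartbeat_path, send_heartbeat, restart,
        sleep=time.sleep, max_loops=None):
    """Main loop; `send_heartbeat` and `restart` are caller-supplied callables."""
    if not os.path.exists(heartbeat_path):
        touch(heartbeat_path)  # start the clock when the watchdog starts
    loops = 0
    while max_loops is None or loops < max_loops:
        loops += 1
        cfg = load_settings(config_path)  # re-read on every pass
        interval = float(cfg["interval"])
        if cfg["enabled"].lower() == "yes":
            age = time.time() - os.path.getmtime(heartbeat_path)
            if age > float(cfg["max_age"]):
                restart()
                sleep(3 * interval)    # let it come back up undisturbed
                touch(heartbeat_path)  # reset the timer after the restart
            send_heartbeat()
        sleep(interval)
```

Setting "enabled = no" in the config file turns the watchdog into a no-op
on its next pass without having to kill it.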

It's important to be able to stop the watchdog easily, as it's hard to do
maintenance when the watchdog keeps bringing the server back up....
:-)

-G




