[asterisk-users] Run-away Asterisk

Yuan LIU yliu11 at hotmail.com
Wed Feb 28 22:13:41 MST 2007


>From: Tzafrir Cohen <tzafrir.cohen at xorcom.com>
>Date: Wed, 28 Feb 2007 22:28:32 +0200
>
>On Wed, Feb 28, 2007 at 10:56:14AM -0800, Yuan LIU wrote:
> > After testing some AGI's, I noticed several extra Asterisk processes.
>
>An agi script is run by the same user running asterisk, but is not
>"asterisk": it is a different program. What is the command name on those
>scripts?

Those were asterisk processes, run as root. (I don't use a dedicated user; I 
should know better :-).

> > They
> > are not zombies, but can't be killed by safe_asterisk.
>
>safe_asterisk attempts (poorly) to guard asterisk. Not really to guard
>all of its child processes.
>
> > Nor will they die
> > when CLI issues stop now.  Then I read that each AGI spawns a separate
> > Asterisk process.
>
>Huh? AGI? FastAGI?

Plain AGI.  So Asterisk does not spawn children to handle AGI?  I had read 
some comments about Trixbox being all AGI and spawning Asterisk processes all 
the time.

> > But all my AGI calls have apparently completed
> > successfully.  So there should be no reason for them to hang there.
> >
> > Several questions:
> >
> > 1) Under what conditions will an AGI hang a process? (My test scripts are
> > pretty simple, almost directly derived from agi-test.agi.)
>
>An AGI may be an arbitrary subprocess. This subprocess can do basically
>everything. If it really wants to, (or if it misbehaves in the "right"
>way) it won't die.

Here's the script I tested:
#!/usr/bin/perl
# Receives a text string from the channel and assigns it to the channel
# variable named by $ARGV[0]; times out after $ARGV[1] milliseconds.

# USAGE: AGI(recvtext[,varname[,timeout]])
# default varname=recvtext, timeout=5000 ms

use strict;

$|=1;

# Setup some variables
my %AGI;

while(<STDIN>) {
        chomp;
        last unless length($_);
        if (/^agi_(\w+)\:\s+(.*)$/) {
                $AGI{$1} = $2;
        }
}

# print STDERR "AGI Environment Dump:\n";
# foreach my $i (sort keys %AGI) {
#       print STDERR " -- $i = $AGI{$i}\n";
# }

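# Variable name and timeout come from the AGI arguments, with defaults.
my $output  = $ARGV[0] ? $ARGV[0] : "recvtext";
my $timeout = $ARGV[1] ? $ARGV[1] : 5000;       # default 5 sec
# print STDERR "3.  Testing 'receive text' in $timeout...";
print "RECEIVE TEXT $timeout\n";
my $res = <STDIN>;
$res = '' unless defined $res;          # EOF, e.g. the channel hung up
chomp $res;
if ($res =~ /^200/) {
        # Test the match itself: $1 is not reset when a match fails.
        if ($res =~ /result=(-?\d+)/) {
                print STDERR "PASS ($1)\n";
                # The received text, if any, follows in parentheses.
                my $var = ($res =~ /\(([^)]+)/) ? $1 : "";
                print "SET VARIABLE $output \"$var\"\n";
                my $ack = <STDIN>;      # consume the reply before exiting
        } else {
                print STDERR "FAIL ($res)\n";
                exit 1;
        }
} else {
        print STDERR "FAIL (unexpected result '$res')\n";
        exit 2;
}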
# end of script

I didn't notice THIS script hanging.  The tests were some time before I read 
the comments about Trixbox, and I can't reliably reproduce the problem now.  
During the tests I was changing timing parameters, so some scripts had to go 
through timeouts in RECEIVE TEXT (from 4 sec to 6 sec) and ended up receiving 
nothing.  But even in those cases, I could see from the CLI that they completed.

> > 2) How to detect run-away processes under 2.4 kernels?  In this kernel,
> > each thread clusters process space and it's very difficult to distinguish
> > them without killing the main process.
>
>hmm, please attach the output of:
>
>ps auxww | grep asterisk

Here's one snapshot:
root     18234  0.0  0.2  2252 1052 ?        S    Feb27   0:00 /bin/sh /usr/sbin/safe_asterisk
root     18236  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18237  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18239  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18240  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18241  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18242  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18243  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18244  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18245  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18246  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18247  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18248  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18249  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18250  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18251  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18252  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18253  0.0  1.4 16608 7176 ?        S    Feb27   0:44 /usr/sbin/asterisk -vvvg
root     18254  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18255  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18256  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18257  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18258  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18259  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18260  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18261  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18262  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg
root     18263  0.0  1.4 16608 7176 ?        S    Feb27   0:00 /usr/sbin/asterisk -vvvg

Again, I'm not sure whether this list contains any run-away processes, because 
I couldn't reproduce the problem.  At this time, the Asterisk under the 2.6 
kernel (the one that ran the test set against the Asterisk under the 2.4 
kernel) does not show spurious Asterisk processes.
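
One way to make a snapshot like the one above easier to read, as a rough 
sketch only: under the 2.4 kernel every thread shows up in ps with its own 
PID, and threads of one process normally report identical VSZ and RSS, as all 
the asterisk entries above do (16608 and 7176).  The shell function below 
collapses such entries into one row with a count.  The name group_threads is 
made up for illustration, and the rule "same user, VSZ, RSS and command line 
means same process" is my assumption, not something ps guarantees.

```shell
#!/bin/sh
# Collapse `ps auxww`-style lines that share USER, VSZ, RSS and command line
# into one row each, printed as:
#   <count> <first PID seen> <user> <vsz> <rss> <command>
# ASSUMPTION: identical USER/VSZ/RSS/command means "threads of one process".
# That is a heuristic; two unrelated processes could match by coincidence.
group_threads() {
    awk '
    $2 ~ /^[0-9]+$/ {                      # skip the header line
        cmd = $11                          # command starts at column 11
        for (i = 12; i <= NF; i++) cmd = cmd " " $i
        key = $1 " " $5 " " $6 " " cmd     # user, VSZ, RSS, command
        if (!(key in count)) { order[++n] = key; first[key] = $2 }
        count[key]++
    }
    END {
        for (i = 1; i <= n; i++)
            printf "%d %s %s\n", count[order[i]], first[order[i]], order[i]
    }'
}

# Example (not run here):
#   ps auxww | grep "[a]sterisk" | group_threads
```

Anything that does not fold into the main asterisk row (a different size, or 
a different command such as perl) would be the first place I'd look for a 
run-away.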

As shown above, the AGI script is pretty simple and a direct adaptation of 
agi-test.agi.  I tried feeding it all kinds of input under a shell, and it 
wouldn't hang or die abnormally by itself.  Of course, when I observed those 
hanging processes (from the 2.6 kernel), that Asterisk had been through quite 
a bit of uptime, and a few "restart now" cycles before that.  So they may not 
be AGI related.
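
Following Tzafrir's point that an AGI script is a different program from 
asterisk, a leftover AGI should show up under its own command name with an 
asterisk process as its parent.  Here is a minimal sketch of that check, 
assuming a ps that accepts -eo pid,ppid,comm (procps does).  
find_asterisk_children is a hypothetical helper name, and command names 
containing spaces are not handled.

```shell
#!/bin/sh
# Read "PID PPID COMMAND" lines and print those processes that are NOT
# asterisk themselves but whose parent IS named asterisk. Such entries are
# candidates for leftover AGI scripts or System() children.
# ASSUMPTION: the short command name of the daemon is exactly "asterisk".
find_asterisk_children() {
    awk '
    $1 ~ /^[0-9]+$/ {                      # skip the header line
        ppid[$1] = $2
        comm[$1] = $3
        pids[++n] = $1
    }
    END {
        for (i = 1; i <= n; i++) {
            p = pids[i]
            if (comm[p] != "asterisk" && comm[ppid[p]] == "asterisk")
                print p, ppid[p], comm[p]
        }
    }'
}

# Example (not run here):
#   ps -eo pid,ppid,comm | find_asterisk_children
```

If this prints nothing while extra asterisk entries are still listed, those 
entries are asterisk itself (threads or stray copies) and not AGI scripts.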

Yuan Liu

> > 3) Any practical way to detect them from inside Asterisk - e.g., do some
> > check after each AGI call?  All my AGISTATUS reports success.  I could use
> > System() but isn't that cumbersome?
>
>Write/use better code, I guess.
>
>--
>                Tzafrir Cohen
>icq#16849755                    jabber:tzafrir at jabber.org
>+972-50-7952406           mailto:tzafrir.cohen at xorcom.com
>http://www.xorcom.com  iax:guest at local.xorcom.com/tzafrir



