[asterisk-biz] Asterisk 1.4.22.1 and zombies

Steve Murphy murf at digium.com
Fri Jan 16 10:01:07 CST 2009


Sabine--

   see notes below

> Hello,
> 
> We are running Asterisk-1.0.12 in a CentOS 4-4.2 system, kernel
> 2.6.9-42.0.3.ELsmp.
> 
> We have some custom AGI, and when we launch Asterisk the system works fine.
> 
> But **after some time**, each AGI execution generates a zombie <defunct> process.
> 
> We believe that it's not a problem in the AGI code, because Asterisk+AGI is
> working fine in the first "n" minutes/hours. This is a pstree sample:

> 
> init-+-asterisk---asterisk-+-28*[asterisk]
>      |                     |-asterisk-+-21*[xxxxxxxxx.agi]
>      |                     |          `-40*[xxxxxxxxx.agi]
>      |                     |-5*[asterisk-+-yyyyyyyyy.agi]
>      |                     |             |-zzzzzzzzz.agi]
> (...)
> 
> Each agi is a defunct process. It dies when the call (parent) finishes.
> 
> When the first zombie appears, then ALL next AGI launched from Asterisk
> generates a zombie.
> 
> We have tested some improvements to solve the problem, with no success:
> 
> - Upgrade from RedHat 8 to Centos 3.x
> - Upgrade from Centos 3.x to Centos 4.x
> - LD_ASSUME_KERNEL=2.4.1
> - ulimit -n 65535
> - Upgrade from asterisk 1.0.7 to 1.0.12
> 
> Currently we can not easily migrate from asterisk-1.0.x to 1.2.x
> 
> Any ideas? Could Debian be a solution?
> 
> Thank you.

On Fri, 2009-01-16 at 10:10 +0100, Sabine Jordan wrote:
> Hello all,
> 
> I've read the following posting at lists.digium.com from November 2006:
> 
> http://lists.digium.com/pipermail/asterisk-users/2006-November/172261.html
> 
> I know it's been a while ago and unfortunately there were no replies.
> Maybe someone has solved the problem anyway and could help me, because
> we have the same problem running asterisk 1.4.22.1. We already updated
> from a previous version, but the problem remains the same...
> 
> We have some php-scripts from where we run asterisk-applications which
> seem to work fine for a while until the first zombie appears. After the
> first zombie has appeared all other hangups seem to generate zombies as
> well. After a while the zombies disappear until it all starts again. But
> there are times when we have more than 1500 zombies which is alarming.
> 
> Some information about the script:
> 
> - every db-connection is closed
> - the script is closed with an "exit"
> - script works fine until after some time the first zombie appears
> 
> pstree |grep asterisk
> 
>      |-safe_asterisk---asterisk-+-173*[asterisk.php]
>      |                          `-37*[{asterisk}]
> 
> 
> Has someone solved the problem in the past and/or could give me a hint?
> 
> Thank you in advance.



Hung channel problems have been fairly challenging to fix, especially
the ones like yours, that don't happen all the time.

It's good you updated to 1.4. After reading the letter you
referenced from the archives, these statements concerned me:

        "We are running Asterisk-1.0.12 in a CentOS 4-4.2 system, kernel
        2.6.9-42.0.3.ELsmp."

        "Currently we can not easily migrate from asterisk-1.0.x to 1.2.x"


Probably the reason no one responded to that first message is that
the Asterisk version was so ancient; people could only conclude that
the problem might have been solved in the last 3 or so years.

Oh, and I'd not use the word "zombie". It is pretty descriptive, but
in the core of Asterisk the word is used for a channel that
has been masqueraded. Masquerading is a process that duplicates a
channel and gives almost all of its active components, including its
name, to the new channel. The old channel gets a name like
"Sip/1-1<ZOMBIE>". This happens with parking, xfers, etc.
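
To keep the two meanings apart, the process-level kind (what pstree
and ps report as <defunct>) can be counted separately from anything
Asterisk calls a zombie. A minimal Python sketch, assuming a Linux
procps "ps" that accepts "-eo pid,ppid,stat,comm"; the function names
are made up for illustration:

```python
import subprocess
from collections import Counter


def count_defunct(ps_output):
    """Count defunct processes per parent PID.

    Expects the text produced by 'ps -eo pid,ppid,stat,comm'.
    """
    per_parent = Counter()
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)
        if len(fields) < 3:
            continue
        ppid, stat = fields[1], fields[2]
        if stat.startswith("Z"):  # 'Z' is the ps state code for defunct
            per_parent[int(ppid)] += 1
    return per_parent


def current_defunct():
    """Run ps and count the defunct processes on this machine."""
    output = subprocess.check_output(
        ["ps", "-eo", "pid,ppid,stat,comm"], universal_newlines=True
    )
    return count_defunct(output)
```

Calling current_defunct() every minute or so and logging the result
with a timestamp would show when the buildup starts, which is the
moment to compare against the other logs.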

It might help to do these sorts of things when you are seeing lots
of hung channels:

1. "core show threads"
2. "core show channels"
3. "core set verbose 10"
4. "core set debug 10"
5. "sip show channels" (if you are using sip channels)
6. "sip show history ..." on a hung channel
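
The read-only commands in that list can also be captured from outside
the console, so the output is still around when someone has time to
read it. A minimal Python sketch, assuming the "asterisk" binary is on
the PATH and that "asterisk -rx" can reach the running instance; the
snapshot helper and its report layout are made up for illustration:

```python
import subprocess
import time

# Read-only CLI commands from the list above (1.4 syntax).
COMMANDS = [
    "core show threads",
    "core show channels",
    "sip show channels",
]


def run_cli(command):
    """Run one CLI command through 'asterisk -rx' and return its output."""
    result = subprocess.run(
        ["asterisk", "-rx", command],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


def snapshot(commands=COMMANDS, runner=run_cli):
    """Return one text report with a timestamped section per command."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    sections = []
    for command in commands:
        output = runner(command)
        sections.append("=== %s: %s ===\n%s" % (stamp, command, output))
    return "\n".join(sections)
```

Appending the result of snapshot() to a file from cron while the
problem is building up gives a record to compare against a quiet
period. The runner argument is only there so the report logic can be
tried without a live Asterisk.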

You didn't mention what kind of channels were involved:
Dahdi (zap), Sip, or what?

If everything goes fine and then, all of a sudden, you start
accumulating dead/hung channels, you have to ask yourself
what might be going on that would explain the buildup. What
is different? It might be a network difficulty, a DNS server
problem, a database system hiccup, or some other non-obvious
problem that might come and go. Look for warning and
error messages in /var/log/asterisk/messages and in the
other system logs.
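
One way to spot "what is different" is to tally warnings and errors
per hour and see whether a spike lines up with the start of the
buildup. A minimal Python sketch; the line shape in the comment is an
assumption about the default log format, so adjust the pattern to
whatever your logger.conf actually produces:

```python
import re
from collections import Counter

# Assumed line shape: "[Jan 16 10:01:07] WARNING[1234] chan_sip.c: text"
LINE = re.compile(r"^\[(?P<when>[^\]]+)\]\s+(?P<level>[A-Z]+)\[\d+\]")


def problems_per_hour(lines, levels=("WARNING", "ERROR")):
    """Count log lines of the given levels, keyed by (hour, level)."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if match and match.group("level") in levels:
            # Keep everything before the first colon, e.g. "Jan 16 10".
            hour = match.group("when").split(":")[0]
            counts[(hour, match.group("level"))] += 1
    return counts
```

Passing it an open file on /var/log/asterisk/messages and printing
the sorted items is enough to see the hours that stand out.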

Turning up verbosity will show you which applications are
being executed in the dialplan. If you see one app being
executed a lot, but not the app that should follow it,
then you might be hanging in that app... and so on.
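
That check can be roughed out in a script: remember the last
application each channel executed, then tally those applications for
the channels that are still up. A minimal Python sketch; the verbose
line shape in the comment is an assumption, and the list of live
channels is something you would collect yourself (for example from
"core show channels"):

```python
import re
from collections import Counter

# Assumed verbose line shape:
#   -- Executing [100@default:1] Dial("SIP/101-0823", "SIP/102") in new stack
EXECUTING = re.compile(
    r'-- Executing (?:\[[^\]]*\] )?(?P<app>\w+)\("(?P<chan>[^"]+)"'
)


def last_app_per_channel(lines):
    """Map each channel name to the last application it executed."""
    last = {}
    for line in lines:
        match = EXECUTING.search(line)
        if match:
            last[match.group("chan")] = match.group("app")
    return last


def stuck_candidates(lines, live_channels):
    """Tally the last application of every channel that is still up."""
    last = last_app_per_channel(lines)
    return Counter(
        app for chan, app in last.items() if chan in live_channels
    )
```

If most of the live channels share the same last application, that
application is the first place to look.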

If you are serious about solving a problem, you have
to put on your detective hat and pull out your notebook
and start experimenting, probing, and testing. Not all
problems are because of bugs in Asterisk!

murf


-- 
Steve Murphy <murf at digium.com>
Digium