[asterisk-bugs] [Asterisk 0014625]: Random deadlocks leading to lockup on 1.4.23.1

Asterisk Bug Tracker noreply at bugs.digium.com
Mon Mar 30 12:27:05 CDT 2009


A NOTE has been added to this issue. 
====================================================================== 
http://bugs.digium.com/view.php?id=14625 
====================================================================== 
Reported By:                acunningham
Assigned To:                tilghman
====================================================================== 
Project:                    Asterisk
Issue ID:                   14625
Category:                   General
Reproducibility:            random
Severity:                   major
Priority:                   normal
Status:                     acknowledged
Asterisk Version:           1.4.23 
Regression:                 No 
SVN Branch (only for SVN checkouts, not tarball releases): N/A 
SVN Revision (number only!):  
Request Review:              
====================================================================== 
Date Submitted:             2009-03-08 19:36 CDT
Last Modified:              2009-03-30 12:27 CDT
====================================================================== 
Summary:                    Random deadlocks leading to lockup on 1.4.23.1
Description: 
Random channels seem to be giving deadlocks on a 1.4.23.1 system:

# tail -f /var/log/asterisk/full | grep -i deadlock
[Mar  8 20:35:37] DEBUG[4249] channel.c: Avoiding initial deadlock for channel '0x8779688'
[Mar  8 20:35:37] DEBUG[4249] channel.c: Avoiding initial deadlock for channel '0x87ff698'
[Mar  8 20:35:42] DEBUG[4249] channel.c: Avoiding initial deadlock for channel '0x8314468'
[Mar  8 20:35:46] DEBUG[4249] channel.c: Avoiding initial deadlock for channel '0x861ddd0'


I've also seen plain "Avoiding deadlock" messages alongside the "Avoiding
initial deadlock" ones, though they are less common. The system is
continually busy at medium load
(tens of calls). Over a period of days, the problem gets worse and worse
until Asterisk stops handling calls and needs to be killed with "kill -9".
Compiling with DEBUG_THREADS and running "core show locks" always shows:

=======================================================================
=== Currently Held Locks ==============================================
=======================================================================
===
=== <file> <line num> <function> <lock name> <lock addr> (times locked)
===
=======================================================================

with no further output. I've tried the patch in ticket
http://bugs.digium.com/view.php?id=13116 but this
doesn't help.

I'm not sure which debug output would be useful, but if someone can advise
what to collect, I'm happy to provide it.
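
One way to quantify "worse and worse" is to tally the deadlock-avoidance
messages per hour. The following is only a rough sketch: it assumes each
message sits on a single line of /var/log/asterisk/full in the format quoted
above, and the function name and the "plain"/"initial" labels are
illustrative, not part of Asterisk.

```python
import re
from collections import Counter

# Matches the timestamp prefix and both message variants quoted in the report.
LINE_RE = re.compile(
    r"^\[(?P<day>\w{3}\s+\d+) (?P<hour>\d{2}):\d{2}:\d{2}\].*"
    r"Avoiding (?P<initial>initial )?deadlock"
)

def tally_deadlock_warnings(lines):
    """Count deadlock-avoidance messages per (day, hour, kind)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue
        kind = "initial" if m.group("initial") else "plain"
        day = " ".join(m.group("day").split())  # "Mar  8" -> "Mar 8"
        counts[(day, m.group("hour"), kind)] += 1
    return counts

# The four log lines from the report, used here as sample input.
sample = [
    "[Mar  8 20:35:37] DEBUG[4249] channel.c: Avoiding initial deadlock for channel '0x8779688'",
    "[Mar  8 20:35:37] DEBUG[4249] channel.c: Avoiding initial deadlock for channel '0x87ff698'",
    "[Mar  8 20:35:42] DEBUG[4249] channel.c: Avoiding initial deadlock for channel '0x8314468'",
    "[Mar  8 20:35:46] DEBUG[4249] channel.c: Avoiding initial deadlock for channel '0x861ddd0'",
]

for key, n in sorted(tally_deadlock_warnings(sample).items()):
    print(key, n)
```

Against the real log, the same function could be fed an open file object
instead of the sample list.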
====================================================================== 

---------------------------------------------------------------------- 
 (0102412) tilghman (administrator) - 2009-03-30 12:27
 http://bugs.digium.com/view.php?id=14625#c102412 
---------------------------------------------------------------------- 
Also, fd3 indicates that at the time you had 107 calls running, there were
actually 209 channels (which consume 418 file descriptors for the channel
structure alone, not counting RTP streams) active.  Given that each call is
typically composed of 2 channels, this is not out of line with the expected
results (if all calls were connected, you would have consumed 214 channels
or 428 file descriptors).  Indeed, you had 210 RTP sockets at the time,
which is also not out of line with expected results.  I also see 105 FDs
for UDPTL (used for T.38 setup).  Those three categories alone add up to
733 file descriptors (418 + 210 + 105) consumed by calls.  Do you see how
close you are getting to the 1024 maximum limit?
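
The arithmetic in this note can be checked with a short sketch. All inputs
are the figures quoted above; the variable names are illustrative, and the
tally deliberately ignores any descriptors the process holds for purposes
other than calls (log files, listening sockets and so on), so the real
headroom is presumably somewhat smaller.

```python
# Figures quoted in the note.
calls = 107
channels = 209
fds_per_channel = 2      # implied by 209 channels consuming 418 descriptors
rtp_sockets = 210
udptl_fds = 105
fd_limit = 1024

channel_fds = channels * fds_per_channel            # 418
call_fds = channel_fds + rtp_sockets + udptl_fds    # channel + RTP + UDPTL
headroom = fd_limit - call_fds

# Expected figures if every call were connected (2 channels per call).
fully_connected_channels = calls * 2
fully_connected_fds = fully_connected_channels * fds_per_channel

print("channel descriptors:", channel_fds)            # 418
print("call-related descriptors:", call_fds)          # 733
print("headroom below the limit:", headroom)          # 291
print("fully connected:", fully_connected_channels,
      "channels,", fully_connected_fds, "descriptors")  # 214, 428
```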

With about 30 more calls, you'll run into that limit, probably causing a
cascading failure with exactly the symptoms you're witnessing.
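
To see how close a running process is to its limit, something along these
lines may help. It is a minimal sketch that assumes a Linux host with procfs
mounted; the helper names are made up for illustration, and reading another
process's /proc entries normally requires running as the same user or as
root.

```python
import os

def open_fd_count(pid):
    """Number of descriptors the process currently holds (Linux procfs)."""
    return len(os.listdir(f"/proc/{pid}/fd"))

def open_file_limits(pid):
    """Soft and hard 'Max open files' limits, as strings, from /proc/<pid>/limits."""
    with open(f"/proc/{pid}/limits") as fh:
        for line in fh:
            if line.startswith("Max open files"):
                # Row layout: Max open files <soft> <hard> files
                fields = line.split()
                return fields[3], fields[4]
    raise LookupError("no 'Max open files' row found")

if __name__ == "__main__":
    # Uses this script's own PID as a stand-in; substitute the Asterisk PID
    # when running on the affected host.
    pid = os.getpid()
    soft, hard = open_file_limits(pid)
    print(f"pid {pid}: {open_fd_count(pid)} open, soft limit {soft}, hard limit {hard}")
```

Sampling this periodically alongside the call count would show whether
descriptor usage tracks load or keeps climbing.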

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2009-03-30 12:27 tilghman       Note Added: 0102412                          
======================================================================



