[asterisk-bugs] [Asterisk 0014625]: Random deadlocks leading to lockup on 1.4.23.1

Asterisk Bug Tracker noreply at bugs.digium.com
Mon Mar 23 18:49:11 CDT 2009


A NOTE has been added to this issue. 
====================================================================== 
http://bugs.digium.com/view.php?id=14625 
====================================================================== 
Reported By:                acunningham
Assigned To:                
====================================================================== 
Project:                    Asterisk
Issue ID:                   14625
Category:                   General
Reproducibility:            random
Severity:                   major
Priority:                   normal
Status:                     feedback
Asterisk Version:           1.4.23 
Regression:                 No 
SVN Branch (only for SVN checkouts, not tarball releases): N/A 
SVN Revision (number only!):  
Request Review:              
====================================================================== 
Date Submitted:             2009-03-08 19:36 CDT
Last Modified:              2009-03-23 18:49 CDT
====================================================================== 
Summary:                    Random deadlocks leading to lockup on 1.4.23.1
Description: 
Random channels seem to be giving deadlocks on a 1.4.23.1 system:

# tail -f /var/log/asterisk/full | grep -i deadlock
[Mar  8 20:35:37] DEBUG[4249] channel.c: Avoiding initial deadlock for channel '0x8779688'
[Mar  8 20:35:37] DEBUG[4249] channel.c: Avoiding initial deadlock for channel '0x87ff698'
[Mar  8 20:35:42] DEBUG[4249] channel.c: Avoiding initial deadlock for channel '0x8314468'
[Mar  8 20:35:46] DEBUG[4249] channel.c: Avoiding initial deadlock for channel '0x861ddd0'


I've also seen "Avoiding deadlock" messages, though these are less common
than "Avoiding initial deadlock". The system is continually busy at medium
load (tens of calls). Over a period of days, the problem gets progressively
worse until Asterisk stops handling calls and has to be killed with "kill -9".
With Asterisk compiled with DEBUG_THREADS, "core show locks" always shows:

=======================================================================
=== Currently Held Locks ==============================================
=======================================================================
===
=== <file> <line num> <function> <lock name> <lock addr> (times locked)
===
=======================================================================

with no further output. I've tried the patch from ticket
http://bugs.digium.com/view.php?id=13116, but it doesn't help.

I'm not sure which debug output would be useful, but if someone can advise
I'm happy to collect it.
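
One way to put numbers on how the messages grow over several days is to
count them per hour in the full log. The following is a minimal sketch and
not part of the original report: the function name count_deadlocks_per_hour
is hypothetical, and it assumes the timestamp layout shown in the log excerpt
above, with each message on a single line and the log in chronological order.

```shell
#!/bin/sh
# Count "Avoiding ... deadlock" debug messages per hour in an Asterisk log.
# Assumed line format: "[Mar  8 20:35:37] DEBUG[4249] channel.c: Avoiding ..."
count_deadlocks_per_hour() {
    # $1: path to the log file (for example /var/log/asterisk/full)
    grep -i 'avoiding.*deadlock' "$1" |
        awk '{
            sub(/^\[/, "", $1)          # drop the leading "[" from the month
            split($3, t, ":")           # t[1] is the hour of "20:35:37]"
            print $1, $2, t[1] ":00"    # e.g. "Mar 8 20:00"
        }' |
        uniq -c                         # adjacent identical hours are merged
}
```

Running count_deadlocks_per_hour /var/log/asterisk/full would then print one
line per hour, each starting with the number of matching messages.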
====================================================================== 

---------------------------------------------------------------------- 
 (0102111) acunningham (reporter) - 2009-03-23 18:49
 http://bugs.digium.com/view.php?id=14625#c102111 
---------------------------------------------------------------------- 
Asterisk stopped accepting connections on the console:

asterisk0:~# asterisk -rx 'core show locks'
asterisk0:~# asterisk -r
Asterisk 1.4.23.1, Copyright (C) 1999 - 2008 Digium, Inc. and others.
Created by Mark Spencer <markster at digium.com>
Asterisk comes with ABSOLUTELY NO WARRANTY; type 'core show warranty' for details.
This is free software, with components licensed under the GNU General Public
License version 2 and other licenses; you are welcome to redistribute it under
certain conditions. Type 'core show license' for details.
=========================================================================
Connected to Asterisk asterisk/asterisk.ctl currently running on No more connections allowed
 (pid = 0)
 No more connections allowed
 *CLI>
 Disconnected from Asterisk server

Asterisk needed to be killed with "kill -9"; it didn't respond to a plain
kill. While the problem was occurring, calls continued to be handled
correctly, but I suspect (based on previous instances) that call processing
would have stopped had we let it run longer.

I will shortly be uploading 3 files taken before restarting Asterisk:

1. gdb3.txt: Output of gdb with "thread apply all bt".

2. ps1.txt: Output of "ps -ef".

3. fd1.txt: Output of "ls -l" in /proc/<asterisk pid>/fd. 
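
For anyone wanting to gather the same three outputs on their own system, the
steps above can be sketched roughly as follows. This is an illustration only,
not the exact commands used for the uploaded files: the function names are
hypothetical, the output file names simply mirror the list above, and the gdb
step assumes gdb is installed and allowed to attach to the process.

```shell
#!/bin/sh
# Sketch: collect thread backtraces, a process list, and open file
# descriptors for a hung process. Function names are hypothetical.

save_backtraces() {
    # $1: PID of the process, $2: output directory
    # Attaches gdb in batch mode and dumps a backtrace of every thread.
    gdb -p "$1" -batch -ex 'thread apply all bt' > "$2/gdb3.txt" 2>&1
}

save_process_list() {
    # $1: output directory
    ps -ef > "$1/ps1.txt"
}

save_open_fds() {
    # $1: PID of the process, $2: output directory
    # Relies on the Linux /proc filesystem.
    ls -l "/proc/$1/fd" > "$2/fd1.txt" 2>&1
}

collect_diagnostics() {
    # $1: PID of the process, $2: output directory
    mkdir -p "$2" || return 1
    save_backtraces "$1" "$2"
    save_process_list "$2"
    save_open_fds "$1" "$2"
}
```

It would be called as collect_diagnostics <asterisk pid> <output directory>
before restarting the process, since the information is lost once it is
killed.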

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2009-03-23 18:49 acunningham    Note Added: 0102111                          
======================================================================



