[asterisk-bugs] [Asterisk 0016432]: racecondition leading to deadlock in chan_local

Asterisk Bug Tracker noreply at bugs.digium.com
Wed Jan 13 05:13:26 CST 2010


A NOTE has been added to this issue. 
====================================================================== 
https://issues.asterisk.org/view.php?id=16432 
====================================================================== 
Reported By:                fnordian
Assigned To:                
====================================================================== 
Project:                    Asterisk
Issue ID:                   16432
Category:                   Channels/chan_local
Reproducibility:            random
Severity:                   major
Priority:                   normal
Status:                     acknowledged
Asterisk Version:           SVN 
JIRA:                       SWP-528 
Regression:                 No 
Reviewboard Link:            
SVN Branch (only for SVN checkouts, not tarball releases): 1.6.1 
SVN Revision (number only!): 230432 
Request Review:              
====================================================================== 
Date Submitted:             2009-12-11 10:12 CST
Last Modified:              2010-01-13 05:13 CST
====================================================================== 
Summary:                    racecondition leading to deadlock in chan_local
Description: 
Hi,

there's a race in chan_local between local_hangup and queuing a dtmf
(ast_senddigit ... local_queue_frame), which can cause hanging channels and
a hanging ast_senddigit.

This is possible because local_digit_*-functions assume that they were
called on an unlocked channel, which is not always the cases, e.g.
manager_play_dtmf uses a locked channel to call ast_senddigit. With this
wrong assumption, deadlock_avoidance code is not called.
====================================================================== 

---------------------------------------------------------------------- 
 (0116563) skochen (reporter) - 2010-01-13 05:13
 https://issues.asterisk.org/view.php?id=16432#c116563 
---------------------------------------------------------------------- 
I believe we're seeing the same problem on Asterisk 1.4.28, in a slightly
different situation. We recently reinstalled with DEBUG_THREADS to get a
better idea of what is going on. In "core show locks", I now see the
following:

=== Thread ID: 88636304 (pbx_thread           started at [ 2643] pbx.c
ast_pbx_start())
=== ---> Lock https://issues.asterisk.org/view.php?id=0 (chan_local.c): MUTEX
559 local_hangup &ast->lock
0x9551850 (1)
=== ---> Tried and failed to get Lock https://issues.asterisk.org/view.php?id=1
(chan_local.c): MUTEX 588
local_hangup &p->chan->lock 0x962f250 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 52378512 (pbx_thread           started at [ 2643] pbx.c
ast_pbx_start())
=== ---> Lock https://issues.asterisk.org/view.php?id=0 (chan_local.c): MUTEX
201 local_queue_frame &us->lock
0x962f250 (1)
=== ---> Tried and failed to get Lock https://issues.asterisk.org/view.php?id=1
(chan_local.c): MUTEX 195
local_queue_frame &other->lock 0x9551850 (0)
=== -------------------------------------------------------------------

This is the first time I'm lock debugging in Asterisk, but this looks like
a deadlock to me. The same output contains a second, similar case.
Eventually, lots of things calling channel_find_locked get caught up in it
as well, causing chan_sip to become unresponsive, and the CLI to hang on
"core show channels" for example.

When not caught up in the deadlock while it is happening, the CLI may log
nothing at all, or may log lots of:
channel.c: Exceptionally long queue length queuing to
Local/403 at default-a54d,1

Backtraces are provided below (also without DONT_OPTIMIZE). It looks like
one thread hangs when app_queue calls hangupcalls() to hang up all it's
agents. The agents are Local channels in this setup. The other thread hangs
because one of those channels is apparently getting answered in the same
split second.

=== Thread ID: 88636304 (pbx_thread           started at [ 2643] pbx.c
ast_pbx_start())
=== ---> Lock https://issues.asterisk.org/view.php?id=0 (chan_local.c): MUTEX
559 local_hangup &ast->lock
0x9551850 (1)
=== ---> Tried and failed to get Lock https://issues.asterisk.org/view.php?id=1
(chan_local.c): MUTEX 588
local_hangup &p->chan->lock 0x962f250 (0)
Thread 21 (process 12142):
https://issues.asterisk.org/view.php?id=0  0x00366410 in __kernel_vsyscall ()
https://issues.asterisk.org/view.php?id=1  0x007a1996 in nanosleep () from
/lib/libc.so.6
https://issues.asterisk.org/view.php?id=2  0x007db15c in usleep () from
/lib/libc.so.6
https://issues.asterisk.org/view.php?id=3  0x00a03e57 in local_hangup
(ast=0x95517d8) at chan_local.c:589
https://issues.asterisk.org/view.php?id=4  0x080a39b5 in ast_hangup ()
https://issues.asterisk.org/view.php?id=5  0x07d40a7b in try_calling
(qe=0x547dc40, options=0x547dbdf "twr",
announceoverride=0x547dbe4 "", url=0x547dbe3 "", tries=0x547dde0,
noption=0x547dddc, agi=0x547dbe8 "") at app_queue.c:1751
https://issues.asterisk.org/view.php?id=6  0x07d437e9 in queue_exec
(chan=0x9624750, data=0x547fe58) at
app_queue.c:4124
https://issues.asterisk.org/view.php?id=7  0x08113ace in pbx_extension_helper ()
https://issues.asterisk.org/view.php?id=8  0x00c4e4f9 in _macro_exec
(chan=0x9624750, data=0x5484ee8,
exclusive=0) at app_macro.c:352
https://issues.asterisk.org/view.php?id=9  0x08113ace in pbx_extension_helper ()
https://issues.asterisk.org/view.php?id=10 0x08117921 in __ast_pbx_run ()
https://issues.asterisk.org/view.php?id=11 0x08118b4e in pbx_thread ()
https://issues.asterisk.org/view.php?id=12 0x0816297f in dummy_start ()
https://issues.asterisk.org/view.php?id=13 0x008ed5ab in start_thread () from
/lib/libpthread.so.0
https://issues.asterisk.org/view.php?id=14 0x007e1cfe in clone () from
/lib/libc.so.6

=== Thread ID: 52378512 (pbx_thread           started at [ 2643] pbx.c
ast_pbx_start())
=== ---> Lock https://issues.asterisk.org/view.php?id=0 (chan_local.c): MUTEX
201 local_queue_frame &us->lock
0x962f250 (1)
=== ---> Tried and failed to get Lock https://issues.asterisk.org/view.php?id=1
(chan_local.c): MUTEX 195
local_queue_frame &other->lock 0x9551850 (0)
Thread 20 (process 12146):
https://issues.asterisk.org/view.php?id=0  0x00366410 in __kernel_vsyscall ()
https://issues.asterisk.org/view.php?id=1  0x007a1996 in nanosleep () from
/lib/libc.so.6
https://issues.asterisk.org/view.php?id=2  0x007db15c in usleep () from
/lib/libc.so.6
https://issues.asterisk.org/view.php?id=3  0x00a02f7e in local_queue_frame
(p=0x962d2b8, isoutbound=1,
f=0x31e887c, us=0x962f1d8, us_locked=1) at chan_local.c:200
https://issues.asterisk.org/view.php?id=4  0x00a04cf2 in local_indicate
(ast=0x962f1d8, condition=-1, data=0x0,
datalen=0) at chan_local.c:399
https://issues.asterisk.org/view.php?id=5  0x08099762 in ast_indicate_data ()
https://issues.asterisk.org/view.php?id=6  0x0809ad0e in ast_answer ()
https://issues.asterisk.org/view.php?id=7  0x00283a59 in ast_bridge_call
(chan=0x962f1d8, peer=0x9667d58,
config=0x31e9c7c) at res_features.c:1729
https://issues.asterisk.org/view.php?id=8  0x004f697e in dial_exec_full
(chan=0x962f1d8, data=<value optimized
out>, peerflags=0x31e9e14, continue_exec=0x0) at app_dial.c:1889
https://issues.asterisk.org/view.php?id=9  0x004f9342 in dial_exec
(chan=0x962f1d8, data=0x31ebe78) at
app_dial.c:1937
https://issues.asterisk.org/view.php?id=10 0x08113ace in pbx_extension_helper ()
https://issues.asterisk.org/view.php?id=11 0x00c4e4f9 in _macro_exec
(chan=0x962f1d8, data=0x31f0ee8,
exclusive=0) at app_macro.c:352
https://issues.asterisk.org/view.php?id=12 0x08113ace in pbx_extension_helper ()
https://issues.asterisk.org/view.php?id=13 0x08117921 in __ast_pbx_run ()
https://issues.asterisk.org/view.php?id=14 0x08118b4e in pbx_thread ()
https://issues.asterisk.org/view.php?id=15 0x0816297f in dummy_start ()
https://issues.asterisk.org/view.php?id=16 0x008ed5ab in start_thread () from
/lib/libpthread.so.0
https://issues.asterisk.org/view.php?id=17 0x007e1cfe in clone () from
/lib/libc.so.6 

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2010-01-13 05:13 skochen        Note Added: 0116563                          
======================================================================




More information about the asterisk-bugs mailing list