[asterisk-bugs] [Asterisk 0018357]: Deadlock on channel list during masquerade and channel search.

Asterisk Bug Tracker noreply at bugs.digium.com
Thu Dec 23 14:20:47 UTC 2010


A NOTE has been added to this issue. 
====================================================================== 
https://issues.asterisk.org/view.php?id=18357 
====================================================================== 
Reported By:                chuckberry
Assigned To:                
====================================================================== 
Project:                    Asterisk
Issue ID:                   18357
Category:                   Core/Channels
Reproducibility:            random
Severity:                   minor
Priority:                   normal
Status:                     feedback
Asterisk Version:           1.6.2.14 
JIRA:                       SWP-2644 
Regression:                 No 
Reviewboard Link:            
SVN Branch (only for SVN checkouts, not tarball releases): N/A 
SVN Revision (number only!):  
Request Review:              
====================================================================== 
Date Submitted:             2010-11-23 10:02 CST
Last Modified:              2010-12-23 08:20 CST
====================================================================== 
Summary:                    Deadlock on channel list during masquerade and
channel search.
Description: 
Found a deadlock while handling Local channel masquerades and the Pickup
application.

Follows the stack trace and a brief explanation:

(gdb) thread 2
[Switching to thread 2 (Thread 0xa56ffb90 (LWP
15220))]https://issues.asterisk.org/view.php?id=0  0xb7fc0410 in
__kernel_vsyscall ()
(gdb) bt
https://issues.asterisk.org/view.php?id=0  0xb7fc0410 in __kernel_vsyscall ()
https://issues.asterisk.org/view.php?id=1  0xb7e52589 in __lll_lock_wait () from
/lib/tls/i686/cmov/libpthread.so.0
https://issues.asterisk.org/view.php?id=2  0xb7e4dbb4 in _L_lock_236 () from
/lib/tls/i686/cmov/libpthread.so.0
https://issues.asterisk.org/view.php?id=3  0xb7e4d60b in pthread_mutex_lock ()
from
/lib/tls/i686/cmov/libpthread.so.0
https://issues.asterisk.org/view.php?id=4  0xb7f46ea6 in pthread_setschedparam
() from
/lib/tls/i686/cmov/libc.so.6
https://issues.asterisk.org/view.php?id=5  0x08095104 in
ast_channel_search_locked (is_match=0xb7468380
<find_by_mark>, data=0xa56face0) at
/root/ASTERISK/asterisk-1.6.2.13/include/asterisk/lock.h:1746
https://issues.asterisk.org/view.php?id=6  0xb7468043 in pickup_exec
(chan=0xa61fc088, data=0xa56fadd9) at
app_directed_pickup.c:222
https://issues.asterisk.org/view.php?id=7  0x08100df9 in pbx_exec (c=0xa61fc088,
app=0xb7b6e808, data=0xa56fadd9)
at pbx.c:1352
https://issues.asterisk.org/view.php?id=8  0xb7899fbf in execif_exec
(chan=0xa61fc088, data=0xa56fd148) at
app_exec.c:257
https://issues.asterisk.org/view.php?id=9  0x08100df9 in pbx_exec (c=0xa61fc088,
app=0x8881028, data=0xa56fd148)
at pbx.c:1352
https://issues.asterisk.org/view.php?id=10 0x0810d170 in pbx_extension_helper
(c=0xa61fc088, con=0x0,
context=0xa61fc2fc "entrada-ramais", exten=0xa61fc34c "55", priority=3,
label=0x0, callerid=0xa6178978 "2903", 
    action=E_SPAWN, found=0xa56ff254, combined_find_spawn=1) at pbx.c:3715
https://issues.asterisk.org/view.php?id=11 0x0810f6e6 in __ast_pbx_run
(c=0xa61fc088, args=0x0) at pbx.c:4186
https://issues.asterisk.org/view.php?id=12 0x08110f80 in pbx_thread
(data=0xa61fc088) at pbx.c:4567
https://issues.asterisk.org/view.php?id=13 0x08151a5b in dummy_start
(data=0xa5c56a60) at utils.c:968
https://issues.asterisk.org/view.php?id=14 0xb7e4b4fb in start_thread () from
/lib/tls/i686/cmov/libpthread.so.0
https://issues.asterisk.org/view.php?id=15 0xb7f39e5e in __call_fallocate ()
from /lib/tls/i686/cmov/libc.so.6
https://issues.asterisk.org/view.php?id=16 0x00000000 in ?? ()

(gdb) thread 3
[Switching to thread 3 (Thread 0xa58f3b90 (LWP
15213))]https://issues.asterisk.org/view.php?id=0  0xb7fc0410 in
__kernel_vsyscall ()
(gdb) bt
https://issues.asterisk.org/view.php?id=0  0xb7fc0410 in __kernel_vsyscall ()
https://issues.asterisk.org/view.php?id=1  0xb7e4f21b in pthread_rwlock_wrlock
() from
/lib/tls/i686/cmov/libpthread.so.0
https://issues.asterisk.org/view.php?id=2  0x08095899 in ast_channel_free
(chan=0xb7e4f21b) at
/root/ASTERISK/asterisk-1.6.2.13/include/asterisk/lock.h:1856
https://issues.asterisk.org/view.php?id=3  0x0809915f in ast_do_masquerade
(original=0xa61028b8) at
channel.c:4869
https://issues.asterisk.org/view.php?id=4  0x08099761 in ast_waitfor_nandfds
(c=0xa58e8b50, n=2, fds=0x0, nfds=0,
exception=0x0, outfd=0x0, ms=0xa58e9024) at channel.c:2129
https://issues.asterisk.org/view.php?id=5  0x0809a07f in ast_waitfor_n
(c=0xa58e8b50, n=2, ms=0xa58e9024) at
channel.c:2448
https://issues.asterisk.org/view.php?id=6  0xb73b549c in dial_exec_full
(chan=0x88ecd70, data=<value optimized
out>, peerflags=0xa58e9900, continue_exec=0x0) at app_dial.c:879
https://issues.asterisk.org/view.php?id=7  0xb73b9ff9 in dial_exec
(chan=0x88ecd70, data=0xa58e99d8) at
app_dial.c:2353
https://issues.asterisk.org/view.php?id=8  0x08100df9 in pbx_exec (c=0x88ecd70,
app=0xa6599258, data=0xa58e99d8)
at pbx.c:1352
https://issues.asterisk.org/view.php?id=9  0xb7899bf4 in tryexec_exec
(chan=0x88ecd70, data=0xa58ec0b8) at
app_exec.c:184
https://issues.asterisk.org/view.php?id=10 0x08100df9 in pbx_exec (c=0x88ecd70,
app=0x8880fe8, data=0xa58ec0b8)
at pbx.c:1352
https://issues.asterisk.org/view.php?id=11 0x0810d170 in pbx_extension_helper
(c=0x88ecd70, con=0x0,
context=0x88ecfe4 "physical-dial", exten=0x88ed034 "2900", priority=4,
label=0x0, callerid=0x88df0a8 "00100000000", 
    action=E_SPAWN, found=0xa58ee3b8, combined_find_spawn=1) at pbx.c:3715
https://issues.asterisk.org/view.php?id=12 0xb78c1518 in _macro_exec
(chan=0x88ecd70, data=<value optimized out>,
exclusive=0) at app_macro.c:398
https://issues.asterisk.org/view.php?id=13 0x08100df9 in pbx_exec (c=0x88ecd70,
app=0xb7b3ac18, data=0xa58f1148)
at pbx.c:1352
https://issues.asterisk.org/view.php?id=14 0x0810d170 in pbx_extension_helper
(c=0x88ecd70, con=0x0,
context=0x88ecfe4 "physical-dial", exten=0x88ed034 "2900", priority=2,
label=0x0, callerid=0x88df0a8 "00100000000", 
    action=E_SPAWN, found=0xa58f3254, combined_find_spawn=1) at pbx.c:3715
https://issues.asterisk.org/view.php?id=15 0x0810f6e6 in __ast_pbx_run
(c=0x88ecd70, args=0x0) at pbx.c:4186
https://issues.asterisk.org/view.php?id=16 0x08110f80 in pbx_thread
(data=0x88ecd70) at pbx.c:4567
https://issues.asterisk.org/view.php?id=17 0x08151a5b in dummy_start
(data=0x88d5dc0) at utils.c:968
https://issues.asterisk.org/view.php?id=18 0xb7e4b4fb in start_thread () from
/lib/tls/i686/cmov/libpthread.so.0
https://issues.asterisk.org/view.php?id=19 0xb7f39e5e in __call_fallocate ()
from /lib/tls/i686/cmov/libc.so.6
https://issues.asterisk.org/view.php?id=20 0x00000000 in ?? ()

The execution stack above shows this scenario:

* thread 3 is called with channel X lock already held.
* thread 2 is called, and helds channel list lock while iterating over
channels.
* thread 3 tries to lock channel list, but thread 2 already owns it.
* thread 2 tries to lock channel X while iterating, but thread 3 already
owns it.

We are now in a classic deadlock situation. Channel list receives new
wrlock requests but they never succed, so new calls get not processed by
the PBX.

One solution is to change the search to a try-lock approach, where the
list and the channels are try-locked, and everything is unlocked if some
channel cannot be locked in a reasonable time.

NOTE: This bug was detected on 1.6.2.13 - diff'ing both 1.6.2.13 and
1.6.2.14 releaved no changes in the affected areas, so I am reporting this
case for 1.6.2.14 as 1.6.2.13 is missing from checkboxes.

====================================================================== 

---------------------------------------------------------------------- 
 (0129917) chuckberry (reporter) - 2010-12-23 08:20
 https://issues.asterisk.org/view.php?id=18357#c129917 
---------------------------------------------------------------------- 
lmadsen, thanks for the reply.

As the problem was happening on our main PBX, which needs to be reliable,
we upgraded it to version 1.8. Since this version already has the astobj2
container for the channel list, this should make it less prone to
deadlocks.

I'm sorry, but for now I cannot go back to 1.6.2 series on that machine.

I will try to include 1.6.2.15 in some internal stress/general tests, and
report any misbehaviour. 

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2010-12-23 08:20 chuckberry     Note Added: 0129917                          
======================================================================




More information about the asterisk-bugs mailing list