[asterisk-bugs] [Asterisk 0015787]: [patch] chan_local deadlock

Asterisk Bug Tracker noreply at bugs.digium.com
Mon Aug 31 10:45:01 CDT 2009


A NOTE has been added to this issue. 
====================================================================== 
https://issues.asterisk.org/view.php?id=15787 
====================================================================== 
Reported By:                tim_ringenbach
Assigned To:                tilghman
====================================================================== 
Project:                    Asterisk
Issue ID:                   15787
Category:                   Channels/chan_local
Reproducibility:            random
Severity:                   minor
Priority:                   normal
Status:                     ready for review
Asterisk Version:           1.4.26.1 
Regression:                 No 
SVN Branch (only for SVN checkouts, not tarball releases): N/A 
SVN Revision (number only!):  
Request Review:              
====================================================================== 
Date Submitted:             2009-08-27 19:03 CDT
Last Modified:              2009-08-31 10:45 CDT
====================================================================== 
Summary:                    [patch] chan_local deadlock
Description: 
I've experienced a chan_local deadlock that wasn't in 1.4.24 but is in
1.4.25.1 and 1.4.26.1. I see threads stuck like this:

  28 process 7567  0x00007fb9de31a174 in __lll_lock_wait () from /lib/libpthread.so.0
  27 process 7568  0x00007fb9de31a174 in __lll_lock_wait () from /lib/libpthread.so.0
  26 process 7569  0x00007fb9de31a174 in __lll_lock_wait () from /lib/libpthread.so.0
  25 process 7570  0x00007fb9de31a174 in __lll_lock_wait () from /lib/libpthread.so.0
  24 process 7571  0x00007fb9de31a174 in __lll_lock_wait () from /lib/libpthread.so.0


And this backtrace from them:
#0  0x00007fb9de31a174 in __lll_lock_wait () from /lib/libpthread.so.0
#1  0x00007fb9de315b23 in _L_lock_261 () from /lib/libpthread.so.0
#2  0x00007fb9de3154e8 in pthread_mutex_lock () from /lib/libpthread.so.0
#3  0x00007fb9cfe1840b in local_write (ast=0x7fb9d0019cf0, f=0x80) at chan_local.c:340
#4  0x00007fb9de31f3d1 in ?? () from /lib/libpthread.so.0
#5  0x0000000000000001 in ?? ()
#6  0x000000004342d1e0 in ?? ()
#7  0x000000004342ce60 in ?? ()
#8  0x0000000000000001 in ?? ()
#9  0x00007fb9dd90aee3 in siglongjmp () from /lib/libc.so.6
#10 0x00007fb9de319ee8 in unwind_stop () from /lib/libpthread.so.0
#11 0x00000000004388a9 in ast_request (type=0x7fb9c80b9cf0 "Local", format=64, data=0x7fb9c80b9d40, cause=0x4342cd18) at channel.c:3381

(manually killed with sig 11 to get a trace; it never crashed on its own)

And I think I've found the problem, which I'll attach as a patch.
====================================================================== 

---------------------------------------------------------------------- 
 (0109854) tim_ringenbach (reporter) - 2009-08-31 10:45
 https://issues.asterisk.org/view.php?id=15787#c109854 
---------------------------------------------------------------------- 
After further testing, the attached patch did not fix the deadlocks I was
having. I think the patch is still valid, though; I'm just not hitting the
potential deadlock it addresses.

What really fixed my problem was reverting r190286. I have a patch to
chan_local that expands on the device state tracking, such that if any
local channels are up for that extension, the device returns in use. I was
deadlocking on the list lock because r190286 breaks the locking order.

So this bug might be invalid now since the deadlock only happens in my
patched version. 

Even so, the locking order is being violated, and other parts of the code
seem to rely on that order too. The unload_module() function and the
locals_show() function look to be the only places that assume that locking
order, so the code might deadlock in either of those.
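
To illustrate the kind of lock-order problem described above, here is a
minimal sketch in plain pthreads. It is not code from chan_local.c: the
names list_lock, pvt_lock, walk_list(), update_from_pvt() and run_demo() are
hypothetical, and the assumed order is "list lock first, then the per-channel
lock". A path that already holds the second lock and then needs the first
cannot block on it safely, so it tries the lock and backs off on failure.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical locks. Assumed order: list_lock first, then pvt_lock. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t pvt_lock = PTHREAD_MUTEX_INITIALIZER;
static long in_use_count; /* only modified while both locks are held */

/* In-order path: takes the locks in the agreed order. */
static void walk_list(void)
{
    pthread_mutex_lock(&list_lock);
    pthread_mutex_lock(&pvt_lock);
    in_use_count++;
    pthread_mutex_unlock(&pvt_lock);
    pthread_mutex_unlock(&list_lock);
}

/*
 * Out-of-order path: already holds pvt_lock and then needs list_lock.
 * A blocking pthread_mutex_lock(&list_lock) here could deadlock against
 * walk_list(), so try the lock and release pvt_lock whenever it is busy.
 */
static void update_from_pvt(void)
{
    pthread_mutex_lock(&pvt_lock);
    while (pthread_mutex_trylock(&list_lock) != 0) {
        pthread_mutex_unlock(&pvt_lock); /* back off so the other thread can finish */
        sched_yield();
        pthread_mutex_lock(&pvt_lock);
    }
    in_use_count++;
    pthread_mutex_unlock(&list_lock);
    pthread_mutex_unlock(&pvt_lock);
}

static void *walker(void *arg)
{
    int n = *(int *)arg;
    for (int i = 0; i < n; i++)
        walk_list();
    return NULL;
}

static void *updater(void *arg)
{
    int n = *(int *)arg;
    for (int i = 0; i < n; i++)
        update_from_pvt();
    return NULL;
}

/* Runs both paths concurrently and returns the final counter value. */
long run_demo(int iterations)
{
    pthread_t a, b;
    int rc;

    in_use_count = 0;
    rc = pthread_create(&a, NULL, walker, &iterations);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, updater, &iterations);
    assert(rc == 0);
    (void)rc;
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return in_use_count;
}
```

If update_from_pvt() used a plain pthread_mutex_lock() on list_lock instead
of the trylock loop, the two threads could each end up holding one lock and
waiting forever for the other, which would look like the __lll_lock_wait
threads in the description.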

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2009-08-31 10:45 tim_ringenbach Note Added: 0109854                          
======================================================================