[asterisk-bugs] [Asterisk 0018690]: sip deadlock with explanation

Asterisk Bug Tracker noreply at bugs.digium.com
Tue Feb 15 02:07:55 CST 2011


A NOTE has been added to this issue. 
====================================================================== 
https://issues.asterisk.org/view.php?id=18690 
====================================================================== 
Reported By:                dvossel
Assigned To:                
====================================================================== 
Project:                    Asterisk
Issue ID:                   18690
Category:                   Channels/chan_sip/General
Reproducibility:            always
Severity:                   major
Priority:                   normal
Status:                     acknowledged
Asterisk Version:           SVN 
JIRA:                       SWP-3020 
Regression:                 No 
Reviewboard Link:            
SVN Branch (only for SVN checkouts, not tarball releases): N/A 
SVN Revision (number only!):  
Request Review:              
====================================================================== 
Date Submitted:             2011-01-27 12:44 CST
Last Modified:              2011-02-15 02:07 CST
====================================================================== 
Summary:                    sip deadlock with explanation
Description: 
I've been doing quite a bit of load testing of chan_sip and chan_iax for
the media architecture changes.  When I push the calls per second past a
certain limit, I see a consistent deadlock.  I've investigated it, and
here's what is happening.

In chan_sip's handle_request_do() function we have to lock both the pvt
and the channel at the same time.  This involves deadlock avoidance and is
a mess.  After a number of locking attempts fail on the channel, the
request is queued to be handled at a later time.  This would typically only
occur under very heavy load.
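
The retry-then-queue pattern described above can be sketched roughly as
follows.  This is only an illustration using plain pthreads, not the
chan_sip code: the struct names, the sketch_ prefix, the attempt limit,
and the queued_requests counter are all hypothetical stand-ins.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the chan_sip pvt and its owning channel. */
struct sketch_channel {
    pthread_mutex_t lock;
};

struct sketch_pvt {
    pthread_mutex_t lock;
    struct sketch_channel *owner;
    int queued_requests;            /* requests deferred for later handling */
};

#define SKETCH_MAX_ATTEMPTS 10      /* illustrative limit, not chan_sip's value */

/*
 * Called with pvt->lock held.  Returns true with both the pvt and the
 * channel locked.  Returns false with only the pvt locked, after counting
 * the request as queued for later handling.
 */
static bool sketch_lock_pvt_and_owner(struct sketch_pvt *pvt)
{
    assert(pvt != NULL && pvt->owner != NULL);

    for (int attempt = 0; attempt < SKETCH_MAX_ATTEMPTS; attempt++) {
        if (pthread_mutex_trylock(&pvt->owner->lock) == 0) {
            return true;            /* both locks are now held */
        }
        /*
         * Deadlock avoidance: briefly give up the pvt lock so that a
         * thread holding the channel and waiting on the pvt can proceed.
         */
        pthread_mutex_unlock(&pvt->lock);
        sched_yield();
        pthread_mutex_lock(&pvt->lock);
    }

    /* Too many failed attempts: defer the request instead of blocking. */
    pvt->queued_requests++;
    return false;
}
```

The channel is only ever taken with a trylock while the pvt is held, so
this thread never blocks on the channel while holding the pvt; the cost is
that a request can be deferred when the channel stays busy.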

Queued requests are also handled in the handle_request_do() function. 
Once both a pvt and a channel are locked, the queued requests are handled
first, and then the request that triggered this invocation of
handle_request_do() is processed.  This is where the problem is.

Between calling process_request_queue() and calling handle_incoming(), the
channel may be unlocked.  We could detect this by inspecting the nounlock
variable that is passed to process_request_queue(), but we do not.  If the
channel remains unlocked on entry to handle_incoming(), a deadlock is very
likely, since many of the code paths in handle_incoming() try to grab a
recursive channel lock using the channel API.  In the "core show locks"
output I have included, ast_queue_frame() is what causes this to occur.
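
To make the sequence concrete, here is a small model of the problem and of
one possible way to act on the nounlock flag.  It is a sketch under
assumptions, not the patch on the reviewboard: every sketch_ name is
hypothetical, the "held" flag exists only so the example can report the
lock state, and the meaning of nounlock (set when queue processing has
already released the channel) is taken from the description above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct sketch_chan {
    pthread_mutex_t lock;
    bool held;      /* does the handling thread currently hold the lock? */
};

/*
 * Stand-in for process_request_queue(): the handler of a queued request
 * may release the channel and report that through *nounlock.
 */
static void sketch_process_queue(struct sketch_chan *chan, int *nounlock,
                                 bool handler_unlocks)
{
    if (handler_unlocks) {
        pthread_mutex_unlock(&chan->lock);
        chan->held = false;
        *nounlock = 1;
    }
}

/*
 * Stand-in for handle_incoming(): it expects the caller to hold the
 * channel lock.  Returns whether that precondition was met.
 */
static bool sketch_handle_incoming(const struct sketch_chan *chan)
{
    return chan->held;
}

/*
 * Stand-in for handle_request_do().  With check_nounlock false this models
 * the reported behaviour; with it true, the flag is inspected and the
 * channel is locked again before the current request is handled.
 */
static bool sketch_handle_request(struct sketch_chan *chan,
                                  bool handler_unlocks, bool check_nounlock)
{
    int nounlock = 0;
    bool ok;

    pthread_mutex_lock(&chan->lock);
    chan->held = true;

    sketch_process_queue(chan, &nounlock, handler_unlocks);

    if (check_nounlock && nounlock) {
        /* Queue processing dropped the channel lock: take it again. */
        pthread_mutex_lock(&chan->lock);
        chan->held = true;
        nounlock = 0;
    }

    ok = sketch_handle_incoming(chan);

    if (!nounlock) {
        pthread_mutex_unlock(&chan->lock);
        chan->held = false;
    }
    return ok;
}
```

In this model, sketch_handle_request(&chan, true, false) reaches the
handler with the channel unlocked, while passing true for check_nounlock
restores the expected state first.  In real code the pvt is also held at
that point, so re-taking the channel lock would again need deadlock
avoidance rather than a plain blocking lock.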


I have a setup that reproduces this problem consistently.

====================================================================== 

---------------------------------------------------------------------- 
 (0131953) thsgmbh (reporter) - 2011-02-15 02:07
 https://issues.asterisk.org/view.php?id=18690#c131953 
---------------------------------------------------------------------- 
Hi, I tried your patch from the reviewboard (adjusted for Asterisk 1.6.2)
and still get deadlocks in handle_request_do() (in association with
find_call())...

I'm not sure whether my deadlock is really related to your bug/patch, but
it seems to be.

I have attached the "core show locks" output.

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2011-02-15 02:07 thsgmbh        Note Added: 0131953                          
======================================================================



