[asterisk-bugs] [Asterisk 0018690]: sip deadlock with explanation

Asterisk Bug Tracker noreply at bugs.digium.com
Mon Apr 18 10:23:47 CDT 2011


A NOTE has been added to this issue. 
====================================================================== 
https://issues.asterisk.org/view.php?id=18690 
====================================================================== 
Reported By:                dvossel
Assigned To:                
====================================================================== 
Project:                    Asterisk
Issue ID:                   18690
Category:                   Channels/chan_sip/General
Reproducibility:            always
Severity:                   major
Priority:                   normal
Status:                     acknowledged
Asterisk Version:           SVN 
JIRA:                       SWP-3020 
Regression:                 No 
Reviewboard Link:            
SVN Branch (only for SVN checkouts, not tarball releases): N/A 
SVN Revision (number only!):  
Request Review:              
====================================================================== 
Date Submitted:             2011-01-27 12:44 CST
Last Modified:              2011-04-18 10:23 CDT
====================================================================== 
Summary:                    sip deadlock with explanation
Description: 
I've been doing quite a bit of load testing of chan_sip and chan_iax for
the media architecture changes.  When I push the calls per second up past
a certain limit I am seeing a consistent deadlock occurring.  I've
investigated it and here's what is happening.

In chan_sip's handle_request_do() function we have to lock both the pvt
and the channel at the same time.  This involves deadlock avoidance and is
a mess.  After a number of locking attempts fail on the channel, the
request is queued to be handled at a later time.  This would typically only
occur under very heavy load.
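The retry-then-queue pattern described above can be sketched roughly as
follows.  This is a minimal illustration using plain pthread mutexes, not
the chan_sip code: the struct layouts, MAX_LOCK_ATTEMPTS, and
lock_both_or_queue() are hypothetical names made up for the example, and
the "queue" is reduced to a counter.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the channel and the SIP pvt (not Asterisk's types). */
struct channel { pthread_mutex_t lock; };
struct pvt {
    pthread_mutex_t lock;
    struct channel *owner;
    int queued;               /* count of requests deferred for later handling */
};

#define MAX_LOCK_ATTEMPTS 100 /* assumed limit, chosen only for the example */

/*
 * Deadlock avoidance: hold the pvt lock and only *try* the channel lock.
 * Returns true with the pvt lock held (plus the owner's lock if there is an
 * owner).  Returns false with no locks held after deferring the request.
 */
static bool lock_both_or_queue(struct pvt *p)
{
    assert(p != NULL);

    for (int attempt = 0; attempt < MAX_LOCK_ATTEMPTS; attempt++) {
        pthread_mutex_lock(&p->lock);
        if (!p->owner || pthread_mutex_trylock(&p->owner->lock) == 0) {
            return true;
        }
        /* Channel is busy: drop the pvt lock so its holder can make progress. */
        pthread_mutex_unlock(&p->lock);
        sched_yield();
    }

    /* Give up for now and record the request to be handled later. */
    pthread_mutex_lock(&p->lock);
    p->queued++;
    pthread_mutex_unlock(&p->lock);
    return false;
}
```

The back-off is what makes this "a mess": every caller has to cope with the
pvt lock having been released in the middle of the attempt, and with the
request possibly not being handled at all on this pass.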

Queued requests are also handled in the handle_request_do() function. 
Once both a pvt and a channel are locked, the queued requests are handled
first, and then the actual request triggering the handle_request_do
function's invocation is processed... This is where the problem is...

Between process_request_queue() and calling handle_incoming() the channel
may be unlocked.  We could detect this by inspecting the nounlock variable,
which is passed to process_request_queue(), but we do not.  If the channel
remains unlocked entering handle_incoming(), it is very possible a deadlock
will occur, since many of the code paths in handle_incoming() will try to
grab a recursive channel lock using the channel API.  In the core show
deadlocks output I have included, ast_queue_frame causes this to occur.
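To make the missing check concrete, here is a small single-threaded
sketch.  It assumes that nounlock set to non-zero means "the channel lock
has already been released"; the types and the functions
process_queue_sketch() and channel_held_at_handler() are hypothetical and
only model the control flow, they are not the chan_sip functions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical channel with a flag that tracks lock ownership for the demo. */
struct channel {
    pthread_mutex_t lock;
    bool held;
};

static void chan_lock(struct channel *c)   { pthread_mutex_lock(&c->lock); c->held = true; }
static void chan_unlock(struct channel *c) { c->held = false; pthread_mutex_unlock(&c->lock); }

/* Stand-in for queue processing: it may drop the channel lock and says so via *nounlock. */
static void process_queue_sketch(struct channel *c, bool drops_lock, int *nounlock)
{
    if (drops_lock) {
        chan_unlock(c);
        *nounlock = 1;
    }
}

/*
 * Returns whether the channel lock is held at the point where the request
 * handler would run.  With check_nounlock false this models the behavior
 * described in the report; with it true the lock is restored first.
 */
static bool channel_held_at_handler(struct channel *c, bool drops_lock, bool check_nounlock)
{
    int nounlock = 0;
    bool held;

    assert(c != NULL);
    chan_lock(c);
    process_queue_sketch(c, drops_lock, &nounlock);

    if (check_nounlock && nounlock) {
        chan_lock(c);        /* restore the precondition the handler relies on */
        nounlock = 0;
    }

    held = c->held;          /* the request handler would be called here */

    if (!nounlock) {
        chan_unlock(c);
    }
    return held;
}
```

In real code, re-taking the channel lock while the pvt is held would itself
need deadlock avoidance, so this only illustrates the detection step, not a
complete fix.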


I have a setup that reproduces this problem consistently.

====================================================================== 

---------------------------------------------------------------------- 
 (0133869) svnbot (reporter) - 2011-04-18 10:23
 https://issues.asterisk.org/view.php?id=18690#c133869 
---------------------------------------------------------------------- 
Repository: asterisk
Revision: 314067

U   branches/1.8/channels/chan_sip.c

------------------------------------------------------------------------
r314067 | dvossel | 2011-04-18 10:23:46 -0500 (Mon, 18 Apr 2011) | 22 lines

Remove the need for deadlock avoidance in chan_sip do_monitor.

Deadlock avoidance between the sip pvt and the pvt->owner is
very difficult.  Now that channels are ao2 objects, this complication
is no longer necessary.  It turns out the pvt's msg queue only
exists because of deadlock avoidance (when deadlock avoidance failed,
msgs were added to a queue to be processed later), so this goes away as
well.

The technique used in the new sip_lock_pvt_full() function should
be used as a template for replacing all locations where deadlock
avoidance occurs between a channel tech_pvt and the pvt's owner.
My hope is that this will begin a reversal of the invalid channel
driver locking architecture we have been using for so long. 
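For readers without the patch at hand, the general shape of such a
technique might look like the sketch below.  It is written from the
description above and is not the code from r314067: it uses pthread
mutexes and a C11 atomic counter in place of ao2 objects, and the names
lock_pvt_full_sketch(), chan_ref() and chan_unref() are hypothetical.  The
idea is to take a reference on the owner while the pvt is locked, release
the pvt, then lock the channel first and the pvt second, and retry if the
owner changed in the meantime.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical reference-counted channel and pvt (stand-ins for ao2 objects). */
struct channel { pthread_mutex_t lock; atomic_int refs; };
struct pvt     { pthread_mutex_t lock; struct channel *owner; };

static void chan_ref(struct channel *c)   { atomic_fetch_add(&c->refs, 1); }
static void chan_unref(struct channel *c) { atomic_fetch_sub(&c->refs, 1); /* a real object is freed at zero */ }

/*
 * Returns the owner with both the channel and the pvt locked and one extra
 * reference held on the channel, or NULL with only the pvt locked when the
 * pvt has no owner.  No trylock or back-off is involved.
 */
static struct channel *lock_pvt_full_sketch(struct pvt *p)
{
    assert(p != NULL);

    for (;;) {
        struct channel *chan;

        pthread_mutex_lock(&p->lock);
        chan = p->owner;
        if (!chan) {
            return NULL;                 /* no owner: pvt stays locked */
        }
        chan_ref(chan);                  /* keeps chan alive once the pvt is unlocked */
        pthread_mutex_unlock(&p->lock);

        /* Consistent order: channel first, then pvt. */
        pthread_mutex_lock(&chan->lock);
        pthread_mutex_lock(&p->lock);
        if (p->owner == chan) {
            return chan;                 /* both locked, reference held */
        }

        /* The owner changed while nothing was locked; undo and try again. */
        pthread_mutex_unlock(&p->lock);
        pthread_mutex_unlock(&chan->lock);
        chan_unref(chan);
    }
}
```

A caller that gets a non-NULL result would be expected to unlock the pvt
and the channel and then drop the reference when it is done.  The retry
loop should rarely run more than once, since the owner only changes in the
short window where neither lock is held.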

This patch also resolves an issue where the pvt->owner gets
unlocked while processing the msg queue.

(closes issue https://issues.asterisk.org/view.php?id=18690)
Reported by: dvossel

Review: https://reviewboard.asterisk.org/r/1182/

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=314067 

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2011-04-18 10:23 svnbot         Note Added: 0133869                          
======================================================================



