[asterisk-dev] [Code Review] [patch] Use of MASTER_CHANNEL causes a race condition ending in a deadlock among channels

Russell Bryant reviewboard at asterisk.org
Sat Feb 26 07:27:06 CST 2011



> On 2011-02-25 14:43:38, Tilghman Lesher wrote:
> > Could you be a little more explicit in what channel is being "potentially" locked here?  The code, as it exists today, explicitly unlocks the SIP channel and the pvt before engaging in the use of the MASTER_CHANNEL() construct, keeping merely a reference to the channel.
> > 
> > If the problem is within the MASTER_CHANNEL function, shouldn't this change fix that function as well?
> 
> Russell Bryant wrote:
>     It's not potentially locked, it is always locked.  The code in chan_sip unlocks the channel, but then pbx_builtin_setvar_helper() locks it again.  The problem is that this channel lock is held while MASTER_CHANNEL() runs.  No channels can be held at all when acquiring the channels container lock.
>     
>     I worked with Brett on this.  We couldn't come up with any reasonably way to solve this in MASTER_CHANNEL().  In fact, once this patch went in, I was going to propose that MASTER_CHANNEL() be removed completely unless someone can come up with a reasonable solution to this potential deadlock.  The only thing I could come up with is if the real work of MASTER_CHANNEL() was deferred to another thread.  However, doing a setvar operation asynchronously and not having any idea when it actually succeeds seems pretty bad (not to mention the performance impact of such a solution).
> 
> Tilghman Lesher wrote:
>     I hate to disagree with you, but pbx_builtin_setvar_helper() only relocks the channel when the key being set is a variable.  When setting dialplan functions (as in MASTER_CHANNEL), the channel is not locked (see the very first "if" conditional in pbx_builtin_setvar_helper).  So the only time that a channel is locked is at the final stage, when the HASH() function is being set (and even then, only _within_ the HASH() function).
>     
>     I think your diagnosis needs some revision, because as it stands, the explanation given does not match the existing code.

Sure enough.  You're right.  Score another for peer review!

Debugging deadlocks with just a backtrace and not "core show locks" sure is fun ......


- Russell


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/1127/#review3220
-----------------------------------------------------------


On 2011-02-25 13:40:19, Brett Bryant wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/1127/
> -----------------------------------------------------------
> 
> (Updated 2011-02-25 13:40:19)
> 
> 
> Review request for Asterisk Developers.
> 
> 
> Summary
> -------
> 
> Use of MASTER_CHANNEL in chan_sip causes pbx_setvar_helper to potentially lock a channel before iterating through the channel container to find the master channel. This violates locking order, and when two threads are doing this at the same time they can both end up in a deadlock holding the channel locks that the others are trying to acquire through the iterator.
> 
> 
> Diffs
> -----
> 
>   /tags/1.8.3-rc2/channels/chan_sip.c 308543 
> 
> Diff: https://reviewboard.asterisk.org/r/1127/diff
> 
> 
> Testing
> -------
> 
> Tested functionality of code and accuracy before and after change which removed the use of the MASTER_CHANNEL function.
> 
> 
> Thanks,
> 
> Brett
> 
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-dev/attachments/20110226/6c3108b7/attachment-0001.htm>


More information about the asterisk-dev mailing list