<html>
<body>
<div style="font-family: Verdana, Arial, Helvetica, Sans-Serif;">
<table bgcolor="#f9f3c9" width="100%" cellpadding="8" style="border: 1px #c9c399 solid;">
<tr>
<td>
This is an automatically generated e-mail. To reply, visit:
<a href="https://reviewboard.asterisk.org/r/1127/">https://reviewboard.asterisk.org/r/1127/</a>
</td>
</tr>
</table>
<br />
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<p style="margin-top: 0;">On February 25th, 2011, 2:43 p.m., <b>Tilghman Lesher</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Could you be a little more explicit in what channel is being "potentially" locked here? The code, as it exists today, explicitly unlocks the SIP channel and the pvt before engaging in the use of the MASTER_CHANNEL() construct, keeping merely a reference to the channel.
If the problem is within the MASTER_CHANNEL function, shouldn't this change fix that function as well?</pre>
</blockquote>
<p>On February 25th, 2011, 4:49 p.m., <b>Russell Bryant</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">It's not potentially locked, it is always locked. The code in chan_sip unlocks the channel, but then pbx_builtin_setvar_helper() locks it again. The problem is that this channel lock is held while MASTER_CHANNEL() runs. No channels can be held at all when acquiring the channels container lock.
I worked with Brett on this. We couldn't come up with any reasonably way to solve this in MASTER_CHANNEL(). In fact, once this patch went in, I was going to propose that MASTER_CHANNEL() be removed completely unless someone can come up with a reasonable solution to this potential deadlock. The only thing I could come up with is if the real work of MASTER_CHANNEL() was deferred to another thread. However, doing a setvar operation asynchronously and not having any idea when it actually succeeds seems pretty bad (not to mention the performance impact of such a solution).</pre>
</blockquote>
<p>On February 26th, 2011, 2:35 a.m., <b>Tilghman Lesher</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">I hate to disagree with you, but pbx_builtin_setvar_helper() only relocks the channel when the key being set is a variable. When setting dialplan functions (as in MASTER_CHANNEL), the channel is not locked (see the very first "if" conditional in pbx_builtin_setvar_helper). So the only time that a channel is locked is at the final stage, when the HASH() function is being set (and even then, only _within_ the HASH() function).
I think your diagnosis needs some revision, because as it stands, the explanation given does not match the existing code.</pre>
</blockquote>
</blockquote>
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Sure enough. You're right. Score another for peer review!
Debugging deadlocks with just a backtrace and not "core show locks" sure is fun ......</pre>
<br />
<p>- Russell</p>
<br />
<p>On February 25th, 2011, 1:40 p.m., Brett Bryant wrote:</p>
<table bgcolor="#fefadf" width="100%" cellspacing="0" cellpadding="8" style="background-image: url('https://reviewboard.asterisk.orgrb/images/review_request_box_top_bg.png'); background-position: left top; background-repeat: repeat-x; border: 1px black solid;">
<tr>
<td>
<div>Review request for Asterisk Developers.</div>
<div>By Brett Bryant.</div>
<p style="color: grey;"><i>Updated 2011-02-25 13:40:19</i></p>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Description </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: 1px solid #b8b5a0">
<tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Use of MASTER_CHANNEL in chan_sip causes pbx_setvar_helper to potentially lock a channel before iterating through the channel container to find the master channel. This violates locking order, and when two threads are doing this at the same time they can both end up in a deadlock holding the channel locks that the others are trying to acquire through the iterator.</pre>
</td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Testing </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: 1px solid #b8b5a0">
<tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Tested functionality of code and accuracy before and after change which removed the use of the MASTER_CHANNEL function.</pre>
</td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Diffs</b> </h1>
<ul style="margin-left: 3em; padding-left: 0;">
<li>/tags/1.8.3-rc2/channels/chan_sip.c <span style="color: grey">(308543)</span></li>
</ul>
<p><a href="https://reviewboard.asterisk.org/r/1127/diff/" style="margin-left: 3em;">View Diff</a></p>
</td>
</tr>
</table>
</div>
</body>
</html>