<html>
<body>
<div style="font-family: Verdana, Arial, Helvetica, Sans-Serif;">
<table bgcolor="#f9f3c9" width="100%" cellpadding="8" style="border: 1px #c9c399 solid;">
<tr>
<td>
This is an automatically generated e-mail. To reply, visit:
<a href="https://reviewboard.asterisk.org/r/1610/">https://reviewboard.asterisk.org/r/1610/</a>
</td>
</tr>
</table>
<br />
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<p style="margin-top: 0;">On December 6th, 2011, 10:33 a.m., <b>David Vossel</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<table width="100%" border="0" bgcolor="white" style="border: 1px solid #C0C0C0; border-collapse: collapse; margin: 2px padding: 2px;">
<thead>
<tr>
<th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid #C0C0C0; font-size: 9pt; padding: 4px 8px; text-align: left;">
<a href="https://reviewboard.asterisk.org/r/1610/diff/1/?file=22117#file22117line14289" style="color: black; font-weight: bold; text-decoration: underline;">/branches/1.8/channels/chan_sip.c</a>
<span style="font-weight: normal;">
(Diff revision 1)
</span>
</th>
</tr>
</thead>
<tbody style="background-color: #e4d9cb; padding: 4px 8px; text-align: center;">
<tr>
<td colspan="4"><pre style="font-size: 8pt; line-height: 140%; margin: 0; ">static void mwi_event_cb(const struct ast_event *event, void *userdata)</pre></td>
</tr>
</tbody>
<tbody>
<tr>
<th bgcolor="#f0f0f0" style="border-right: 1px solid #C0C0C0;" align="right"><font size="2">14289</font></th>
<td bgcolor="#ffffff" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td>
<th bgcolor="#f0f0f0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid #C0C0C0;" align="right"><font size="2">14289</font></th>
<td bgcolor="#ffffff" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td>
</tr>
</tbody>
<tbody>
<tr>
<th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" align="right"><font size="2"></font></th>
<td bgcolor="#c5ffc4" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td>
<th bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid #C0C0C0;" align="right"><font size="2">14290</font></th>
<td bgcolor="#c5ffc4" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; ">        <span class="n">ref_peer</span><span class="p">(</span><span class="n">peer</span><span class="p">,</span> <span class="s">"Holding peer while sending MWI in event callback</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span></pre></td>
</tr>
</tbody>
<tbody>
<tr>
<th bgcolor="#f0f0f0" style="border-right: 1px solid #C0C0C0;" align="right"><font size="2">14290</font></th>
<td bgcolor="#ffffff" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; ">        <span class="n">sip_send_mwi_to_peer</span><span class="p">(</span><span class="n">peer</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span></pre></td>
<th bgcolor="#f0f0f0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid #C0C0C0;" align="right"><font size="2">14291</font></th>
<td bgcolor="#ffffff" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; ">        <span class="n">sip_send_mwi_to_peer</span><span class="p">(</span><span class="n">peer</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span></pre></td>
</tr>
</tbody>
<tbody>
<tr>
<th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" align="right"><font size="2"></font></th>
<td bgcolor="#c5ffc4" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td>
<th bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid #C0C0C0;" align="right"><font size="2">14292</font></th>
<td bgcolor="#c5ffc4" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; ">        <span class="n">unref_peer</span><span class="p">(</span><span class="n">peer</span><span class="p">,</span> <span class="s">"Releasing peer after sending MWI in event callback</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span></pre></td>
</tr>
</tbody>
<tbody>
<tr>
<th bgcolor="#f0f0f0" style="border-right: 1px solid #C0C0C0;" align="right"><font size="2">14291</font></th>
<td bgcolor="#ffffff" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "><span class="p">}</span></pre></td>
<th bgcolor="#f0f0f0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid #C0C0C0;" align="right"><font size="2">14293</font></th>
<td bgcolor="#ffffff" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "><span class="p">}</span></pre></td>
</tr>
</tbody>
</table>
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">I know we talked about this briefly, but I still have reservations about this.
Just to be clear for historical reasons. The use of ref counting here does nothing to give the event thread ownership of a reference to the peer. If the event thread does not have a reference handed to it at subscription time, then adding the reference here in the callback will not do anything to prevent the peer from being destroyed as it may have already been destroyed before we even add the ref count.
Also, given that the un-subscription to this event occurs during the peer's ao2 destructor callback, we actually run the risk of adding and removing a reference to the peer while it is in the destructor callback... I really don't know what that will do.
>From what I remember, these lines were added for debugging purposes in order to determine if the peer was already destroyed (we'll get the "bad magic number" error when we try to ref it if it is already destroyed). I'm not sure if this gives us anything new though, as the ao2_lock in sip_send_mwi_to_peer() should offer the same debug information.</pre>
</blockquote>
</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">I don't think we run the risk of ref'ing the peer here while it is in the destructor callback (or at least while we're in a potentially very dangerous state). They both compete for the event lock.
1. If the destructor callback wins, then it will unsubscribe from the event prior to mwi_event_cb being called (which is its first action)
2. If the event engine wins, then the destructor callback will block until after mwi_event_cb is completely done.
Your latter statement is correct - this was added to try and catch where the peer went bad on us as soon as possible. I'd also note that we care enough about not having the peer disappear on us when using sip_send_mwi_to_peer in handle_request_subscribe that we ref the peer there (which is also probably unnecessary). Thinking through this some more, this doesn't really add much, and may give a false layer of security that isn't correct - so I agree with blowing it out.
In retrospect, it'd be nice if the event callback didn't have a pointer to the peer in the first place, as that'd keep the ownership semantics a bit clearer :-)</pre>
<br />
<p>- mjordan</p>
<br />
<p>On December 6th, 2011, 10:10 a.m., mjordan wrote:</p>
<table bgcolor="#fefadf" width="100%" cellspacing="0" cellpadding="8" style="background-image: url('https://reviewboard.asterisk.org/media/rb/images/review_request_box_top_bg.png'); background-position: left top; background-repeat: repeat-x; border: 1px black solid;">
<tr>
<td>
<div>Review request for Asterisk Developers, David Vossel and opticron.</div>
<div>By mjordan.</div>
<p style="color: grey;"><i>Updated Dec. 6, 2011, 10:10 a.m.</i></p>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Description </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: 1px solid #b8b5a0">
<tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">ASTERISK-18663 originally manifested as a deadlock when setting 'allowsubscribe=yes', 'callercounter = yes' and setting the subscribecontext in chan_sip. When the deadlock was resolved by r345063, a crash would occur in chan_sip. This would manifest when an MWI notification was to be sent to a peer, but the peer had been deleted due to being dereferenced to a ref count of 0. The root cause of this ended up being the MWI event subscription being resubscribed to in several places, and orphaning the previous event subscription. When an MWI event would occur, all of the event subscriptions (including the orphaned subscriptions) would be notified. This didn't cause any issues until a peer was removed, either by pruning realtime SIP peers, unloading chan_sip, etc. When the peer cleaned itself up, it only removes the subscription that it's aware of - the orphaned subscriptions would continue to exist and, if a new MWI event occurred, would crash Asterisk by referencing the deleted peer.
This patch does several things:
1. It resolves the issue in subscribing to the MWI event callback by first unsubscribing the old event subscription
2. It more aggressively holds the authpeer in handle_request_subscribe and removes some unneeded peer ref'ing / deref'ing. This was done more for clarity, as the previous location of deref'ing the authpeer ignored that the relatedpeer, set to the authpeer, was still used later in the method
3. It fixes a potential bug wherein an authentication result could be positive, but all failures are assumed to be negative</pre>
</td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Testing </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: 1px solid #b8b5a0">
<tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Testing was done extensively using 1.8 and 1.8.8.0-rc4. This included using two SIP phones with BLF and MWI subscriptions, with multiple mailboxes defined for various extensions, and module unloading / reloading chan_sip at various times (both before SUBSCRIBE messages were received and after multiple SUBSCRIBE messages had been recevied). The patch was also confirmed to resolve the issue by the issue reporter.</pre>
</td>
</tr>
</table>
<div style="margin-top: 1.5em;">
<b style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Bugs: </b>
<a href="https://issues.asterisk.org/jira/browse/ASTERISK-18663">ASTERISK-18663</a>
</div>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Diffs</b> </h1>
<ul style="margin-left: 3em; padding-left: 0;">
<li>/branches/1.8/channels/chan_sip.c <span style="color: grey">(347057)</span></li>
<li>/branches/1.8/channels/sip/include/sip.h <span style="color: grey">(347057)</span></li>
</ul>
<p><a href="https://reviewboard.asterisk.org/r/1610/diff/" style="margin-left: 3em;">View Diff</a></p>
</td>
</tr>
</table>
</div>
</body>
</html>