<html>
<body>
<div style="font-family: Verdana, Arial, Helvetica, Sans-Serif;">
<table bgcolor="#f9f3c9" width="100%" cellpadding="8" style="border: 1px #c9c399 solid;">
<tr>
<td>
This is an automatically generated e-mail. To reply, visit:
<a href="https://reviewboard.asterisk.org/r/1444/">https://reviewboard.asterisk.org/r/1444/</a>
</td>
</tr>
</table>
<br />
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<p style="margin-top: 0;">On September 19th, 2011, 12:01 a.m., <b>David Vossel</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<table width="100%" border="0" bgcolor="white" style="border: 1px solid #C0C0C0; border-collapse: collapse; margin: 2px padding: 2px;">
<thead>
<tr>
<th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid #C0C0C0; font-size: 9pt; padding: 4px 8px; text-align: left;">
<a href="https://reviewboard.asterisk.org/r/1444/diff/1/?file=20703#file20703line1052" style="color: black; font-weight: bold; text-decoration: underline;">/branches/1.8/res/res_rtp_asterisk.c</a>
<span style="font-weight: normal;">
(Diff revision 1)
</span>
</th>
</tr>
</thead>
<tbody style="background-color: #e4d9cb; padding: 4px 8px; text-align: center;">
<tr>
<td colspan="4"><pre style="font-size: 8pt; line-height: 140%; margin: 0; ">static int ast_rtcp_write_sr(struct ast_rtp_instance *instance)</pre></td>
</tr>
</tbody>
<tbody>
<tr>
<th bgcolor="#f0f0f0" style="border-right: 1px solid #C0C0C0;" align="right"><font size="2">1046</font></th>
<td bgcolor="#ffffff" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td>
<th bgcolor="#f0f0f0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid #C0C0C0;" align="right"><font size="2">1049</font></th>
<td bgcolor="#ffffff" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td>
</tr>
</tbody>
<tbody>
<tr>
<th bgcolor="#e9eaa8" style="border-right: 1px solid #C0C0C0;" align="right"><font size="2">1047</font></th>
<td bgcolor="#fdfebc" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; ">        if (!rtp || !rtp->rtcp<span class="hl">)</span></pre></td>
<th bgcolor="#e9eaa8" style="border-left: 1px solid #C0C0C0; border-right: 1px solid #C0C0C0;" align="right"><font size="2">1050</font></th>
<td bgcolor="#fdfebc" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; ">        if (!rtp || !rtp->rtcp<span class="hl"> || rtp->rtcp->schedid == -1) {</span></pre></td>
</tr>
</tbody>
<tbody>
<tr>
<th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" align="right"><font size="2"></font></th>
<td bgcolor="#c5ffc4" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td>
<th bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid #C0C0C0;" align="right"><font size="2">1051</font></th>
<td bgcolor="#c5ffc4" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; ">                ao2_ref(instance, -1);</pre></td>
</tr>
</tbody>
<tbody>
<tr>
<th bgcolor="#f0f0f0" style="border-right: 1px solid #C0C0C0;" align="right"><font size="2">1048</font></th>
<td bgcolor="#ffffff" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; ">                return 0;</pre></td>
<th bgcolor="#f0f0f0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid #C0C0C0;" align="right"><font size="2">1052</font></th>
<td bgcolor="#ffffff" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; ">                return 0;</pre></td>
</tr>
</tbody>
<tbody>
<tr>
<th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" align="right"><font size="2"></font></th>
<td bgcolor="#c5ffc4" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td>
<th bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid #C0C0C0;" align="right"><font size="2">1053</font></th>
<td bgcolor="#c5ffc4" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; ">        }</pre></td>
</tr>
</tbody>
<tbody>
<tr>
<th bgcolor="#f0f0f0" style="border-right: 1px solid #C0C0C0;" align="right"><font size="2">1049</font></th>
<td bgcolor="#ffffff" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td>
<th bgcolor="#f0f0f0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid #C0C0C0;" align="right"><font size="2">1054</font></th>
<td bgcolor="#ffffff" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td>
</tr>
</tbody>
</table>
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Why are we checking schedid == -1 here. This implies that the thread removing the scheduled item is not the same as the thread executing the scheduled event. If this is the case, how are we guaranteed schedid won't get set to -1 after this check occurs while still in the callback? Would this cause a ref leak or anything?</pre>
</blockquote>
</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Excellent observations. You are correct that it is not guaranteed that the thread canceling the scheduled item is not the same one executing the scheduler. Specifically, the SIP monitor thread is executing the scheduler and it is usually a channel thread canceling the scheduled item. (sip_hangup -> stop_media_flows -> ast_rtp_instance_stop).
There will not be a ref leak. The reference is only released when either the scheduled item is successfully canceled or if the scheduler callback is returning 0 indicating that it is not going to run again.
The other question, how are we guaranteed schedid won't get set to -1 after the check occurs, is more complicated. The direct answer is we're not guaranteed. The next question would be, is it harmful that there is no synchronization here? I went back and forth with myself a bit on whether or not locking should be added here. I obviously didn't want to add it unless it was absolutely necessary and ended up going without it.
If schedid gets set to -1 (because the RTP instance got stopped) while this function is running, it's just going to do its thing like usual and transmit one last RTCP packet. That doesn't seem harmful. It's just going to catch the -1 the next time it runs and then bail out and unref the RTP instance. So, it all works out in the end. It's a little less pretty, but at least no more locks get added.
As I wrote this out, the one down side I thought of is that if this edge case gets hit, it would delay the destruction of the RTP instance until the next RTCP packet is scheduled to be sent. By default, this is every 5 seconds, but it's configurable. I'm ok with that personally ...
Now, for the lulz, here is some code in the handling of the rtcpinterval configuraiton option in res_rtp_asterisk.c:
if (rtcpinterval == 0)
rtcpinterval = 0; /* Just so we're clear... it's zero */
</pre>
<br />
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<p style="margin-top: 0;">On September 19th, 2011, 12:01 a.m., <b>David Vossel</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<table width="100%" border="0" bgcolor="white" style="border: 1px solid #C0C0C0; border-collapse: collapse; margin: 2px padding: 2px;">
<thead>
<tr>
<th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid #C0C0C0; font-size: 9pt; padding: 4px 8px; text-align: left;">
<a href="https://reviewboard.asterisk.org/r/1444/diff/1/?file=20703#file20703line2621" style="color: black; font-weight: bold; text-decoration: underline;">/branches/1.8/res/res_rtp_asterisk.c</a>
<span style="font-weight: normal;">
(Diff revision 1)
</span>
</th>
</tr>
</thead>
<tbody style="background-color: #e4d9cb; padding: 4px 8px; text-align: center;">
<tr>
<td colspan="4"><pre style="font-size: 8pt; line-height: 140%; margin: 0; ">static void ast_rtp_stun_request(struct ast_rtp_instance *instance, struct ast_sockaddr *suggestion, const char *username)</pre></td>
</tr>
</tbody>
<tbody>
<tr>
<th bgcolor="#e9eaa8" style="border-right: 1px solid #C0C0C0;" align="right"><font size="2">2594</font></th>
<td bgcolor="#fdfebc" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; ">        if (rtp->rtcp) {</pre></td>
<th bgcolor="#e9eaa8" style="border-left: 1px solid #C0C0C0; border-right: 1px solid #C0C0C0;" align="right"><font size="2">2618</font></th>
<td bgcolor="#fdfebc" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; ">        if (rtp->rtcp && rtp->rtcp->schedid > 0) {</pre></td>
</tr>
<tr>
<th bgcolor="#e9eaa8" style="border-right: 1px solid #C0C0C0;" align="right"><font size="2">2595</font></th>
<td bgcolor="#fdfebc" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "><span class="hl">                AST_SCHED_DEL</span>(rtp->sched, rtp->rtcp->schedid)<span class="hl">;</span></pre></td>
<th bgcolor="#e9eaa8" style="border-left: 1px solid #C0C0C0; border-right: 1px solid #C0C0C0;" align="right"><font size="2">2619</font></th>
<td bgcolor="#fdfebc" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "><span class="hl">                if (!ast_sched_del</span>(rtp->sched, rtp->rtcp->schedid)<span class="hl">) {</span></pre></td>
</tr>
</tbody>
<tbody>
<tr>
<th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" align="right"><font size="2"></font></th>
<td bgcolor="#c5ffc4" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td>
<th bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid #C0C0C0;" align="right"><font size="2">2620</font></th>
<td bgcolor="#c5ffc4" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; ">                        /* successfully cancelled scheduler entry. */</pre></td>
</tr>
<tr>
<th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" align="right"><font size="2"></font></th>
<td bgcolor="#c5ffc4" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td>
<th bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid #C0C0C0;" align="right"><font size="2">2621</font></th>
<td bgcolor="#c5ffc4" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; ">                        ao2_ref(instance, -1);</pre></td>
</tr>
</tbody>
<tbody>
<tr>
<th bgcolor="#f0f0f0" style="border-right: 1px solid #C0C0C0;" align="right"><font size="2">2596</font></th>
<td bgcolor="#ffffff" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; ">        }</pre></td>
<th bgcolor="#f0f0f0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid #C0C0C0;" align="right"><font size="2">2622</font></th>
<td bgcolor="#ffffff" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; ">                }</pre></td>
</tr>
</tbody>
<tbody>
<tr>
<th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" align="right"><font size="2"></font></th>
<td bgcolor="#c5ffc4" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td>
<th bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid #C0C0C0;" align="right"><font size="2">2623</font></th>
<td bgcolor="#c5ffc4" width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; ">                rtp->rtcp->schedid = -1;</pre></td>
</tr>
</tbody>
</table>
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">There are macros for handling the ref counting of stuff associated with a scheduled event.</pre>
</blockquote>
</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Yeah ... I avoided it because AST_SCHED_DEL_UNREF() is even more evil than AST_SCHED_DEL(). It does the "loop 10 times" thing and also makes some assumptions about how the sched ID variable is handled and when it gets set to -1. I would rather either leave it this way or start writing a new set of helper macros that are less evil.</pre>
<br />
<p>- Russell</p>
<br />
<p>On September 18th, 2011, 9:20 p.m., Russell Bryant wrote:</p>
<table bgcolor="#fefadf" width="100%" cellspacing="0" cellpadding="8" style="background-image: url('https://reviewboard.asterisk.org/media/rb/images/review_request_box_top_bg.png'); background-position: left top; background-repeat: repeat-x; border: 1px black solid;">
<tr>
<td>
<div>Review request for Asterisk Developers.</div>
<div>By Russell Bryant.</div>
<p style="color: grey;"><i>Updated Sept. 18, 2011, 9:20 p.m.</i></p>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Description </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: 1px solid #b8b5a0">
<tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">I opened ASTERISK-18570 for this issue. The rest of the issues are ones that I have found so far that appear to be the same problem.
This patch addresses crashes related to RTCP handling. The backtraces just show a crash in ast_rtcp_write() where it appears that the RTP instance is no longer valid. There is a race condition with scheduled RTCP transmissions and the destruction of the RTP instance. This patch utilizes the fact that ast_rtp_instance is a reference counted object and ensures that it will not get destroyed while a reference is still around due to scheduled RTCP transmissions.
RTCP transmissions are scheduled and executed from the chan_sip scheduler context. This scheduler context is processed in the SIP monitor thread. The destruction of an RTP instance occurs when the associated sip_pvt gets destroyed (which happens when the sip_pvt reference count reaches 0). However, the SIP monitor thread is not the only thread that can cause a sip_pvt to get destroyed. The sip_hangup function, executed from a channel thread, also decrements the reference count on a sip_pvt and could cause it to get destroyed.
While this is being changed anyway, the patch also removes calling ast_sched_del() from within the RTCP scheduler callback. It's not helpful. Simply returning 0 prevents the callback from being rescheduled.</pre>
</td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Testing </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: 1px solid #b8b5a0">
<tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">This patch has been applied to a test environment with a couple of servers running Asterisk 1.8.7.0-rc1. The servers have processed over 1 million calls without hitting this crash.</pre>
</td>
</tr>
</table>
<div style="margin-top: 1.5em;">
<b style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Bugs: </b>
<a href="https://issues.asterisk.org/jira/browse/ASTERISK-13334">ASTERISK-13334</a>,
<a href="https://issues.asterisk.org/jira/browse/ASTERISK-15257">ASTERISK-15257</a>,
<a href="https://issues.asterisk.org/jira/browse/ASTERISK-15406">ASTERISK-15406</a>,
<a href="https://issues.asterisk.org/jira/browse/ASTERISK-17560">ASTERISK-17560</a>,
<a href="https://issues.asterisk.org/jira/browse/ASTERISK-18570">ASTERISK-18570</a>,
<a href="https://issues.asterisk.org/jira/browse/ASTERISK-9716">ASTERISK-9716</a>,
<a href="https://issues.asterisk.org/jira/browse/ASTERISK-9977">ASTERISK-9977</a>
</div>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Diffs</b> </h1>
<ul style="margin-left: 3em; padding-left: 0;">
<li>/branches/1.8/res/res_rtp_asterisk.c <span style="color: grey">(335789)</span></li>
</ul>
<p><a href="https://reviewboard.asterisk.org/r/1444/diff/" style="margin-left: 3em;">View Diff</a></p>
</td>
</tr>
</table>
</div>
</body>
</html>