[asterisk-dev] [Code Review] Fix deadlock between subscription event RWLOCK and dialogs container lock in chan_sip.

David Vossel reviewboard at asterisk.org
Tue Nov 8 11:50:52 CST 2011


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/1557/#review4729
-----------------------------------------------------------



/branches/10/channels/chan_sip.c
<https://reviewboard.asterisk.org/r/1557/#comment8911>

    How would it be possible for something to be linked twice into the needdestroy container?  The needdestroy variable should never be set outside of this function, and should definitely never have the potential to be turned off after being set. If this is the case, that is the problem that should be addressed... I see that we do that in sip_hangup.  That is wrong.  We can never guarantee that setting the pvt->needdestroy variable back to 0 after it is already set will stop anything. That is a nasty race condition that no logic should ever depend on.



/branches/10/channels/chan_sip.c
<https://reviewboard.asterisk.org/r/1557/#comment8912>

    Is p locked during these links/unlinks.  If so this is a problem.


- David


On Nov. 8, 2011, 11:19 a.m., rmudgett wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/1557/
> -----------------------------------------------------------
> 
> (Updated Nov. 8, 2011, 11:19 a.m.)
> 
> 
> Review request for Asterisk Developers, David Vossel and schmidts.
> 
> 
> Summary
> -------
> 
> Timing between dialog destruction and a MWI event sending a message could result in a deadlock.
> 
> Order of events causing deadlock:
> 
> 1a) The event subscription system calls the registered callbacks with its list RWLOCK held.
> 1b) The SIP monitor checks for dialogs needing destruction.  It does an ao2_callback that holds the dialogs container lock while searching for dialogs to destroy.
> 2a) The event subscription SIP callback needs to create a temporary dialog to send out the MWI notification.  That temporary dialog needs to be inserted in the dialogs container so it must wait.
> 2b) The dialog search finds a dialog to destroy and as a result releases the last reference for a peer.  The peer destructor attempts to get the subscription RWLOCK but must wait.
> 3) deadlock
> 
> 
> Residual changes for Asterisk v10 branch after https://reviewboard.asterisk.org/r/1564/ commit
> and associated dialogs callid hash key change fix.
> 
> NOTE THE CHANGE FROM v1.8 to v10 BRANCH.
> 
> * Make check_rtp_timeout() return CMP_MATCH if need to delete dialog from
> dialogs_rtpcheck.  This is an optimization to avoid an unneeded
> lock/unlock and object search when using ao2_unlink.
> 
> * Prevent crash in check_rtp_timeout() if dialog->rtp is NULL.
> 
> * Make pvt_set_needdestroy() protect from possible double entries in
> dialogs_needdestroy.
> 
> Schmidts please note that change_callid_pvt() is different between v1.8
> and v10 for your unleas-the-beast branch.
> 
> 
> Diffs
> -----
> 
>   /branches/10/channels/chan_sip.c 343850 
> 
> Diff: https://reviewboard.asterisk.org/r/1557/diff
> 
> 
> Testing
> -------
> 
> It compiles. :)
> 
> 
> Thanks,
> 
> rmudgett
> 
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-dev/attachments/20111108/ef1dab9e/attachment-0001.htm>


More information about the asterisk-dev mailing list