[asterisk-dev] [Code Review]: Fix deadlock between subscription event RWLOCK and dialogs container lock in chan_sip.
David Vossel
reviewboard at asterisk.org
Tue Nov 8 14:29:43 CST 2011
> On Nov. 8, 2011, 11:50 a.m., David Vossel wrote:
> > /branches/10/channels/chan_sip.c, lines 11738-11746
> > <https://reviewboard.asterisk.org/r/1557/diff/5/?file=21655#file21655line11738>
> >
> > Is p locked during these links/unlinks. If so this is a problem.
>
> rmudgett wrote:
> I see what you mean. Deadlock between dialogs_rtpcheck container lock and the dialog p lock. Actually, p is not always locked here which is itself not good.
>
> The dialogs_rtpcheck container does not care about the p lock. The use of the dialogs_rtpcheck container is as a subset of the dialogs container. There are two places where locking order could have been a problem:
> 1) In dialog_checkrtp_cb() but that function does a trylock instead of a lock.
> 2) In change_callid_pvt().
>
> Looking at the change_callid_pvt() function will potentially violate locking order between dialogs and dialog/pvt. It is sometimes called with pvt locked. If it is not called with pvt locked then it is changing the dialog callid without the lock. The change_callid_pvt() routine was created from repeated code sequences so it would not be a new deadlock potential.
>
> Locking in chan_sip seems to be really messy. :)
I see why we get away with it. We do a try lock in check rtp callback on the sip pvt while the dialogs_rtpcheck container is locked.
- David
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/1557/#review4729
-----------------------------------------------------------
On Nov. 8, 2011, 1:42 p.m., rmudgett wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/1557/
> -----------------------------------------------------------
>
> (Updated Nov. 8, 2011, 1:42 p.m.)
>
>
> Review request for Asterisk Developers, David Vossel and schmidts.
>
>
> Summary
> -------
>
> Timing between dialog destruction and a MWI event sending a message could result in a deadlock.
>
> Order of events causing deadlock:
>
> 1a) The event subscription system calls the registered callbacks with its list RWLOCK held.
> 1b) The SIP monitor checks for dialogs needing destruction. It does an ao2_callback that holds the dialogs container lock while searching for dialogs to destroy.
> 2a) The event subscription SIP callback needs to create a temporary dialog to send out the MWI notification. That temporary dialog needs to be inserted in the dialogs container so it must wait.
> 2b) The dialog search finds a dialog to destroy and as a result releases the last reference for a peer. The peer destructor attempts to get the subscription RWLOCK but must wait.
> 3) deadlock
>
>
> Residual changes for Asterisk v10 branch after https://reviewboard.asterisk.org/r/1564/ commit
> and associated dialogs callid hash key change fix.
>
> NOTE THE CHANGE FROM v1.8 to v10 BRANCH.
>
> * Make check_rtp_timeout() return CMP_MATCH if need to delete dialog from
> dialogs_rtpcheck. This is an optimization to avoid an unneeded
> lock/unlock and object search when using ao2_unlink.
>
> * Prevent crash in check_rtp_timeout() if dialog->rtp is NULL.
>
> * Make pvt_set_needdestroy() protect from possible double entries in
> dialogs_needdestroy.
>
> Schmidts please note that change_callid_pvt() is different between v1.8
> and v10 for your unleas-the-beast branch.
>
>
> Diffs
> -----
>
> /branches/10/channels/chan_sip.c 343951
>
> Diff: https://reviewboard.asterisk.org/r/1557/diff
>
>
> Testing
> -------
>
> It compiles. :)
>
>
> Thanks,
>
> rmudgett
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-dev/attachments/20111108/0b02b076/attachment-0001.htm>
More information about the asterisk-dev
mailing list