[asterisk-dev] [Code Review] Fix deadlock between subscription event RWLOCK and dialogs container lock in chan_sip.

rmudgett reviewboard at asterisk.org
Mon Nov 7 16:39:18 CST 2011

(Updated Nov. 7, 2011, 4:39 p.m.)

Review request for Asterisk Developers, David Vossel and schmidts.

Summary (updated)

Timing between dialog destruction and an MWI event sending a message could result in a deadlock.

Order of events causing deadlock:

1a) The event subscription system calls the registered callbacks with its list RWLOCK held.
1b) The SIP monitor checks for dialogs needing destruction.  It does an ao2_callback that holds the dialogs container lock while searching for dialogs to destroy.
2a) The event subscription SIP callback needs to create a temporary dialog to send out the MWI notification.  That temporary dialog needs to be inserted into the dialogs container, so the callback must wait for the container lock.
2b) The dialog search finds a dialog to destroy and as a result releases the last reference for a peer.  The peer destructor attempts to get the subscription RWLOCK but must wait.
3) deadlock
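
The lock ordering in steps 1a through 2b can be reproduced outside Asterisk with two plain pthread locks. The following is only a sketch, not chan_sip code: subs_lock, dialogs_lock, event_thread(), monitor_thread() and simulate_lock_inversion() are hypothetical names, and it assumes the event dispatch holds the read side of the RWLOCK while the peer destructor needs the write side. It uses try-lock calls in step 2 so that it reports the conflict instead of actually hanging.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for pthread barriers and rwlocks */
#endif
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for the two locks named in the summary. */
static pthread_rwlock_t subs_lock = PTHREAD_RWLOCK_INITIALIZER;  /* subscription list RWLOCK */
static pthread_mutex_t dialogs_lock = PTHREAD_MUTEX_INITIALIZER; /* dialogs container lock */
static pthread_barrier_t step;  /* lines the two threads up between steps */

/* Steps 1a and 2a: event dispatch, then the SIP callback wants the container. */
static void *event_thread(void *arg)
{
    int *would_block = arg;

    pthread_rwlock_rdlock(&subs_lock);   /* 1a: callbacks run with the RWLOCK held */
    pthread_barrier_wait(&step);         /* both first locks are now held */
    /* 2a: a real lock call would wait here; trylock only reports it */
    *would_block = (pthread_mutex_trylock(&dialogs_lock) == EBUSY);
    pthread_barrier_wait(&step);         /* hold until the other side has probed */
    if (!*would_block) {
        pthread_mutex_unlock(&dialogs_lock);
    }
    pthread_rwlock_unlock(&subs_lock);
    return NULL;
}

/* Steps 1b and 2b: dialog search, then the peer destructor wants the RWLOCK. */
static void *monitor_thread(void *arg)
{
    int *would_block = arg;

    pthread_mutex_lock(&dialogs_lock);   /* 1b: container locked during the search */
    pthread_barrier_wait(&step);
    /* 2b: assumed to need the write side of the RWLOCK */
    *would_block = (pthread_rwlock_trywrlock(&subs_lock) == EBUSY);
    pthread_barrier_wait(&step);
    if (!*would_block) {
        pthread_rwlock_unlock(&subs_lock);
    }
    pthread_mutex_unlock(&dialogs_lock);
    return NULL;
}

/* Returns how many of the two threads would have blocked in step 2. */
int simulate_lock_inversion(void)
{
    pthread_t a, b;
    int a_blocked = 0, b_blocked = 0;

    pthread_barrier_init(&step, NULL, 2);
    pthread_create(&a, NULL, event_thread, &a_blocked);
    pthread_create(&b, NULL, monitor_thread, &b_blocked);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    pthread_barrier_destroy(&step);
    return a_blocked + b_blocked;
}
```

When both threads report that they would block, each is waiting on a lock the other holds, which is step 3. With blocking lock calls in place of the try-lock calls, neither thread would ever proceed.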

Residual changes for the Asterisk v10 branch after the https://reviewboard.asterisk.org/r/1564/ commit
and the associated dialogs callid hash key change fix.


* Make check_rtp_timeout() return CMP_MATCH if the dialog needs to be deleted
from dialogs_rtpcheck.  This is an optimization to avoid an unneeded
lock/unlock and object search when using ao2_unlink.

* Prevent crash in check_rtp_timeout() if dialog->rtp is NULL.

* Make pvt_set_needdestroy() protect from possible double entries in

Schmidts, please note that change_callid_pvt() is different between v1.8
and v10 for your unleash-the-beast branch.
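
The check_rtp_timeout() items above can be illustrated with a toy container. This is a minimal sketch and does not use the astobj2 API: struct container, container_callback_unlink(), check_rtp_timeout_sketch() and demo_rtpcheck() are hypothetical, and CMP_MATCH is redefined locally. It also assumes that a dialog with no RTP is simply dropped from the check list, which the summary does not state. The idea shown is that the traversal already holds the element in hand, so returning a match flag lets it unlink in place rather than making a second locked search.

```c
#include <stddef.h>

#define CMP_MATCH 0x1  /* local to this sketch; only the name mirrors astobj2 */

struct rtp_stub {
    int timed_out;             /* nonzero once the RTP timeout has expired */
};

struct dialog {
    struct rtp_stub *rtp;      /* may be NULL */
    struct dialog *next;
};

struct container {
    struct dialog *head;
    int count;
};

/* Callback: return CMP_MATCH when the dialog should leave the check list. */
static int check_rtp_timeout_sketch(struct dialog *d)
{
    if (!d->rtp) {
        /* Guard against a NULL rtp pointer. Dropping the dialog from the
         * list in this case is an assumption of this sketch. */
        return CMP_MATCH;
    }
    return d->rtp->timed_out ? CMP_MATCH : 0;
}

/* Single traversal that unlinks every element the callback matches, so no
 * separate search-and-unlink pass is needed afterwards. */
static int container_callback_unlink(struct container *c,
                                     int (*cb)(struct dialog *))
{
    struct dialog **link = &c->head;
    int removed = 0;

    while (*link) {
        struct dialog *cur = *link;

        if (cb(cur) & CMP_MATCH) {
            *link = cur->next;   /* unlink in place */
            cur->next = NULL;
            c->count--;
            removed++;
        } else {
            link = &cur->next;
        }
    }
    return removed;
}

/* Builds three dialogs (active, timed out, no RTP); returns how many were
 * unlinked by one pass. */
int demo_rtpcheck(void)
{
    struct rtp_stub active = { 0 };
    struct rtp_stub expired = { 1 };
    struct dialog d3 = { NULL, NULL };
    struct dialog d2 = { &expired, &d3 };
    struct dialog d1 = { &active, &d2 };
    struct container c = { &d1, 3 };

    return container_callback_unlink(&c, check_rtp_timeout_sketch);
}
```

In this sketch only the dialog with live RTP stays in the list after the pass; the timed-out dialog and the one without RTP are both unlinked by the traversal itself.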

Diffs (updated)

  /branches/10/channels/chan_sip.c 343715 

