[asterisk-dev] [Code Review] Resolve crash from orphaned MWI subscriptions

David Vossel reviewboard at asterisk.org
Tue Dec 6 10:33:40 CST 2011


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/1610/#review4929
-----------------------------------------------------------

Ship it!


My "Ship it!" means that I have reviewed all of the code and agree with every change made here except for the few lines I commented on.  If you decide to take those lines out, I'm good with this going in.  Otherwise, we need further discussion so I can better understand the purpose those lines serve.

Great work!  This was a complex one :)


/branches/1.8/channels/chan_sip.c
<https://reviewboard.asterisk.org/r/1610/#comment9195>

    I know we talked about this briefly, but I still have reservations about this.
    
    Just to be clear, for the historical record: the use of ref counting here does nothing to give the event thread ownership of a reference to the peer.  If the event thread is not handed a reference at subscription time, then adding a reference here in the callback will not prevent the peer from being destroyed, because the peer may already have been destroyed before we even bump the ref count.
    
    Also, given that the un-subscription to this event occurs during the peer's ao2 destructor callback, we actually run the risk of adding and removing a reference to the peer while it is in the destructor callback... I really don't know what that will do.
    
    From what I remember, these lines were added for debugging purposes in order to determine if the peer was already destroyed (we'll get the "bad magic number" error when we try to ref it if it is already destroyed).  I'm not sure if this gives us anything new though, as the ao2_lock in sip_send_mwi_to_peer() should offer the same debug information.
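
To make the ownership point above concrete, here is a minimal sketch using a toy ref-counted object.  Every name in it (toy_peer, toy_ref, toy_sub, and so on) is hypothetical and stands in for the real structures; none of it is the astobj2 or event API, and the "destroyed" flag replaces an actual free so the sketch can inspect the object safely.  It only illustrates why a ref taken inside the callback comes too late unless a reference was handed over at subscription time.

```c
/* Hypothetical stand-in for a ref-counted peer; NOT the real astobj2 API. */
struct toy_peer {
    int refcount;
    int destroyed;  /* set instead of freeing, so the sketch can look at it */
};

/* Adjust the ref count; the object is "destroyed" when it reaches zero. */
static void toy_ref(struct toy_peer *p, int delta)
{
    p->refcount += delta;
    if (p->refcount == 0) {
        p->destroyed = 1;
    }
}

/* Hypothetical subscription that carries the peer as callback userdata. */
struct toy_sub {
    struct toy_peer *userdata;
};

/* The subscription is handed its own reference at subscription time. */
static struct toy_sub toy_subscribe_owning(struct toy_peer *p)
{
    toy_ref(p, +1);
    return (struct toy_sub){ .userdata = p };
}

/* The subscription only borrows the pointer; no reference is handed over. */
static struct toy_sub toy_subscribe_borrowing(struct toy_peer *p)
{
    return (struct toy_sub){ .userdata = p };
}

/*
 * Event callback that refs the peer on entry and unrefs it on exit.
 * Returns 1 if the peer was still usable, 0 if it was already destroyed
 * (which in real code would be a use-after-free, not a clean return).
 */
static int toy_event_cb(struct toy_sub *sub)
{
    struct toy_peer *p = sub->userdata;

    if (p->destroyed) {
        return 0;  /* the ref below would come too late */
    }
    toy_ref(p, +1);
    /* ... send the notification here ... */
    toy_ref(p, -1);
    return 1;
}

/* Scenario: the creator drops its reference, and then an event fires. */
static int toy_fire_after_release(int handoff_at_subscribe)
{
    struct toy_peer peer = { .refcount = 1, .destroyed = 0 };
    struct toy_sub sub = handoff_at_subscribe
        ? toy_subscribe_owning(&peer)
        : toy_subscribe_borrowing(&peer);

    toy_ref(&peer, -1);  /* the creator releases its reference */
    return toy_event_cb(&sub);
}
```

Calling toy_fire_after_release(0) models the borrowed pointer: the peer is gone before the callback gets a chance to ref it.  Calling toy_fire_after_release(1) models the hand-off at subscription time.  This is not a suggestion that chan_sip should hold such a reference; since the un-subscription happens in the peer's destructor, an owning subscription would keep the peer alive until something else unsubscribed it.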


- David


On Dec. 6, 2011, 10:10 a.m., mjordan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/1610/
> -----------------------------------------------------------
> 
> (Updated Dec. 6, 2011, 10:10 a.m.)
> 
> 
> Review request for Asterisk Developers, David Vossel and opticron.
> 
> 
> Summary
> -------
> 
> ASTERISK-18663 originally manifested as a deadlock when setting 'allowsubscribe=yes', 'callercounter = yes' and setting the subscribecontext in chan_sip.  Once the deadlock was resolved by r345063, a crash would occur in chan_sip instead.  The crash happened when an MWI notification was to be sent to a peer that had already been destroyed because its ref count had dropped to 0.  The root cause turned out to be that the MWI event subscription was re-created in several places, each time orphaning the previous event subscription.  When an MWI event occurred, all of the event subscriptions (including the orphaned ones) were notified.  This didn't cause any issues until a peer was removed, whether by pruning realtime SIP peers, unloading chan_sip, etc.  When the peer cleaned itself up, it removed only the subscription it was aware of; the orphaned subscriptions continued to exist and, if a new MWI event occurred, would crash Asterisk by referencing the deleted peer.
> 
> This patch does several things:
> 1. It resolves the issue in subscribing to the MWI event callback by first unsubscribing the old event subscription
> 2. It more aggressively holds the authpeer in handle_request_subscribe and removes some unneeded peer ref'ing / deref'ing.  This was done more for clarity, as the previous location of deref'ing the authpeer ignored that the relatedpeer, set to the authpeer, was still used later in the method
> 3. It fixes a potential bug wherein an authentication result could be positive, but all failures are assumed to be negative
> 
> 
> This addresses bug ASTERISK-18663.
>     https://issues.asterisk.org/jira/browse/ASTERISK-18663
> 
> 
> Diffs
> -----
> 
>   /branches/1.8/channels/chan_sip.c 347057 
>   /branches/1.8/channels/sip/include/sip.h 347057 
> 
> Diff: https://reviewboard.asterisk.org/r/1610/diff
> 
> 
> Testing
> -------
> 
> Testing was done extensively using 1.8 and 1.8.8.0-rc4.  This included using two SIP phones with BLF and MWI subscriptions, with multiple mailboxes defined for various extensions, and unloading / reloading chan_sip at various times (both before SUBSCRIBE messages were received and after multiple SUBSCRIBE messages had been received).  The patch was also confirmed to resolve the issue by the issue reporter.
> 
> 
> Thanks,
> 
> mjordan
> 
>
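
The orphaned-subscription failure described in the quoted summary, and the "unsubscribe the old subscription first" fix from item 1, can be sketched with a toy subscription table.  This is only an illustration under stated assumptions: toy_bus, toy_mwi_peer and the toy_* functions are hypothetical names, not the Asterisk event API, and a dangling entry is merely counted here where real code would dereference freed memory.

```c
#include <stddef.h>

#define TOY_MAX_SUBS 8

/* Hypothetical peer that remembers only its most recent subscription. */
struct toy_mwi_peer {
    int alive;
    int mwi_sub;  /* slot of the subscription the peer knows about, or -1 */
};

/* Hypothetical event bus: each used slot points at the subscribed peer. */
struct toy_bus {
    struct toy_mwi_peer *subs[TOY_MAX_SUBS];
};

/* Register the peer in the first free slot; returns the slot or -1. */
static int toy_subscribe(struct toy_bus *bus, struct toy_mwi_peer *peer)
{
    for (int i = 0; i < TOY_MAX_SUBS; i++) {
        if (bus->subs[i] == NULL) {
            bus->subs[i] = peer;
            return i;
        }
    }
    return -1;
}

static void toy_unsubscribe(struct toy_bus *bus, int slot)
{
    bus->subs[slot] = NULL;
}

/* Buggy pattern: the old handle is overwritten, orphaning its slot. */
static void toy_resubscribe_leaky(struct toy_bus *bus, struct toy_mwi_peer *peer)
{
    peer->mwi_sub = toy_subscribe(bus, peer);
}

/* Fixed pattern: drop the old subscription before creating a new one. */
static void toy_resubscribe_clean(struct toy_bus *bus, struct toy_mwi_peer *peer)
{
    if (peer->mwi_sub >= 0) {
        toy_unsubscribe(bus, peer->mwi_sub);
    }
    peer->mwi_sub = toy_subscribe(bus, peer);
}

/* Teardown removes only the one subscription the peer is aware of. */
static void toy_peer_destroy(struct toy_bus *bus, struct toy_mwi_peer *peer)
{
    if (peer->mwi_sub >= 0) {
        toy_unsubscribe(bus, peer->mwi_sub);
        peer->mwi_sub = -1;
    }
    peer->alive = 0;
}

/* Count subscriptions that still point at a dead peer. */
static int toy_count_dangling(const struct toy_bus *bus)
{
    int dangling = 0;

    for (int i = 0; i < TOY_MAX_SUBS; i++) {
        if (bus->subs[i] != NULL && !bus->subs[i]->alive) {
            dangling++;
        }
    }
    return dangling;
}

/* Scenario: subscribe three times, destroy the peer, count what is left. */
static int toy_dangling_after_teardown(int use_clean_resubscribe)
{
    struct toy_bus bus = { { NULL } };
    struct toy_mwi_peer peer = { .alive = 1, .mwi_sub = -1 };

    for (int i = 0; i < 3; i++) {
        if (use_clean_resubscribe) {
            toy_resubscribe_clean(&bus, &peer);
        } else {
            toy_resubscribe_leaky(&bus, &peer);
        }
    }
    toy_peer_destroy(&bus, &peer);
    return toy_count_dangling(&bus);
}
```

With the leaky variant, toy_dangling_after_teardown(0) leaves behind the two earlier subscriptions that the peer lost track of; with the clean variant, toy_dangling_after_teardown(1) leaves none.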
