[asterisk-dev] [Code Review] BLF Subscriptions Causes SIP Deadlock

mjordan reviewboard at asterisk.org
Fri Nov 4 12:35:29 CDT 2011


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/1563/#review4687
-----------------------------------------------------------



trunk/channels/chan_sip.c
<https://reviewboard.asterisk.org/r/1563/#comment8877>

    For (somewhat) consistency, use ref_peer / unref_peer and provide a tag with what is occurring when the peer count is increased.  It'll help with debugging if the ref counts get off.
    
    I still have a sneaky suspicion that there's a highly unlikely and finicky race condition in here, where a peer could get ref'd down to 0 on some other thread, and be past the current_value checks in ao2obj but before clear_peer_mailbox in its destructor, and an MWI event (which is on a separate thread) could fire and raise the ref count here.  In no place is the object locked.  Raising the ref count would put current_value back up to 1, so it would look like the object is still okay, but meanwhile it starts getting destroyed.  This is pretty unlikely since peers don't go away very often and you'd have to hit the instructions on both threads at just the right point, but still possible, particularly in a SIP reload.
    
    I'm not sure what the solution for that would be however.  We can't just increase the ref count on the peer when you subscribe to MWI as that would mean that the peer would never go away, since the peer destruction is responsible for clearing the subscription.  It would have probably been safer to cache the peer's identifier and retrieve the peer in the callback from the thread safe container, rather then storing a pointer to the peer in the first place.
    
    This is probably something that can't be fixed by this patch however.



trunk/channels/chan_sip.c
<https://reviewboard.asterisk.org/r/1563/#comment8879>

    This isn't in my 1.8 version of chan_sip.  There is something similar to this in 10 however - what version are you working off of?
    
    This is probably going to spark a discussion on the locking order again, but when I thought about this for awhile, I got confused as to why we have to unlock / lock p here, since it isn't used by sip_send_mwi_to_peer?  I can see it needing to be unlocked if the method had to alter peer->call, but I can't find anywhere that occurs.  Now, if p has been set to be peer->mwipvt then we'd probably have to unlock it, but in that case it'd be clearer to explicitly unlock that object.  I don't think p == peer->mwipvt here however.
    
    There may be a good reason for this, but I would have thought we only need to worry about the unlocking / locking of the sip_pvt if we are going to affect it somehow, or if there's a chance that the peer associated with it is going to affect it.



trunk/channels/chan_sip.c
<https://reviewboard.asterisk.org/r/1563/#comment8880>

    Again, I'm confused why the sip_pvt needs to be unlocked / locked here, since add_peer_mwi_subs doesn't affect it.



trunk/channels/chan_sip.c
<https://reviewboard.asterisk.org/r/1563/#comment8881>

    Same here with respect to unlocking / locking p


- mjordan


On Nov. 4, 2011, 11:16 a.m., opticron wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/1563/
> -----------------------------------------------------------
> 
> (Updated Nov. 4, 2011, 11:16 a.m.)
> 
> 
> Review request for Asterisk Developers and David Vossel.
> 
> 
> Summary
> -------
> 
> Fix a locking inversion that causes deadlocks during BLF subscriptions.
> 
> 
> This addresses bug ASTERISK-18663.
>     https://issues.asterisk.org/jira/browse/ASTERISK-18663
> 
> 
> Diffs
> -----
> 
>   trunk/channels/chan_sip.c 342600 
> 
> Diff: https://reviewboard.asterisk.org/r/1563/diff
> 
> 
> Testing
> -------
> 
> Compile
> 
> 
> Thanks,
> 
> opticron
> 
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-dev/attachments/20111104/741c3283/attachment.htm>


More information about the asterisk-dev mailing list