[asterisk-dev] [Code Review] 3548: suspended destructions of pri spans following PRI_EVENT_REMOVED

Tzafrir Cohen reviewboard at asterisk.org
Tue Jun 17 13:28:03 CDT 2014



> On June 16, 2014, 7:04 p.m., rmudgett wrote:
> > /trunk/channels/chan_dahdi.c, lines 1137-1139
> > <https://reviewboard.asterisk.org/r/3548/diff/2/?file=59691#file59691line1137>
> >
> >     You should not be accessing the .first and .last list members directly.  This is why I gave you the way it should be done earlier.
> 
> Tzafrir Cohen wrote:
>     I explained (and the comment in the code explains) why that does not work: destruction of the spans should not be done with the list lock held, as this helps trigger a deadlock, as explained in the bug report. I solve this by moving all entries from a global list to a local list. That way, the global list's lock is not held during destruction and the local list doesn't need locking.
>     
>     linkedlist.h does not have AST_LIST_MOVE (I can add one). Alternatively, I can walk the list and move every single entry. But that just makes the code uglier and does more work under the lock.
> 
> rmudgett wrote:
>     Please look at the sample code I supplied again.  The list node is removed while the list is locked and the span is destroyed with the list not locked.  There will not be a deadlock as a result.  There is no need for an AST_LIST_MOVE() as a result.
>     
>     As for the concern about locking/unlocking the list: how often are spans destroyed that this would be a performance concern?

OK. Got it.
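
The approach agreed on above (unlink one node while the list is locked, then destroy it with the list unlocked) can be sketched roughly as follows. This is only an illustration, not the code from the review: it uses a plain pthread mutex and a hand-rolled singly linked list instead of the AST_LIST macros, and the names doomed_span, doomed_span_add(), destroy_span() and doomed_spans_destroy_all() are hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a PRI span that is waiting to be destroyed. */
struct doomed_span {
    int span_id;
    struct doomed_span *next;
};

static pthread_mutex_t doomed_lock = PTHREAD_MUTEX_INITIALIZER;
static struct doomed_span *doomed_head;
static int spans_destroyed;

/* Queue a span for later destruction. Only the short list lock is taken. */
static int doomed_span_add(int span_id)
{
    struct doomed_span *node = malloc(sizeof(*node));

    if (!node) {
        return -1;
    }
    node->span_id = span_id;

    pthread_mutex_lock(&doomed_lock);
    node->next = doomed_head;
    doomed_head = node;
    pthread_mutex_unlock(&doomed_lock);
    return 0;
}

/* Placeholder for the real teardown, which may need to take other locks. */
static void destroy_span(struct doomed_span *node)
{
    free(node);
    spans_destroyed++;
}

/* Pop one node at a time under the lock; destroy it with the lock released. */
static void doomed_spans_destroy_all(void)
{
    for (;;) {
        struct doomed_span *node;

        pthread_mutex_lock(&doomed_lock);
        node = doomed_head;
        if (node) {
            doomed_head = node->next;
        }
        pthread_mutex_unlock(&doomed_lock);

        if (!node) {
            break;
        }
        destroy_span(node); /* list lock is NOT held here */
    }
}
```

The cost is one lock/unlock pair per destroyed span, which should not matter given how rarely spans are destroyed.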


- Tzafrir


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/3548/#review12159
-----------------------------------------------------------


On June 17, 2014, 9:10 a.m., Tzafrir Cohen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/3548/
> -----------------------------------------------------------
> 
> (Updated June 17, 2014, 9:10 a.m.)
> 
> 
> Review request for Asterisk Developers and rmudgett.
> 
> 
> Bugs: ASTERISK-23554
>     https://issues.asterisk.org/jira/browse/ASTERISK-23554
> 
> 
> Repository: Asterisk
> 
> 
> Description
> -------
> 
> Issue: when a PRI span is disconnected (e.g. following the unassignment of PRI spans), DAHDI channels of that span can be destroyed via two different paths:
> 
> 1. DAHDI channels are destroyed in response to pri_event_removed
> 2. The span is destroyed in response to DAHDI_EVENT_REMOVED in the D-channel. Before the span is destroyed, its channels need to be destroyed.
> 
> If the channel is not in a call, (1) is run from the monitor thread, holding the iflock (lock of iflist: the list of channels). Somewhere in the process of destroying a channel that belongs to a PRI span, the pri's lock needs to be acquired.
> 
> (2) is called from a context of handling the PRI events and hence holds the PRI lock. Destroying the channels requires getting the iflock.
> 
> This means that if the two happen simultaneously, we have a deadlock. And the two will happen simultaneously, as recent versions of DAHDI send an extra DAHDI_EVENT_REMOVED in response to any DAHDI_GET_EVENT ioctl call on a removed span.
> 
> This review includes the patches pri_destroy_span_prilist.patch and sigpri_handle_enodev_1.patch from the referenced bug. The former solves this deadlock by creating a list of spans to be removed "later", which allows (2) to be executed without holding the pri lock.
> 
> The second patch fixes error handling of libpri: if read returns -ENODEV, we have no device and it should be destroyed. This, however, requires exposing the above "deferred destruction" functionality to sig_pri.
> 
> 
> Diffs
> -----
> 
>   /trunk/channels/sig_pri.c 416393 
>   /trunk/channels/sig_pri.h 416393 
>   /trunk/channels/chan_dahdi.c 416393 
> 
> Diff: https://reviewboard.asterisk.org/r/3548/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Tzafrir Cohen
> 
>
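
The error handling described for the second patch (a read that fails with ENODEV means the device is gone and the span should be handed to the deferred destruction) might look roughly like the sketch below. It assumes the POSIX convention of read() returning -1 with errno set; the enum dchan_read_action and the function classify_dchan_read() are hypothetical names for illustration and are not taken from sig_pri.c.

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical result codes for a D-channel read attempt. */
enum dchan_read_action {
    DCHAN_READ_OK,           /* data (or EOF) was returned */
    DCHAN_READ_RETRY,        /* transient condition, try again later */
    DCHAN_READ_DESTROY_SPAN, /* device is gone: queue the span for destruction */
    DCHAN_READ_ERROR         /* any other failure: report it */
};

/*
 * Decide what to do with the result of a read() on the D-channel.
 * res is the return value of read(); saved_errno is errno captured
 * immediately after the call.
 */
static enum dchan_read_action classify_dchan_read(ssize_t res, int saved_errno)
{
    if (res >= 0) {
        return DCHAN_READ_OK;
    }

    switch (saved_errno) {
    case EINTR:
    case EAGAIN:
        return DCHAN_READ_RETRY;
    case ENODEV:
        /* No device behind the descriptor any more. The caller holds the
         * pri lock, so it should only queue the span, not destroy it. */
        return DCHAN_READ_DESTROY_SPAN;
    default:
        return DCHAN_READ_ERROR;
    }
}
```

On DCHAN_READ_DESTROY_SPAN the caller would only queue the span for later destruction, since it still holds the pri lock at that point.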
