[asterisk-dev] [Code Review] 3548: suspended destructions of pri spans following PRI_EVENT_REMOVED

Tzafrir Cohen reviewboard at asterisk.org
Mon Jun 16 11:07:50 CDT 2014


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/3548/
-----------------------------------------------------------

(Updated June 16, 2014, 4:07 p.m.)


Review request for Asterisk Developers and rmudgett.


Changes
-------

Thanks for the feedback. Fixed most (or all) issues.

I still encountered one or two cases (not reproducible) where the lock on a pri (span) was held for over 3 minutes, so it appeared as if there was a deadlock. It was eventually released on its own.


Bugs: ASTERISK-23554
    https://issues.asterisk.org/jira/browse/ASTERISK-23554


Repository: Asterisk


Description
-------

Issue: when a PRI span is disconnected (e.g. following the unassignment of PRI spans), the DAHDI channels of that span can be destroyed in two different paths:

1. DAHDI channels are destroyed in response to pri_event_removed
2. The span is destroyed in response to DAHDI_EVENT_REMOVED in the D-channel. Before the span is destroyed, its channels need to be destroyed.

If the channel is not in a call, (1) is run from the monitor thread, holding the iflock (the lock of iflist, the list of channels). Somewhere in the process of destroying a channel that belongs to a PRI span, the pri's lock needs to be acquired.

(2) is called while handling the PRI events and hence holds the PRI lock. Destroying the channels requires getting the iflock.

This means that if the two happen simultaneously, we have a deadlock: each path holds the lock the other one needs. And the two will happen simultaneously, as recent versions of DAHDI will send an extra DAHDI_EVENT_REMOVED as a response to any call to the ioctl on DAHDI_GET_EVENT on a removed span.
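
To make the lock-order inversion concrete, here is a minimal sketch that replays the bad interleaving step by step in a single thread. It is not code from chan_dahdi.c: the mutex names and the function lock_order_inverts() are hypothetical, and pthread_mutex_trylock() stands in for the blocking lock calls so that the example reports the conflict instead of hanging.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for the two locks named in the description. */
static pthread_mutex_t iflock   = PTHREAD_MUTEX_INITIALIZER; /* protects iflist   */
static pthread_mutex_t pri_lock = PTHREAD_MUTEX_INITIALIZER; /* protects one span */

/*
 * Replays the interleaving of paths (1) and (2) in one thread.
 * Returns 1 if each path ends up wanting the lock the other one holds.
 */
int lock_order_inverts(void)
{
    int path1_stuck;
    int path2_stuck;

    /* Path (1), monitor thread: takes iflock first. */
    pthread_mutex_lock(&iflock);
    /* Path (2), PRI event handling: takes the pri lock first. */
    pthread_mutex_lock(&pri_lock);

    /* Path (1) now needs the pri lock; path (2) now needs iflock. */
    path1_stuck = (pthread_mutex_trylock(&pri_lock) == EBUSY);
    path2_stuck = (pthread_mutex_trylock(&iflock) == EBUSY);

    pthread_mutex_unlock(&pri_lock);
    pthread_mutex_unlock(&iflock);

    return path1_stuck && path2_stuck;
}
```

With blocking locks in two real threads, both of the trylock steps would instead wait forever, which is the deadlock described above.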

This review includes the patches pri_destroy_span_prilist.patch and sigpri_handle_enodev_1.patch from the referred bug. The former solves this deadlock by creating a list of spans to be removed "later", and thus allows executing (2) without holding the pri lock.
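
The idea of deferring the destruction can be sketched roughly as follows. This is an illustration only, not the contents of pri_destroy_span_prilist.patch: struct toy_span, the doomed_spans list and both functions are hypothetical, the pri lock is modelled as a plain flag, and a real implementation would also need a lock of its own around the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified span record; not the real span structure. */
struct toy_span {
    int span_no;
    int pri_lock_held;            /* 1 while event handling holds the pri lock */
    int doomed;                   /* 1 once queued for destruction */
    int destroyed;
    struct toy_span *next_doomed;
};

/* Spans waiting to be destroyed "later" (would need its own lock). */
static struct toy_span *doomed_spans = NULL;

/* Called from PRI event handling, pri lock held: only record the request. */
void defer_span_destruction(struct toy_span *span)
{
    if (span->doomed) {
        return;                   /* repeated "removed" events queue it once */
    }
    span->doomed = 1;
    span->next_doomed = doomed_spans;
    doomed_spans = span;
}

/* Called later, with no pri lock held. Returns the number of spans destroyed. */
int destroy_deferred_spans(void)
{
    int count = 0;

    while (doomed_spans != NULL) {
        struct toy_span *span = doomed_spans;

        doomed_spans = span->next_doomed;
        span->next_doomed = NULL;

        /* The point of deferring: taking iflock here cannot invert the order. */
        assert(!span->pri_lock_held);
        span->destroyed = 1;      /* stands in for destroying channels and span */
        count++;
    }
    return count;
}
```

The doomed flag is there because, as noted above, the removal event may arrive more than once for the same span.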

The second patch fixes error handling of libpri: if read returns -ENODEV, we have no device and it should be destroyed. This, however, requires exposing the above "deferred destruction" functionality to sig_pri.
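
As a rough sketch of the error handling described here (not the actual sig_pri.c change), a read result could be classified as below. The enum, the function name classify_dchan_read() and the decision to accept both a negative-errno return value and the usual -1/errno convention are assumptions made for the example.

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical outcome of one D-channel read attempt. */
enum dchan_read_action {
    DCHAN_DATA,       /* read completed; pass the bytes on */
    DCHAN_RETRY,      /* some other error; try again later */
    DCHAN_SPAN_GONE   /* device vanished; schedule the span for destruction */
};

/*
 * res: value returned by the read; err: errno captured right after it.
 * Accepts either "returns -ENODEV" or "returns -1 with errno == ENODEV".
 */
enum dchan_read_action classify_dchan_read(ssize_t res, int err)
{
    if (res >= 0) {
        return DCHAN_DATA;
    }
    if (res == -ENODEV || err == ENODEV) {
        return DCHAN_SPAN_GONE;
    }
    return DCHAN_RETRY;
}
```

A caller that gets DCHAN_SPAN_GONE would then hand the span to the deferred-destruction mechanism rather than destroying it on the spot.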


Diffs (updated)
-----

  /trunk/channels/sig_pri.c 416393 
  /trunk/channels/sig_pri.h 416393 
  /trunk/channels/chan_dahdi.c 416393 

Diff: https://reviewboard.asterisk.org/r/3548/diff/


Testing
-------


Thanks,

Tzafrir Cohen
