[asterisk-dev] dahdi_device representation

Tilghman Lesher tlesher at digium.com
Fri Aug 27 21:04:06 CDT 2010


On Friday 27 August 2010 17:54:45 Oron Peled wrote:
> During Shaun Ruffell's adventures with DAHDI persistent channel assignments
> he started implementing a very important feature (IMO) -- a representation
> of dahdi_device that represents a collection of spans.
>
> The lack of this representation caused hardware attributes to be
> duplicated in the spans, while in reality they should be represented
> in the dahdi_device (e.g., location).
>
> I would like to use this opportunity to present a relevant issue that
> may IMO affect the design:
>
> 1. Historically, chan_dahdi was not made for hot-pluggable
>    devices.
>
> 2. As a result, after a successful open(), chan_dahdi ignores
>    read()/write() errors (except for the special errno used to pass events).
>
> 3. This means that if a device is removed from under chan_dahdi's feet, it
>    goes into an infinite tight loop of failed read() calls, which usually
>    makes the host unresponsive within a few seconds (except for the kernel)
>    because asterisk usually runs at real-time priority.
>
> 4. Since Astribanks were always hot-pluggable, we "solved" this problem
>    by employing various measures in our xpp drivers:
>    - When a device is removed, we *keep* its data structure intact and
>       make a note to ourselves that it's disconnected.
>    - We send a red alarm to asterisk for disconnected devices, trying
>       to squelch some of the "noise".
>    - We ignore asterisk calls for disconnected devices.
>    - We added a "REMOVED" event to asterisk, politely asking it to remove
>       a span with all its channels.
>    - We refcount the open/close so if/when asterisk is nice and actually
>       closes all channels, we can actually release the data structures.
>
>    BTW: only lately (during dial-byname development) we managed to fix
>            asterisk so removing a digital span would also close its dchan.
>
> 5. Obviously, keeping "ghost" devices around so we don't surprise asterisk
>     is not a very good design, but we didn't see any alternatives at the
> time.
>
> If chan_dahdi is not made aware of driver errors (e.g., -ENODEV), similar
> ugly techniques would be needed for hot-plug implementation at the DAHDI
> level. This has some design consequences for the sysfs object layout and
> therefore should be thought about early.
>
> So the question is short:
>    Should DAHDI account for and work around chan_dahdi's ignorance?
>    Or should chan_dahdi be fixed first?

Yes, DAHDI will need to work around this, since we cannot ensure that each
Asterisk installation will upgrade the userland piece to a version which is
sufficient to work around the problem.  One question, though.  If this is
fixed in both locations, what method would you prefer to communicate that
chan_dahdi has been fixed, and DAHDI doesn't need to employ the
workaround?  Or would you prefer to simply keep the workaround active in DAHDI
regardless of whether it is necessary for chan_dahdi?  Perhaps it would be
sufficient to detect the poor behavior (multiple successive read()s which
fail) and employ the workaround only in that case.
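
To make that last idea concrete, here is a rough sketch of the bookkeeping,
written as plain C so it can be compiled outside the kernel.  Everything in it
is hypothetical and not existing DAHDI code: the struct read_watch, the
FAILED_READ_LIMIT threshold, and the choice to count only -ENODEV results (so
that successful reads and the special event errno reset the counter) are all
assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threshold, not taken from DAHDI: number of back-to-back
 * -ENODEV results tolerated before assuming the caller ignores errors. */
#define FAILED_READ_LIMIT 8

/* Hypothetical per-channel bookkeeping. */
struct read_watch {
    unsigned int consecutive_failures; /* -ENODEV results seen in a row */
    bool workaround_active;            /* latched once the limit is hit */
};

static void read_watch_init(struct read_watch *w)
{
    assert(w != NULL);
    w->consecutive_failures = 0;
    w->workaround_active = false;
}

/* Record the result of one read() on the channel (a byte count, or a
 * negative errno).  Returns true if the workaround should be in effect.
 * Any result other than -ENODEV resets the run of failures; once the
 * workaround has been switched on it stays on. */
static bool read_watch_record(struct read_watch *w, int result)
{
    assert(w != NULL);
    if (result != -ENODEV) {
        w->consecutive_failures = 0;
    } else if (++w->consecutive_failures >= FAILED_READ_LIMIT) {
        w->workaround_active = true;
    }
    return w->workaround_active;
}
```

A fixed chan_dahdi would presumably close the channel after the first -ENODEV
and so never reach the limit, while an unfixed one would hit it almost
immediately and get the "ghost device" treatment.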

-- 
Tilghman Lesher
Digium, Inc. | Senior Software Developer
twitter: Corydon76 | IRC: Corydon76-dig (Freenode)
Check us out at: www.digium.com & www.asterisk.org


