[asterisk-dev] Pre-requesites of ast_cel_report_event breached in handle_request_refer (non-permitted lock)

Matthew Jordan mjordan at digium.com
Wed Aug 27 15:48:42 CDT 2014


On Wed, Aug 27, 2014 at 12:32 PM, Dave WOOLLEY <dave.woolley at bts.co.uk> wrote:
> Walter Doekes <walter+asterisk-dev at osso.nl> wrote:
>
>> AFAICT, local_attended_transfer does that as well (locked by
>
> [[djw]]
> And it looks like handle_invite_replaces does, although the place I've found has both relevant channels locked, so may be safe.
>
>> owner_chan_ref = sip_pvt_lock_full(p) in handle_request_do or by
>> get_sip_pvt_byid_locked, or both):
>>
>> > /* both p and p->owner _MUST_ be locked while calling
>> > local_attended_transfer */ if ((res = local_attended_transfer(p,
>> > &current, req, seqno, nounlock))) {
>> ...
>> > /* target.chan1 was locked in get_sip_pvt_byid_locked, do not unlock
>> > target.chan1 before this */ ast_cel_report_event(target.chan1,
>> > AST_CEL_ATTENDEDTRANSFER, NULL, transferer_linkedid, target.chan2);
>>
>> And that in turn causes a deadlock when the AST_CEL_BRIDGE_UPDATE
>> event is simultaneously fired by a different thread (ast_do_masquerade).
>>
>> We've had to disable BRIDGE_UPDATE cel events for now as a workaround.
>>
>
> Is there a formal issue report for this?  I was reluctant to raise one because it was picked up reading the code of a version we don't use, rather than in anger.  It seems that you've actually identified a specific real world race condition.

Not that I can find. If someone would file one, that'd be great.

> Also, it seems to me that one could avoid locking issues by passing addresses in the event, and having the accounting software, which is presumably single thread, match those up.  If a channel name is actually needed, for a peer,  rather than just an identifier for the structure, couldn't the accounting software look that up.  In that, case, if anything the channel should be locked, either on entry, or immediately after entry, to ensure that masquerading, etc., didn't change anything between when the event data was collected and when the event was queued, so that the events are guaranteed to be in a causally consistent order.

That's actually how CEL works in Asterisk 12+.

Channel drivers (and the parts of Asterisk that work on ast_channel
objects) publish messages about the channels to Stasis. Those messages
do *not* contain ast_channel objects; rather, they contain snapshots
of the information on the channel. CEL - along with CDRs, AMI, ARI,
and some other things - consume those messages and construct the parts
of the world that they care about. At least when listening for changes
in state (and not affecting state themselves), these consumers never
touch live data.

-- 
Matthew Jordan
Digium, Inc. | Engineering Manager
445 Jan Davis Drive NW - Huntsville, AL 35806 - USA
Check us out at: http://digium.com & http://asterisk.org



More information about the asterisk-dev mailing list