[asterisk-dev] [Code Review] [18031] Patch for deadlock locking order issue between channel/queue during set_queue_variables

Brett Bryant reviewboard at asterisk.org
Thu Nov 18 14:21:22 CST 2010


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/1018/
-----------------------------------------------------------

Review request for Asterisk Developers.


Summary
-------

When setqueuevar is enabled in app_queue, the potential exists for one thread to lock a call_queue structure with a channel lock held while another thread locks the channel with the call_queue lock held. The resulting deadlock will end up crippling Asterisk since quite a few parts of Asterisk will traverse the channel list.

At several points in app_queue (in my case, end_bridge_callback), a queue will be locked before calling set_queue_variables (which pulls data directly out of the queue structure); this is fine, but when setqueuevar is actually enabled, the call to pbx_builtin_setvar_multiple will lock the channel. Meanwhile, another thread may be calling ast_do_masquerade() to clone the same channel, resulting in a lock on the same channel (clonechan.) Later in the masquerade, the datastore fixup functions are called, and queue_transfer_fixup's call to update_queue() will result in an attempt to lock the related queue (or all queues if shared_lastcall is enabled.)

This deadlock occurred for me on 1.6.2.6 but appears to present in 1.6.2.13 as well. In my case, do_masquerade() got the channel lock while end_bridge_callback() got the queue lock.

Queue config:
setqueuevar = yes
shared_lastcall = yes

All agents in the queue were Local channels. Unfortunately, I lost the core I forced from this deadlock, so I can't provide backtraces for the two deadlocked threads. I can tell you that they were (roughly)...

Thread A: ... -> ast_bridge_call -> app_queue.c end_bridge_callback [queue lock happens here] -> set_queue_variables -> pbx_builtin_setvar_multiple -> pbx_builtin_setvar_helper -> ast_channel_lock

Thread B: ... -> ast_do_masquerade [channel lock happens here] -> queue_transfer_fixup -> update_queue -> ao2_lock


This addresses bug 18031.
    https://issues.asterisk.org/view.php?id=18031


Diffs
-----

  /trunk/apps/app_queue.c 294084 

Diff: https://reviewboard.asterisk.org/r/1018/diff


Testing
-------

Queue testsuite tests ran before and after applying the patch.


--> queues/queue_baseline --- PASSED
--> queues/position_priority_maxlen --- PASSED
--> queues/macro_gosub_test --- PASSED


Thanks,

Brett

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.digium.com/pipermail/asterisk-dev/attachments/20101118/10e21736/attachment.htm 


More information about the asterisk-dev mailing list