[asterisk-dev] [Code Review] [18031] Patch for deadlock locking order issue between channel/queue during set_queue_variables

Russell Bryant reviewboard at asterisk.org
Thu Nov 18 14:50:37 CST 2010


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/1018/#review2936
-----------------------------------------------------------

Ship it!


Looks sane to me

- Russell


On 2010-11-18 14:21:22, Brett Bryant wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/1018/
> -----------------------------------------------------------
> 
> (Updated 2010-11-18 14:21:22)
> 
> 
> Review request for Asterisk Developers.
> 
> 
> Summary
> -------
> 
> When setqueuevar is enabled in app_queue, the potential exists for one thread to lock a call_queue structure with a channel lock held while another thread locks the channel with the call_queue lock held. The resulting deadlock will end up crippling Asterisk since quite a few parts of Asterisk will traverse the channel list.
> 
> At several points in app_queue (in my case, end_bridge_callback), a queue will be locked before calling set_queue_variables (which pulls data directly out of the queue structure); this is fine, but when setqueuevar is actually enabled, the call to pbx_builtin_setvar_multiple will lock the channel. Meanwhile, another thread may be calling ast_do_masquerade() to clone the same channel, resulting in a lock on the same channel (clonechan.) Later in the masquerade, the datastore fixup functions are called, and queue_transfer_fixup's call to update_queue() will result in an attempt to lock the related queue (or all queues if shared_lastcall is enabled.)
> 
> This deadlock occurred for me on 1.6.2.6 but appears to present in 1.6.2.13 as well. In my case, do_masquerade() got the channel lock while end_bridge_callback() got the queue lock.
> 
> Queue config:
> setqueuevar = yes
> shared_lastcall = yes
> 
> All agents in the queue were Local channels. Unfortunately, I lost the core I forced from this deadlock, so I can't provide backtraces for the two deadlocked threads. I can tell you that they were (roughly)...
> 
> Thread A: ... -> ast_bridge_call -> app_queue.c end_bridge_callback [queue lock happens here] -> set_queue_variables -> pbx_builtin_setvar_multiple -> pbx_builtin_setvar_helper -> ast_channel_lock
> 
> Thread B: ... -> ast_do_masquerade [channel lock happens here] -> queue_transfer_fixup -> update_queue -> ao2_lock
> 
> 
> This addresses bug 18031.
>     https://issues.asterisk.org/view.php?id=18031
> 
> 
> Diffs
> -----
> 
>   /trunk/apps/app_queue.c 294084 
> 
> Diff: https://reviewboard.asterisk.org/r/1018/diff
> 
> 
> Testing
> -------
> 
> Queue testsuite tests ran before and after applying the patch.
> 
> 
> --> queues/queue_baseline --- PASSED
> --> queues/position_priority_maxlen --- PASSED
> --> queues/macro_gosub_test --- PASSED
> 
> 
> Thanks,
> 
> Brett
> 
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.digium.com/pipermail/asterisk-dev/attachments/20101118/a9dd45ee/attachment.htm 


More information about the asterisk-dev mailing list