[asterisk-dev] [Code Review]: Add option to check state when state is unknown and optomise the checking to reduce existing load.
Kevin Fleming
reviewboard at asterisk.org
Thu Oct 20 11:48:36 CDT 2011
On Oct. 20, 2011, 10:47 a.m., irroot wrote:
> > There appears to be some sort of race condition between when a device state is updated and when we try to ring a device. I don't understand how all this works well enough to know for sure, but even with this change it still seems possible that the device state could change between when we check here and when we actualy dial. It is impossible to get this 100% perfect, as we can't control who may be dialing the device outside of asterisk, but I suppose we could make this work right within asterisk. Either way, that is outside of the limited scope of this change.
> >
> > This change looks good. In my performance testing it has a much much lesser effect on performance than the previous code that was here. There are still some channel list walks involved, but they don't appear to be causing massive performance degradation like the previous change did.
I think you mean 'outside of app_queue' here, not 'outside of Asterisk'. Your analysis is correct, though; there is no 'atomic' method to test device status and then dial, nor is there any way to 'reserve' a device so that it won't be dialed through any other means until you've decided whether you want to dial it or not. There are ways such problems could be solved, but they would require some significant new APIs... and it seems that they are possibly past the point of diminishing returns, because the frequency that a call will be delivered to an otherwise busy device with the proposed changes in place should be quite small.
- Kevin
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/1535/#review4540
-----------------------------------------------------------
On Oct. 20, 2011, 5:49 a.m., irroot wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/1535/
> -----------------------------------------------------------
>
> (Updated Oct. 20, 2011, 5:49 a.m.)
>
>
> Review request for Asterisk Developers and Matthew Nicholson.
>
>
> Summary
> -------
>
> commit r325483 introduced a performance problem reverted by r341486
>
> here is a proposal to do this more "friendly" its a global option [possibly should be
> per Q] and uses ast_device_state that tries to get the state from a channel driver
> it calls ast_parse_device_state only if this fails to yield a result it then fires a
> devicestate event to set it so it does not get called over and over.
>
> the reality is its happening 4 out of 1000 times this check is run that the agent is busy
> this is a minimal % the customer who raised this issue is under strict SLA to answer calls
> and provide service they a medical aid call center who deal with e911 cases authorizing
> treatment.
>
> hope this rework is more acceptable ill be putting this in production and reporting results back.
> in the log files [rotated daily] from 15/09/2011 to 19/10/2011
>
> 110507 times was logged
> 516 duplicate calls prevented
>
> ill be watching these metrics closely in next bit.
>
> the problem is its not efficient at all as they only had 46598 calls come into the system.
>
> -------------
>
> The regression was caused by a call to ast_parse_device_state() in app_queue's
> ring_entry() function. The ast_parse_device_state() function eventually calls
> ast_channel_get_full() with a channel name prefix which causes it to walk the
> channel list causing massive lock contention and slow downs.
>
> This patch fixes the regression by removing the call to
> ast_parase_device_state() which should be unnecessary. Queue member device
> state should be maintained by device state events. Some users have seen
> instances where busy agents were called when they shouldn't have, which is the
> reason the call to ast_parse_device_state() was added. That change appears to
> have resolved that issue but also causes this performance regression. There may
> still be issues with queue member status, and if so, alternative methods should
> be investigated to resolve them.
>
>
> This addresses bug AST-695.
> https://issues.asterisk.org/jira/browse/AST-695
>
>
> Diffs
> -----
>
> /branches/10/apps/app_queue.c 341525
>
> Diff: https://reviewboard.asterisk.org/r/1535/diff
>
>
> Testing
> -------
>
>
> Thanks,
>
> irroot
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-dev/attachments/20111020/2b9ee1c3/attachment.htm>
More information about the asterisk-dev
mailing list