[asterisk-dev] [Code Review] 2629: Fix rapid growth problem in thread pools

Mark Michelson reviewboard at asterisk.org
Sun Jun 16 18:08:32 CDT 2013


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/2629/
-----------------------------------------------------------

Review request for Asterisk Developers.


Repository: Asterisk


Description
-------

Joshua Colp noticed while doing some performance testing of chan_pjsip that sometimes the thread count in the threadpool would suddenly spike (e.g. going from 26 threads to 2600 threads) when processing a high number of calls per second. The reason for this is due to the threadpool's control queue, coupled with the fact that new threads in the threadpool start in the 'active' state.

Consider a scenario where a threadpool's threads are all active, and 25 tasks all get queued into the threadpool in succession. When the first task gets evaluated, the threadpool sees that there are no idle threads, so it grows by its configured autoincrement value (we'll say 5 for this case). All 5 new threads start active. Now the next queued task is evaluated. The threadpool sees there are still no idle threads (since the 5 new threads are all active), so the threadpool again increments the number of threads by 5. This repeats until all 25 tasks have been evaluated. In this scenario, the threadpool's size has increased by 125 threads (5 * 25) very quickly. Obviously, the vast majority of these new threads will go idle immediately, so why doesn't this affect the count of idle threads when evaluating the incoming tasks? It's because when a thread becomes idle, the task to make the thread idle gets queued behind the other tasks already waiting. So once the 25 threadpool tasks are evaluated, it's likely that there will be ~100 "thread has become idle" tasks immediately following.

This patch aims to fix the above scenario. The modifications are as follows:

* New worker threads start idle instead of active.
* If there are no idle threads when a new task arrives, and the threadpool auto-increments the thread count, then the threadpool will activate a single idle thread.
* If the threadpool's thread count is increased manually (via ast_threadpool_set_size), then all idle threads will be activated once the new threads are added. The justification for this is that the most likely scenario when ast_threadpool_set_size() is called is when a threadpool listener has determined that the number of tasks has greatly outnumbered the number of threads to handle them. Thus the listener wishes to increase the threadpool size and have all the new threads immediately get to work to clear out the backlog of tasks. Activating 

With the first two modifications, let's re-examine the earlier problem scenario.

All threadpool threads are active, and 25 tasks all get queued into the threadpool in succession. When the first task gets evaluated, the threadpool sees that there are no idle threads, so it grows by 5 threads. The 5 threads start idle, but one is activated in order to address the task being added. Now the next queued task is evaluated. There are 4 idle threads, so the threadpool simply activates one and continues. This process will continue until all tasks in the queue have been evaluated. This time, the threadpool only grew by 25 threads, which is just enough to get the tasks completed.

This fix is not a perfect solution since some of the active threads possibly became idle during the time that the threadpool was evaluating the 25 queued tasks. Therefore it could be possible not to have to increase the threadpool size by as many threads. However, the fix for that would likely require API changes to taskprocessors. The effort to benefit ratio is not clear at this point since it is not as easy to predict how many threads might go idle during a long chain of queued tasks. If further testing shows that the thread increase in such a scenario is still too large, then we can look into further efforts to minimize threadpool growth.


Diffs
-----

  /trunk/main/threadpool.c 391698 

Diff: https://reviewboard.asterisk.org/r/2629/diff/


Testing
-------

The threadpool unit tests still pass.


Thanks,

Mark Michelson

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-dev/attachments/20130616/0d45a8f1/attachment.htm>


More information about the asterisk-dev mailing list