[asterisk-dev] [Code Review] Improve scheduler performance under high load
Mark Michelson
mmichelson at digium.com
Mon Feb 16 11:57:23 CST 2009
> On 2009-02-16 11:16:16, Kevin Fleming wrote:
> > /trunk/main/sched.c, lines 255-256
> > <http://reviewboard.digium.com/r/160/diff/5/?file=2721#file2721line255>
> >
> > Now I am thinking that maybe ast_heap_create should actually be a macro, that does the offsetof() automatically, by accepting the _type_ of the entries to be into the heap and the name of the index field inside the entry. I can't think of a case where a user of this API would *not* use offsetof() in exactly this way.
>
> wrote:
> Hmm, is there a way that this could be done and still allow for the index_offset to be an optional parameter? Aside from having two separate allocation functions/macros, I mean.
>
> wrote:
> No, there isn't, but I think the value of having the entry type and field name as part of the API would offset the small loss of it being optional. I can't really see that many cases where it being optional would be all that useful.
It's nice because you don't have to dedicate space in your structure for some other part of the code to mess with. If you wanted to start storing a particular structure in a heap and also wanted to maintain ABI compatibility, it would not be possible without making the index_offset optional. For cases where all that's necessary are push/pop operations from a heap, then it seems odd to have to add a field to your structure that isn't actually going to be used for anything.
- Mark
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviewboard.digium.com/r/160/#review436
-----------------------------------------------------------
On 2009-02-16 08:32:55, Russell Bryant wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviewboard.digium.com/r/160/
> -----------------------------------------------------------
>
> (Updated 2009-02-16 08:32:55)
>
>
> Review request for Asterisk Developers.
>
>
> Summary
> -------
>
> This patch implements the binary max-heap data structure and makes use of it in the Asterisk scheduler. There are also two new test modules which we were used for development and testing of this code. The heap test module verifies that the heap is behaving as expected. The scheduler test module times how long it takes to do adds and deletes from a scheduler context.
>
> More information about the heap data structure can be found here: http://en.wikipedia.org/wiki/Heap_(data_structure)
>
> In the rest of the description and testing discussion, let N be the number of entries in the scheduler.
>
> The data structure that has been replaced in sched.c is a sorted doubly linked list (DLL).
>
> With the DLL, adding an entry into the scheduler runs in O(N) time. If the existing entries have an even distribution over time, then the average case is that N/2 element comparisons will be required to find the proper place to insert the new entry. In the worst case, it's N comparisons. Removing an element from the scheduler runs in O(1) time, since it simply requires unlinking it from its current position in the list.
>
> With the heap, adding an entry into the scheduler runs in O(log_2 N) time. Removing an entry is also O(log_2 N).
>
> So, when switching to the heap implementation, you theoretically gain performance on additions, but take a hit on deletes. But, how much? See the "Testing Done" section to find out!
>
>
> Diffs
> -----
>
> /trunk/include/asterisk/heap.h PRE-CREATION
> /trunk/main/Makefile 176027
> /trunk/main/heap.c PRE-CREATION
> /trunk/main/sched.c 176027
> /trunk/tests/test_heap.c PRE-CREATION
> /trunk/tests/test_sched.c PRE-CREATION
>
> Diff: http://reviewboard.digium.com/r/160/diff
>
>
> Testing
> -------
>
> Since it may not be immediately clear what the benefits would be of this, testing has been done to measure the performance benefit for different values of N.
>
> Useful things to note:
>
> 1) An addition into the scheduler also requires a malloc(). A deletion requires a call to free().
>
> 2) Both operations require a mutex lock() and unlock().
>
> So, the testing done was to figure out:
>
> A) Do the benefits in add() outweigh the penalty in del() ?
>
> B) At what value of N do these changes make enough of an impact to be noticed compared to how much time it takes for the malloc(), free(), and mutex lock()/unlock() calls?
>
> The following tests were done on my primary development machine, which has an Intel Core 2 Duo @ 2.33 GHz. The results were gathered using the "sched test <N>" CLI command from the test_sched_module.
>
> Test results:
>
> Average running time for N calls to ast_sched_add():
>
> trunk team/russell/heap
>
> N=1 million 686.7 seconds 2.7 seconds
> N=100k 53.8 seconds .278 seconds
> N=10k 149.8 ms. 29.9 ms.
> N=5k 47.0 ms. 15.0 ms.
> N=1k 3.9 ms. 2.9 ms.
> N=500 1.6 ms. 1.5 ms.
> N=100 0.3 ms. 0.3 ms.
>
> Average running time for N calls to ast_sched_del():
>
> trunk team/russell/heap
>
> N=1 million 436.3 ms. 603.0 ms.
> N=100k 42.0 ms. 60.3 ms.
> N=10k 3.4 ms. 4.4 ms.
> N=5k 1.8 ms. 2.2 ms.
> N=1k 0.38 ms. 0.44 ms.
> N=500 0.19 ms. 0.21 ms.
> N=100 0.034 ms. 0.040 ms.
>
> Total average running time for N calls to ast_sched_add() and ast_sched_del():
>
> trunk team/russell/heap
>
> N=1 million 686.1 seconds 3.3 seconds
> N=100k 53.8 seconds .338 seconds
> N=10k 153.2 ms. 34.2 ms.
> N=5k 48.8 ms. 17.2 ms.
> N=1k 4.28 ms. 3.34 ms.
> N=500 1.79 ms. 1.71 ms.
> N=100 0.334 ms. 0.340 ms.
>
> Average running time of team/russell/heap as a percentage of the average running time from trunk:
>
> Percentage of time
> N=1 million 0.4 %
> N=100k 0.6 %
> N=10k 22.3 %
> N=5k 35.2 %
> N=1k 78.0 %
> N=500 95.5 %
> N=100 101.8 %
>
> Analysis of results:
>
> It appears that in this test environment, somewhere near N=100 and below, the code in trunk will perform better. However, at that point, we're talking about very small fractions of a second. As N gets larger, the code in team/russell/heap clearly starts to perform much better.
>
> With how the scheduler API is used today, a module such as chan_sip or chan_iax2 could easily have a scheduler context with N in the thousands. Every registration will result in a scheduler entry. Also, every active channel will result in at least one, if not more scheduler entries.
>
> Given these results, I do believe that the changes have proven to be beneficial overall.
>
>
> Thanks,
>
> Russell
>
>
More information about the asterisk-dev
mailing list