[asterisk-dev] [Code Review] Improve scheduler performance under high load

Russell Bryant russell at digium.com
Mon Feb 16 15:08:07 CST 2009



> On 2009-02-16 11:16:16, Kevin Fleming wrote:
> > I did not end up commenting on all the trailing whitespace issues, since you can see them as easily as I can here.

fixed!


> On 2009-02-16 11:16:16, Kevin Fleming wrote:
> > /trunk/include/asterisk/heap.h, line 66
> > <http://reviewboard.digium.com/r/160/diff/5/?file=2718#file2718line66>
> >
> >     It would be helpful to include a \code block here to demonstrate the usage of this field.

done!


> On 2009-02-16 11:16:16, Kevin Fleming wrote:
> > /trunk/include/asterisk/heap.h, line 128
> > <http://reviewboard.digium.com/r/160/diff/5/?file=2718#file2718line128>
> >
> >     use \b or \e for emphasis, so doxygen will highlight properly

Nice, I didn't know about those commands.  Thanks


> On 2009-02-16 11:16:16, Kevin Fleming wrote:
> > /trunk/include/asterisk/heap.h, line 190
> > <http://reviewboard.digium.com/r/160/diff/5/?file=2718#file2718line190>
> >
> >     is 'properly placed relative to its children', since the creator of the heap may have provided a different comparison function

Good point, fixed


> On 2009-02-16 11:16:16, Kevin Fleming wrote:
> > /trunk/main/heap.c, line 65
> > <http://reviewboard.digium.com/r/160/diff/5/?file=2720#file2720line65>
> >
> >     Doesn't match the return type of the function.

Fixed!


> On 2009-02-16 11:16:16, Kevin Fleming wrote:
> > /trunk/main/heap.c, line 174
> > <http://reviewboard.digium.com/r/160/diff/5/?file=2720#file2720line174>
> >
> >     nasty whitespace creeps in everywhere :-)

fixed


> On 2009-02-16 11:16:16, Kevin Fleming wrote:
> > /trunk/main/heap.c, line 215
> > <http://reviewboard.digium.com/r/160/diff/5/?file=2720#file2720line215>
> >
> >     whitespace

*poof*


> On 2009-02-16 11:16:16, Kevin Fleming wrote:
> > /trunk/main/heap.c, line 233
> > <http://reviewboard.digium.com/r/160/diff/5/?file=2720#file2720line233>
> >
> >     for consistency, use (h->cur_len)-- here, as you've used for ++ above

done


> On 2009-02-16 11:16:16, Kevin Fleming wrote:
> > /trunk/main/sched.c, lines 237-238
> > <http://reviewboard.digium.com/r/160/diff/5/?file=2721#file2721line237>
> >
> >     I'm pretty sure the compiler will optimize these away regardless, but I don't think you really need these temporary variables at all. '((struct sched *) a)->when' should be fine.

done!


> On 2009-02-16 11:16:16, Kevin Fleming wrote:
> > /trunk/main/sched.c, line 239
> > <http://reviewboard.digium.com/r/160/diff/5/?file=2721#file2721line239>
> >
> >     whitespace

shazam!


> On 2009-02-16 11:16:16, Kevin Fleming wrote:
> > /trunk/main/sched.c, lines 255-256
> > <http://reviewboard.digium.com/r/160/diff/5/?file=2721#file2721line255>
> >
> >     Now I am thinking that maybe ast_heap_create should actually be a macro, that does the offsetof() automatically, by accepting the _type_ of the entries to be into the heap and the name of the index field inside the entry. I can't think of a case where a user of this API would *not* use offsetof() in exactly this way.
> 
>  wrote:
>     Hmm, is there a way that this could be done and still allow for the index_offset to be an optional parameter? Aside from having two separate allocation functions/macros, I mean.
> 
>  wrote:
>     No, there isn't, but I think the value of having the entry type and field name as part of the API would offset the small loss of it being optional. I can't really see that many cases where it being optional would be all that useful.
> 
>  wrote:
>     It's nice because you don't have to dedicate space in your structure for some other part of the code to mess with. If you wanted to start storing a particular structure in a heap and also wanted to maintain ABI compatibility, it would not be possible without making the index_offset optional. For cases where all that's necessary are push/pop operations from a heap, then it seems odd to have to add a field to your structure that isn't actually going to be used for anything.
> 
>  wrote:
>     OK, I see how that makes sense. Given that, the API doc example of how to use offsetof() to do the right thing is enough. The only other option would be for the heap itself to contain these index fields, rather than making the stored object contain them, but that would take quite a few changes to implement.

API doc updated with an example


- Russell


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviewboard.digium.com/r/160/#review436
-----------------------------------------------------------


On 2009-02-16 15:07:59, Russell Bryant wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviewboard.digium.com/r/160/
> -----------------------------------------------------------
> 
> (Updated 2009-02-16 15:07:59)
> 
> 
> Review request for Asterisk Developers.
> 
> 
> Summary
> -------
> 
> This patch implements the binary max-heap data structure and makes use of it in the Asterisk scheduler.  There are also two new test modules which we were used for development and testing of this code.  The heap test module verifies that the heap is behaving as expected.  The scheduler test module times how long it takes to do adds and deletes from a scheduler context.
> 
> More information about the heap data structure can be found here: http://en.wikipedia.org/wiki/Heap_(data_structure)
> 
> In the rest of the description and testing discussion, let N be the number of entries in the scheduler.
> 
> The data structure that has been replaced in sched.c is a sorted doubly linked list (DLL).  
> 
> With the DLL, adding an entry into the scheduler runs in O(N) time.  If the existing entries have an even distribution over time, then the average case is that N/2 element comparisons will be required to find the proper place to insert the new entry.  In the worst case, it's N comparisons.  Removing an element from the scheduler runs in O(1) time, since it simply requires unlinking it from its current position in the list.
> 
> With the heap, adding an entry into the scheduler runs in O(log_2 N) time.  Removing an entry is also O(log_2 N).
> 
> So, when switching to the heap implementation, you theoretically gain performance on additions, but take a hit on deletes.  But, how much?  See the "Testing Done" section to find out!
> 
> 
> Diffs
> -----
> 
>   /trunk/include/asterisk/heap.h PRE-CREATION 
>   /trunk/main/Makefile 176215 
>   /trunk/main/heap.c PRE-CREATION 
>   /trunk/main/sched.c 176215 
>   /trunk/tests/test_heap.c PRE-CREATION 
>   /trunk/tests/test_sched.c PRE-CREATION 
> 
> Diff: http://reviewboard.digium.com/r/160/diff
> 
> 
> Testing
> -------
> 
> Since it may not be immediately clear what the benefits would be of this, testing has been done to measure the performance benefit for different values of N.
> 
> Useful things to note:
> 
> 1) An addition into the scheduler also requires a malloc().  A deletion requires a call to free().  
> 
> 2) Both operations require a mutex lock() and unlock().
> 
> So, the testing done was to figure out:
> 
> A) Do the benefits in add() outweigh the penalty in del() ?
> 
> B) At what value of N do these changes make enough of an impact to be noticed compared to how much time it takes for the malloc(), free(), and mutex lock()/unlock() calls?
> 
> The following tests were done on my primary development machine, which has an Intel Core 2 Duo @ 2.33 GHz.  The results were gathered using the "sched test <N>" CLI command from the test_sched_module.
> 
> Test results:
> 
> Average running time for N calls to ast_sched_add():
> 
>              trunk            team/russell/heap
> 
> N=1 million  686.7 seconds    2.7 seconds          
> N=100k       53.8 seconds     .278 seconds
> N=10k        149.8 ms.        29.9 ms.
> N=5k         47.0 ms.         15.0 ms.
> N=1k         3.9 ms.          2.9 ms.
> N=500        1.6 ms.          1.5 ms.
> N=100        0.3 ms.          0.3 ms.
> 
> Average running time for N calls to ast_sched_del():
> 
>              trunk            team/russell/heap
> 
> N=1 million  436.3 ms.        603.0 ms.
> N=100k       42.0 ms.         60.3 ms.
> N=10k        3.4 ms.          4.4 ms.
> N=5k         1.8 ms.          2.2 ms.
> N=1k         0.38 ms.         0.44 ms.
> N=500        0.19 ms.         0.21 ms.
> N=100        0.034 ms.        0.040 ms.
> 
> Total average running time for N calls to ast_sched_add() and ast_sched_del():
> 
>              trunk            team/russell/heap
> 
> N=1 million  686.1 seconds    3.3 seconds
> N=100k       53.8 seconds     .338 seconds
> N=10k        153.2 ms.        34.2 ms.
> N=5k         48.8 ms.         17.2 ms.
> N=1k         4.28 ms.         3.34 ms.
> N=500        1.79 ms.         1.71 ms.
> N=100        0.334 ms.        0.340 ms.
> 
> Average running time of team/russell/heap as a percentage of the average running time from trunk:
> 
>               Percentage of time
> N=1 million   0.4 %
> N=100k        0.6 %
> N=10k         22.3 %
> N=5k          35.2 %
> N=1k          78.0 %
> N=500         95.5 %
> N=100         101.8 %
> 
> Analysis of results:
> 
> It appears that in this test environment, somewhere near N=100 and below, the code in trunk will perform better.  However, at that point, we're talking about very small fractions of a second.  As N gets larger, the code in team/russell/heap clearly starts to perform much better.
> 
> With how the scheduler API is used today, a module such as chan_sip or chan_iax2 could easily have a scheduler context with N in the thousands.  Every registration will result in a scheduler entry.  Also, every active channel will result in at least one, if not more scheduler entries.
> 
> Given these results, I do believe that the changes have proven to be beneficial overall.
> 
> 
> Thanks,
> 
> Russell
> 
>




More information about the asterisk-dev mailing list