[asterisk-biz] Asterisk Performance Results

Trixter aka Bret McDanel trixter at 0xdecafbad.com
Sat Nov 17 11:26:34 CST 2007


On 11/17/07, Matthew Rubenstein <email at mattruby.com> wrote:
>
>         Ultimately a standard test suite like the one so helpfully
> published in
> this thread would benchmark the baseline config, then the transcoding
> config, as this one did. But for each of the combinations of the various
> codecs. Then a benchmark adding each of various options, like
> conferencing, recording, etc. That grid would be repeated on each of
> directly comparable HW configs, like a single CPU with single core at
> x-GHz, multiple of those CPUs, the same benchmarks repeated for
> single/multiple multicore CPUs, each at increasing GHz.
>
>         That 3D stack of benchmark data could be crunched to find the
> linear
> (or other simple formula) ranges for scaling Asterisk capacity for



It will not scale linearly, though; it gets slower once the contention rate
rises, i.e. more threads wanting to run than there are available CPUs. It is
also not just a function of clock speed. One thing the Woodcrest (which the
5140 is) has in its favour is that it can do more at the same clock speed
than other Xeons. So instruction optimization, cache (and is the cache warm
or not?), branch prediction failures (generally a compiler issue), and
everything else influence how much it can do in a given clock cycle.
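
To illustrate the diminishing returns, here is a toy model (not measured
Asterisk data): Amdahl's law with an invented parallel fraction, standing in
for work serialized behind locks.

    # Amdahl's law: if some fraction of each call's work is serialized
    # (e.g. behind a shared lock), adding CPUs gives diminishing returns.
    # The 0.95 parallel fraction below is a made-up figure.

    def amdahl_speedup(cpus: int, parallel_fraction: float) -> float:
        serial = 1.0 - parallel_fraction
        return 1.0 / (serial + parallel_fraction / cpus)

    for n in (1, 2, 4, 8, 16):
        print(f"{n:2d} CPUs -> {amdahl_speedup(n, 0.95):.2f}x "
              f"(linear would be {n}x)")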

Then you have other issues, like ALOC (average length of call) being high or
low. Low ALOCs cause more call setup/teardown requests, and are more common
with one type of traffic than another. Once you measure all the things that
make up the base you can start to do a per-feature test, but again a
conference with 2 people will perform differently than one with 20, which
will perform differently based on which of the 3 major conferencing modules
you use.
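
The ALOC effect is just arithmetic: for a fixed number of concurrent
channels, shorter calls mean proportionally more setups/teardowns per
second, which is where much of the signalling cost lives. The numbers below
are made up.

    # Steady-state call attempts per second needed to keep a given
    # number of channels busy at a given average length of call.

    def setups_per_second(concurrent_channels: int, aloc_seconds: float) -> float:
        return concurrent_channels / aloc_seconds

    for aloc in (30.0, 180.0, 600.0):  # e.g. telemarketing vs. conference traffic
        print(f"ALOC {aloc:5.0f}s -> "
              f"{setups_per_second(500, aloc):6.2f} cps for 500 channels")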

But yes, if the base is done in a good way, with a wide range of criteria,
you can see what the cost of the base is; then by taking fewer samples for
new features you can try to extrapolate from that base. For example, if you
did 1, 2, 4, 8, 16, 32 and 64 cps tests as a base and rated performance,
then did 1, 8 and 64 cps tests for a specific module (let's say
conferencing, where you tossed 2 participants in one run, 8 in another and
maybe 30 in a final one), you could start to see the relative impacts.
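
A rough sketch of that extrapolation (all numbers invented): average the
base-vs-feature ratio at the rates you did sample, then project it across
the full base curve.

    # cps -> cpu%: a dense base curve plus sparse samples with the feature on.
    base = {1: 2.0, 2: 4.1, 4: 8.5, 8: 17.0, 16: 36.0, 32: 78.0, 64: 170.0}
    feature = {1: 2.6, 8: 22.0, 64: 230.0}  # e.g. conferencing enabled

    # Average multiplicative overhead at the sampled rates.
    overhead = sum(feature[r] / base[r] for r in feature) / len(feature)

    for cps, cpu in sorted(base.items()):
        print(f"{cps:2d} cps: base {cpu:5.1f}%  "
              f"est. with feature {cpu * overhead:5.1f}%")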

But then again it's never that simple; still, it could give you a much
better feel, although it is a much more involved test.

Since the criteria for what was done are public, if others were to perform
similar testing then slow spots could be identified, as well as some better
quality metrics. I have commented on what I would like to see in addition,
such as one or more programs monitoring RTP for quality issues (possibly via
port replication; with multiple tools you are more likely to have a way to
average the results, since a missed packet may be the switch not
replicating, the program not catching it, or whatever).
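
A minimal sketch of one such monitor, assuming you already get raw RTP
payloads from a mirror-port capture (e.g. via a pcap library): track the
16-bit RTP sequence number and count gaps. A gap seen here may be real loss,
the switch failing to replicate, or the capture tool dropping packets, which
is exactly why running several tools and comparing helps.

    import struct

    class RtpGapCounter:
        def __init__(self) -> None:
            self.expected = None  # next sequence number we expect
            self.lost = 0         # packets apparently missing

        def feed(self, rtp_payload: bytes) -> None:
            if len(rtp_payload) < 4:
                return            # too short to be an RTP header
            # RTP header: byte 0 = V/P/X/CC, byte 1 = M/PT, bytes 2-3 = sequence.
            (seq,) = struct.unpack_from("!H", rtp_payload, 2)
            if self.expected is not None and seq != self.expected:
                # Note: out-of-order packets will inflate this toy count.
                self.lost += (seq - self.expected) % 65536  # wraps at 2**16
            self.expected = (seq + 1) % 65536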


>         That's the kind of benchmarking that I'd expect Digium to do. They
> probably have already done at least a limited subset, but haven't


It would certainly help them identify where the code is a little slow, so
they can work on that. Profiling is good, but sometimes you also need to use
it in more real-world situations to identify exactly what is going on that
causes problems. One call can work for profiling, but 1000 can show lock
contention and other issues that a single call would never reveal.
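
A toy demonstration of that effect (nothing Asterisk-specific): the same
total amount of work behind one shared lock gets slower as more threads
contend for it, which a single-threaded profile would never show.

    import threading, time

    LOCK = threading.Lock()
    TOTAL_OPS = 200_000

    def worker(ops: int) -> None:
        for _ in range(ops):
            with LOCK:  # stands in for any shared resource, e.g. a channel list
                pass

    def run(threads: int) -> float:
        ts = [threading.Thread(target=worker, args=(TOTAL_OPS // threads,))
              for _ in range(threads)]
        start = time.perf_counter()
        for t in ts: t.start()
        for t in ts: t.join()
        return time.perf_counter() - start

    for n in (1, 4, 16, 64):
        print(f"{n:2d} threads: {run(n):.3f}s for {TOTAL_OPS} lock acquisitions")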


-- 
Trixter http://www.0xdecafbad.com     Bret McDanel
Belfast +44 28 9099 6461        US +1 516 687 5200
http://www.trxtel.com the phone company that pays you!