[asterisk-biz] Asterisk Performance Results

Trixter aka Bret McDanel trixter at 0xdecafbad.com
Sat Nov 17 11:12:12 CST 2007


On 11/17/07, Talking Voice <talkingvoice at gmail.com> wrote:
>
> Have you seen the stats on rtpproxy http://www.rtpproxy.org it seems
> to be far more impressive than those of Asterisk B2BUA.



It also does less work.  It does not relay media, and those numbers aren't
impressive when you consider that there are other open source projects that,
as a B2BUA without media, can do 30,000 on a lesser box than the Asterisk
test platform.  But since this is the asterisk-biz list I won't comment on
them.  I only bring that up to say that if you aren't shoveling the media
bits around there is less work to do and performance increases.  If you are
a reseller of traffic you can cut your bandwidth costs by going direct to
the carriers and not touching the media, which makes it easier to lower your
rates and be more competitive, since most end users don't care about that,
although other ITSPs most likely would.

150-200 cps is a good number though, much higher than the 8.5, and
while that may be high unless you have a lot of campaign customers, it does
give these stats a run for their money, since call setup is generally more
expensive than just shoveling bits.

1-8 cps also makes it harder to extrapolate.  If you look at the progression
in the pdf, going from 1 to 2 cps doubles the concurrent calls (more or
less), and that keeps progressing off the base until you reach 8.5.  Why was
8.5 the stopping point and not 9, anyway?  It seems like an odd count to stop
at.  Just doing linear math never works here, since management overhead in
this situation will grow exponentially, not linearly, as you add more
channels (context switches, servicing NIC irqs, etc), so you cannot
extrapolate that if you get 3000 legs at 8 cps then at 16 you will have
6000.  My guess, and I would love it if someone could confirm this, is that
if you were to test at 16, 32 and 64 cps the total number of channels would
drop dramatically, and you would see calls fail to start.
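
To make the extrapolation point concrete, here is a rough Python sketch of
the naive linear math being warned against.  The 3000 legs at 8 cps figure
is the hypothetical from the paragraph above, the function names are made up
for illustration, and the two-legs-per-call figure is an assumption about a
B2BUA.  The sketch models no overhead at all, so its numbers are a ceiling
and not a prediction.

```python
def implied_hold_time(legs, cps, legs_per_call=2):
    """Average call duration (seconds) implied by a steady-state test.

    Uses Little's law: concurrent calls = arrival rate * average duration.
    legs_per_call=2 is an assumption (one inbound and one outbound leg).
    """
    concurrent_calls = legs / legs_per_call
    return concurrent_calls / cps


def naive_linear_legs(cps, hold_time, legs_per_call=2):
    """Legs you would see if overhead did not grow with load (it does)."""
    return cps * hold_time * legs_per_call


if __name__ == "__main__":
    # Hypothetical starting point taken from the text: 3000 legs at 8 cps.
    hold = implied_hold_time(legs=3000, cps=8)
    for rate in (8, 16, 32, 64):
        ceiling = naive_linear_legs(rate, hold)
        print(f"{rate:>2} cps -> at most {ceiling:.0f} legs")
```

Anything a real test delivers below these ceilings is the management
overhead showing up, which is why testing at the higher rates would be more
telling than extrapolating from 8.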

Now there are other factors.  To be fair I will comment on Sippy since you
brought it up :)  They say "per server" but do not define the server in
question, whereas in the Asterisk report the server was specifically defined
(although it would have been nicer to have more details on which versions of
libs were installed, sysctl settings, etc).  A server could be, for example,
a Sun with 16 cores or whatever.  That should get more performance than a
dual Woodcrest 5140 box.

Then we get into more details of the Woodcrest CPU, something I neglected to
comment on last night when trying to say that they are different from other
Xeon chips.  The Woodcrest (dual core) and Clovertown (quad core) are
branded as Xeons, but there is a huge performance difference between these
and other Xeon chips.  For one thing they can do 4 instructions per clock
tick (per Intel; dunno how this works exactly since it's CISC and not RISC,
so instructions can take variable numbers of ticks to complete).  This is
much faster than previous Xeons.  They also require a 1333MHz FSB instead of
1066 or slower.

Then you can get into the Linux kernel.  2.6.23 is supposed to have a new
scheduler that is faster, and the low latency patches for faster io should
be in place as well by that version.  You can even get into interrupt
coalescence where, instead of a 1:1 irq per packet, if the card and driver
support it you can buffer data until a timeout has occurred or a certain
amount of data has accumulated, which adds a tiny amount of latency
(generally well below 2ms) but increases overall performance.  All of these
have been available as independent patches, but will see their way into
2.6.23, or so the rumor mill goes.  I don't know if CentOS has applied any
of these patches in what they shipped.
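
As a back-of-the-envelope illustration of the coalescing trade-off, the
Python sketch below compares one irq per packet with a timeout-or-count
scheme.  The 20 ms packetization (50 packets per second per leg), the 3000
legs, and the coalescing values are all assumptions picked for illustration,
not measurements, and the model assumes evenly spaced packets, which real
traffic is not.  On Linux these knobs are typically exposed through
`ethtool -C` (rx-usecs and rx-frames) where the card and driver support
them; the parameter names here only loosely mirror those.

```python
def irqs_per_second(pps, max_frames=None, timeout_us=None):
    """Rough receive irq rate for evenly spaced packets.

    Without coalescing every packet raises an irq.  With coalescing an irq
    fires when max_frames packets are buffered or timeout_us has elapsed,
    whichever happens first, so the faster trigger sets the rate.
    """
    candidates = []
    if max_frames is not None:
        candidates.append(pps / max_frames)
    if timeout_us is not None:
        # The timer cannot fire more often than packets actually arrive.
        candidates.append(min(pps, 1_000_000 / timeout_us))
    return max(candidates) if candidates else pps


def worst_case_added_latency_us(pps, max_frames=None, timeout_us=None):
    """Longest time the first buffered packet waits before the irq fires."""
    bounds = []
    if timeout_us is not None:
        bounds.append(timeout_us)
    if max_frames is not None:
        bounds.append((max_frames - 1) * 1_000_000 / pps)
    return min(bounds) if bounds else 0.0


if __name__ == "__main__":
    pps = 3000 * 50  # assumed: 3000 legs, one RTP packet every 20 ms each
    print("no coalescing:", irqs_per_second(pps), "irq/s")
    print("coalesced:    ", irqs_per_second(pps, 32, 100), "irq/s")
    print("added latency:", worst_case_added_latency_us(pps, 32, 100), "us")
```

Under these assumptions the irq rate drops by more than an order of
magnitude while the added delay stays far under the 2ms mentioned above.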

And finally we come to GCC, which can make a dramatic difference in the way
it optimizes, or rather doesn't.  If you have many .c files it will not
optimize between them nearly as much as if it's all one big .c file.  If you
took all the Asterisk "core" code and made one huge .c file and built it
that way you would probably see some performance improvements, although
code manageability would be diminished (this could be done by a build
script, for example, and would likely require some cleanup to deal with
includes and other things).  There is a rumor that SQLite, for example, did
this and saw a 10% performance increase.  Yeah, it's a 68,000 line .c file
or something, but it's faster :)  I have not confirmed that, it's just a
rumor that I heard somewhere, and I hear a lot of stuff; it can't all be
true.  I don't know how many files rtpproxy is, or how they are structured,
but depending on that it could optimize better for that reason alone.
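
For what it's worth, the build-script idea could look something like the
Python sketch below.  It is only a sketch: the function names and file names
are hypothetical, it is not how Asterisk or any other project actually
builds, and it does none of the cleanup mentioned above (duplicate static
names, clashing macros, include order), which would still have to be sorted
out by hand.

```python
from pathlib import Path


def amalgamate(named_sources):
    """Join (name, text) pairs of C source into one translation unit.

    A #line directive before each file keeps compiler diagnostics pointing
    at the original file name instead of the combined file.
    """
    parts = []
    for name, text in named_sources:
        parts.append(f'#line 1 "{name}"\n')
        parts.append(text if text.endswith("\n") else text + "\n")
    return "".join(parts)


def write_amalgamation(paths, output):
    """Read the given .c files in order and write the combined file."""
    sources = [(Path(p).as_posix(), Path(p).read_text()) for p in paths]
    Path(output).write_text(amalgamate(sources))


if __name__ == "__main__":
    import sys

    # usage (hypothetical): python amalgamate.py all.c first.c second.c ...
    if len(sys.argv) >= 3:
        write_amalgamation(sys.argv[2:], sys.argv[1])
```

The combined file would then be compiled as a single unit, so the compiler
gets to see every function at once when deciding what to inline.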

I do not know if Asterisk will compile with anything but GCC, I haven't
tried, but if it does, other compilers may optimize the code better.  The
-march settings can also cause a bit of a performance change, since you can
get more optimized binaries for your specific platform.

So when doing tests I think that it's important to talk about how the
program was built and what hardware it was running on, otherwise you are
comparing apples to quasars.  The URL that you provided does not give as
much detail as the Asterisk test, which I think is itself still missing some
details that could dramatically impact performance, so you can never be sure
that you are comparing 'head to head'.


-- 
Trixter http://www.0xdecafbad.com     Bret McDanel
Belfast +44 28 9099 6461        US +1 516 687 5200
http://www.trxtel.com the phone company that pays you!