[asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding gains at high call volumes - Low volume benchmarks

Matthew J. Roth mroth at imminc.com
Fri May 25 16:13:59 MST 2007


List users,

This post contains the benchmarks for Asterisk at low call volumes on 
similar single and dual-core servers.  I'd appreciate it greatly if you 
took the time to read and comment on it.

Thank you,

Matthew Roth
InterMedia Marketing Solutions
Software Engineer and Systems Developer


Conclusions
-----------
I'm presenting the conclusions first, because they are the most 
important part of the benchmarking.  If you like details and numbers, 
scroll down.

I've drawn three conclusions from this set of benchmarks.

  1. At low call volumes, the dual-core server outperforms the 
single-core server by the expected margin.
  2. Calls bridged to an agent are more CPU intensive than calls 
listening to audio via the Playback() application or calls in queue.  
This is expected, because they involve more SIP channels and more work 
is done on the RTP frames (bridging, recording, etc.).
  3. For all call types, the majority of the CPU time is spent in the 
kernel (servicing system calls, etc.).  I've observed this to be true at 
all call volumes on our production server, with the ratio of system to 
user time sometimes reaching roughly 20 to 1.  This suggests that the 
popular perception that Asterisk doesn't scale well because of its 
extensive use of linked lists doesn't tell the whole story.
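
As a rough illustration of the third conclusion, the sketch below computes 
the ratio of %system to %user for the highest-volume row of each table in 
"The Numbers".  The function name and the choice of rows are assumptions 
made for this example; the percentages are copied from the tables.

```python
def system_to_user_ratio(user_pct, system_pct):
    """Return how many times more CPU time went to the kernel than to user space."""
    if user_pct <= 0:
        raise ValueError("user_pct must be positive to form a ratio")
    return system_pct / user_pct


# (label, %user, %system) from the last row of each benchmark table
samples = [
    ("DC Playback, 10 calls", 0.09, 0.33),
    ("SC Playback, 10 calls", 0.18, 0.75),
    ("DC In queue, 10 calls", 0.05, 0.27),
    ("SC In queue, 10 calls", 0.11, 0.48),
    ("DC Bridged, 3 calls", 0.03, 0.16),
    ("SC Bridged, 3 calls", 0.07, 0.36),
]

for label, user, system in samples:
    print(f"{label}: {system_to_user_ratio(user, system):.1f} to 1")
```

At these low volumes the ratio sits at roughly 4 to 5 to 1, which is well 
below the 20 to 1 sometimes seen on the production server.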

So far there are no surprises, but over the next week or so I'll be 
collecting data that I expect to reveal that at high call volumes 
(200-300 concurrent calls) the idle percentage on both machines starts 
to approach the same value.  In the end, my goal is to break through 
(or, at the least, understand) this scaling issue, so I welcome all 
forms of critique.  It's quite possible that the problem lies in my 
setup or that I'm missing something obvious, but I suspect it is deeper 
than that.

Benchmarking Methodology
------------------------
I collected each type of data as follows.

 - Active channel and call counts: 'asterisk -rx "show channels"' and 
'asterisk -rx "sip show channels"'
 - Thread counts: 'ps -eLf' and 'ps axms'
 - Idle time values: 'sar 30 1'
 - Average CPU utilization per call: (startIdle - endIdle) / numCalls
 
The servers were rebooted between tests.
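
The per-call formula above can be sketched as follows.  This is only an 
illustration: the function name is hypothetical, and it assumes startIdle 
is the %idle reading at 0 calls and endIdle is the reading at the highest 
call count, which matches the averages reported in "The Numbers".

```python
def avg_cpu_per_call(start_idle, end_idle, num_calls):
    """(startIdle - endIdle) / numCalls, as a percentage of total CPU capacity."""
    if num_calls <= 0:
        raise ValueError("num_calls must be positive")
    return (start_idle - end_idle) / num_calls


# %idle at 0 calls and at 10 calls, copied from the two Playback() tables
dc_playback = avg_cpu_per_call(99.98, 99.58, 10)
sc_playback = avg_cpu_per_call(99.98, 99.07, 10)

print(f"DC: {dc_playback:.3f}%  SC: {sc_playback:.3f}%")
```

Note that each percentage is relative to the total CPU capacity of the 
server it was measured on, so the DC and SC figures are not directly 
comparable without accounting for the different CPU counts.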

Call Types
----------
I tested the following three call types.

 - Incoming SIP to the Playback() application
   - 1 active SIP channel per call
     - From the originating Asterisk server to the Playback() application

 - Incoming SIP to the Queue() application - In queue
   - 1 active SIP channel per call
     - From the originating Asterisk server to the Queue() application

 - Incoming SIP to the Queue() application - Bridged to an agent
   - 2 active SIP channels per call
     - From the originating Asterisk server to the Queue() application
     - Bridged from the Queue() application to the agent
 
All calls were pure VoIP (SIP/RTP) and originated from another Asterisk 
server.  Calls that were bridged to agents terminated at SIP hardphones 
(Snom 320s) and were recorded to a RAM disk via the Monitor() 
application.  All calls used the uLaw codec, and all audio files 
(including the call recordings, the native MOH, and the periodic queue 
announcements, which played approximately every 60 seconds) were in the 
PCM file format.  There was no transcoding, protocol bridging, or TDM 
hardware involved on the servers being benchmarked.

A Note on Asterisk and Threads
------------------------------
On both systems, a freshly started Asterisk process consisted of 10 
threads.  Some events, such as performing an 'asterisk -rx reload', 
triggered the creation of a new persistent thread.  The benchmarking 
revealed that, in general, the Asterisk process consists of 10-15 
persistent background threads plus exactly 1 additional thread per 
active call.

This means that even at modest call volumes, Asterisk has enough 
runnable threads to utilize all of the CPUs in most modern PC-based 
servers.
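
The thread accounting described above can be checked with a short script.  
The following is a minimal sketch, assuming the standard 'ps -eLf' column 
layout (one row per thread, CMD as the last column).  The sample text is 
generated for illustration and is not output captured from the benchmark 
servers; the function names and the default of 10 persistent threads are 
assumptions as well.

```python
def count_threads(ps_output, command="asterisk"):
    """Count rows (one per thread) in 'ps -eLf' output whose CMD mentions command."""
    lines = ps_output.strip().splitlines()
    cmd_index = lines[0].split().index("CMD")
    count = 0
    for line in lines[1:]:
        # CMD is the last column and may contain spaces, so limit the split
        fields = line.split(None, cmd_index)
        if len(fields) > cmd_index and command in fields[cmd_index]:
            count += 1
    return count


def estimate_active_calls(thread_count, persistent_threads=10):
    """One thread per active call on top of the persistent background threads."""
    return max(thread_count - persistent_threads, 0)


# Illustrative input only: 12 asterisk threads plus one unrelated process
header = "UID PID PPID LWP C NLWP STIME TTY TIME CMD"
rows = [
    f"root 2101 1 {2101 + i} 0 12 09:00 ? 00:00:00 /usr/sbin/asterisk"
    for i in range(12)
]
rows.append("root 1 0 1 0 1 08:59 ? 00:00:01 init [3]")
sample = "\n".join([header] + rows)

threads = count_threads(sample)
print(threads, "threads,", estimate_active_calls(threads), "estimated active calls")
```

Because the number of persistent threads varies between 10 and 15, the 
estimate is only as good as the baseline taken on an idle system.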

Server Profiles
---------------
The servers I performed the benchmarking on are described below.  Note 
that the CPUs support hyperthreading, but it is disabled.  This is 
reflected in the CPU count, which is the number of physical cores 
available to the OS.

  Short Name: DC
Manufacturer: Dell Computer Corporation
Product Name: PowerEdge 6850
  Processors: Four Dual-Core Intel Xeon MP CPUs at 3.00GHz
   CPU Count: 8
   FSB Speed: 800 MHz
          OS: Fedora Core 3 - 2.6.13-ztdummy SMP x86_64 Kernel
Asterisk Ver: ABE-B.1-3

  Short Name: SC
Manufacturer: Dell Computer Corporation
Product Name: PowerEdge 6850
  Processors: Four Single-Core Intel Xeon MP CPUs at 3.16GHz
   CPU Count: 4
   FSB Speed: 667 MHz
          OS: Fedora Core 3 - 2.6.13-ztdummy SMP x86_64 Kernel
Asterisk Ver: ABE-B.1-3

The kernel is a vanilla 2.6.13 kernel with enhanced realtime clock 
support and a timer frequency of 1000 Hz (earning it the EXTRAVERSION of 
'-ztdummy').  I am aware that the 2.6.17 kernel introduced multi-core 
scheduler support, but it exhibited negligible gains in the kernel build 
benchmark.  Nonetheless, I am open to any tips regarding kernel versions 
and configuration options.

At the software level, the servers are identical.  They are both running 
the same version of Asterisk Business Edition, and the Fedora Core 3 
installation was performed from the bare metal using the same install 
document and a local source for the update RPMs.

The Numbers
-----------

DC - Incoming SIP to the Playback() application
===============================================
calls   %user   %system   %iowait     %idle
    0    0.00      0.01      0.01     99.98
    1    0.02      0.04      0.00     99.94
    2    0.02      0.06      0.00     99.92
    3    0.03      0.11      0.00     99.86
    4    0.04      0.13      0.00     99.83
    5    0.05      0.16      0.00     99.80
    6    0.05      0.20      0.00     99.75
    7    0.07      0.24      0.00     99.70
    8    0.07      0.25      0.00     99.67
    9    0.08      0.27      0.00     99.65
   10    0.09      0.33      0.00     99.58

Average CPU utilization per call: 0.040% (~9.6 MHz of 24 GHz total)

SC - Incoming SIP to the Playback() application
===============================================
calls   %user   %system   %iowait     %idle
    0    0.01      0.02      0.00     99.98
    1    0.02      0.10      0.00     99.88
    2    0.03      0.17      0.00     99.80
    3    0.06      0.21      0.00     99.73
    4    0.08      0.28      0.00     99.63
    5    0.10      0.34      0.01     99.55
    6    0.11      0.48      0.00     99.41
    7    0.14      0.49      0.00     99.37
    8    0.16      0.57      0.00     99.28
    9    0.17      0.63      0.01     99.19
   10    0.18      0.75      0.00     99.07

Average CPU utilization per call: 0.091% (~11.5 MHz of 12.64 GHz total)

DC - Incoming SIP to the Queue() application - In queue
=======================================================
calls   %user   %system   %iowait     %idle
    0    0.00      0.01      0.00     99.99
    1    0.01      0.03      0.00     99.96
    2    0.01      0.05      0.00     99.94
    3    0.01      0.08      0.00     99.91
    4    0.02      0.10      0.00     99.88
    5    0.03      0.12      0.00     99.84
    6    0.04      0.16      0.00     99.80
    7    0.03      0.17      0.00     99.80
    8    0.04      0.20      0.00     99.76
    9    0.03      0.22      0.00     99.75
   10    0.05      0.27      0.00     99.68

Average CPU utilization per call: 0.031% (~7.4 MHz of 24 GHz total)

SC - Incoming SIP to the Queue() application - In queue
=======================================================
calls   %user   %system   %iowait     %idle
    0    0.02      0.02      0.00     99.96
    1    0.03      0.07      0.00     99.91
    2    0.03      0.13      0.00     99.83
    3    0.04      0.18      0.00     99.78
    4    0.05      0.23      0.00     99.72
    5    0.06      0.27      0.00     99.67
    6    0.07      0.33      0.00     99.60
    7    0.09      0.38      0.00     99.53
    8    0.09      0.40      0.00     99.51
    9    0.11      0.46      0.01     99.43
   10    0.11      0.48      0.00     99.41

Average CPU utilization per call: 0.055% (~7.0 MHz of 12.64 GHz total)

DC - Incoming SIP to the Queue() application - Bridged to an agent
==================================================================
calls   %user   %system   %iowait     %idle
    0    0.00      0.01      0.00     99.99
    1    0.01      0.06      0.00     99.93
    2    0.02      0.14      0.00     99.84
    3    0.03      0.16      0.00     99.81
    
Average CPU utilization per call: 0.060% (~14.4 MHz of 24 GHz total)
   
SC - Incoming SIP to the Queue() application - Bridged to an agent
==================================================================
calls   %user   %system   %iowait     %idle
    0    0.01      0.02      0.00     99.98
    1    0.02      0.16      0.00     99.82
    2    0.04      0.28      0.00     99.68
    3    0.07      0.36      0.00     99.57

Average CPU utilization per call: 0.137% (~17.3 MHz of 12.64 GHz total)
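
To put the two servers side by side, the sketch below divides the SC 
per-call percentage by the DC per-call percentage for each call type.  The 
percentages are copied from the tables above; the dictionary layout and 
function name are assumptions made for this example.

```python
# Average CPU utilization per call (% of total CPU), from the tables above
per_call = {
    "Playback()": {"DC": 0.040, "SC": 0.091},
    "Queue() - in queue": {"DC": 0.031, "SC": 0.055},
    "Queue() - bridged to an agent": {"DC": 0.060, "SC": 0.137},
}


def sc_to_dc_ratio(costs):
    """How many times larger a call's share of total CPU is on SC than on DC."""
    return costs["SC"] / costs["DC"]


for call_type, costs in per_call.items():
    print(f"{call_type}: SC share is {sc_to_dc_ratio(costs):.2f}x the DC share")
```

The ratios fall between roughly 1.8 and 2.3.  Assuming the "expected 
margin" in the first conclusion refers to DC having twice as many cores as 
SC, a ratio near 2 is consistent with that conclusion.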


