[asterisk-users] Scaling Asterisk: Dual-Core CPUs not yielding
gains at high call volumes - Low volume benchmarks
Matthew J. Roth
mroth at imminc.com
Fri May 25 16:13:59 MST 2007
List users,
This post contains the benchmarks for Asterisk at low call volumes on
similar single and dual-core servers. I'd appreciate it greatly if you
took the time to read and comment on it.
Thank you,
Matthew Roth
InterMedia Marketing Solutions
Software Engineer and Systems Developer
Conclusions
-----------
I'm presenting the conclusions first, because they are the most
important part of the benchmarking. If you like details and numbers,
scroll down.
I've drawn three conclusions from this set of benchmarks.
1. At low call volumes, the dual-core server outperforms the
single-core server by the expected margin.
2. Calls bridged to an agent are more CPU intensive than calls
listening to audio via the Playback() application or calls in queue.
This is expected, because they involve more SIP channels and more work
is done on the RTP frames (bridging, recording, etc.).
3. For all call types, the majority of the CPU time is spent in the
kernel (servicing system calls, etc.). I've observed this to be true at
all call volumes on our production server, with the ratio sometimes in
the range of 20 to 1. This may suggest that the popular perception that
Asterisk doesn't scale well because of its extensive use of linked lists
doesn't tell the whole story.
So far there are no surprises, but over the next week or so I'll be
collecting data that I expect to reveal that at high call volumes
(200-300 concurrent calls) the idle percentage on both machines starts
to approach the same value. In the end, my goal is to break through
(or, at the least, understand) this scaling issue, so I welcome all
forms of critique. It's quite possible that the problem lies in my
setup or that I'm missing something obvious, but I suspect it is deeper
than that.
Benchmarking Methodology
------------------------
I collected each type of data as follows.
- Active channel and call counts: 'asterisk -rx "show channels"' and
'asterisk -rx "sip show channels"'
- Thread counts: 'ps -eLf' and 'ps axms'
- Idle time values: 'sar 30 1'
- Average CPU utilization per call: (startIdle - endIdle) / numCalls
The servers were rebooted between tests.
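The per-call formula above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the sample values are the 0-call and 10-call %idle readings from the DC Playback() table further down.

```python
def avg_cpu_per_call(start_idle, end_idle, num_calls):
    """Average CPU utilization per call, in percent of total CPU.

    Implements (startIdle - endIdle) / numCalls from the methodology list.
    """
    if num_calls <= 0:
        raise ValueError("num_calls must be positive")
    return (start_idle - end_idle) / num_calls


# DC, Playback(): %idle fell from 99.98 at 0 calls to 99.58 at 10 calls.
print(f"{avg_cpu_per_call(99.98, 99.58, 10):.3f}%")  # prints 0.040%
```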
Call Types
----------
I tested the following three call types.
- Incoming SIP to the Playback() application
    - 1 active SIP channel per call
        - From the originating Asterisk server to the Playback() application
- Incoming SIP to the Queue() application - In queue
    - 1 active SIP channel per call
        - From the originating Asterisk server to the Queue() application
- Incoming SIP to the Queue() application - Bridged to an agent
    - 2 active SIP channels per call
        - From the originating Asterisk server to the Queue() application
        - Bridged from the Queue() application to the agent
All calls were pure VoIP (SIP/RTP) and originated from another Asterisk
server. Calls that were bridged to agents terminated at SIP hardphones
(Snom 320s) and were recorded to a RAM disk via the Monitor()
application. All calls were in the uLaw codec and all audio files
(including the call recordings, the native MOH, and the periodic queue
announcements which played approximately every 60 seconds) were in the
PCM file format. There was no transcoding, protocol bridging, or TDM
hardware involved on the servers being benchmarked.
A Note on Asterisk and Threads
------------------------------
On both systems, a freshly started Asterisk process consisted of 10
threads. Some events, such as performing an 'asterisk -rx reload',
triggered the creation of a new persistent thread. The benchmarking
revealed that, in general, the Asterisk process will consist of 10-15
persistent background threads plus exactly 1 additional thread per
active call.
This means that even at modest call volumes, Asterisk will utilize all
of the CPUs in most modern PC-based servers.
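As a rough sketch of the thread accounting described above, the following Python counts Asterisk threads in captured 'ps -eLf' output (one line per thread) and subtracts a baseline of persistent threads to estimate active calls. The function names are hypothetical. It assumes the usual procps column layout with CMD as the tenth column, and the default baseline of 10 is the freshly started value reported above; a long-running process may sit anywhere in the 10-15 range.

```python
def count_threads(ps_elf_output, command="asterisk"):
    """Count the threads (LWP rows) in 'ps -eLf' output that belong to command."""
    count = 0
    for line in ps_elf_output.splitlines()[1:]:  # skip the header row
        # Assumed columns: UID PID PPID LWP C NLWP STIME TTY TIME CMD
        fields = line.split(None, 9)
        if len(fields) < 10:
            continue
        executable = fields[9].split()[0]
        if executable.rsplit("/", 1)[-1] == command:
            count += 1
    return count


def estimated_calls(thread_count, persistent_threads=10):
    """Estimate active calls as total threads minus the persistent baseline."""
    return max(thread_count - persistent_threads, 0)
```

For example, 210 Asterisk threads against a baseline of 10 would suggest about 200 active calls, already far more runnable threads than either server has CPUs.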
Server Profiles
---------------
The servers I performed the benchmarking on are described below. Note
that the CPUs support hyperthreading, but it is disabled. This is
reflected in the CPU count, which is the number of physical processors
available to the OS.
Short Name: DC
Manufacturer: Dell Computer Corporation
Product Name: PowerEdge 6850
Processors: Four Dual-Core Intel Xeon MP CPUs at 3.00GHz
CPU Count: 8
FSB Speed: 800 MHz
OS: Fedora Core 3 - 2.6.13-ztdummy SMP x86_64 Kernel
Asterisk Ver: ABE-B.1-3
Short Name: SC
Manufacturer: Dell Computer Corporation
Product Name: PowerEdge 6850
Processors: Four Single-Core Intel Xeon MP CPUs at 3.16GHz
CPU Count: 4
FSB Speed: 667 MHz
OS: Fedora Core 3 - 2.6.13-ztdummy SMP x86_64 Kernel
Asterisk Ver: ABE-B.1-3
The kernel is a vanilla 2.6.13 kernel with enhanced realtime clock
support and a timer frequency of 1000 Hz (earning it the EXTRAVERSION of
'-ztdummy'). I am aware that the 2.6.17 kernel introduced multi-core
scheduler support, but it exhibited negligible gains in the kernel build
benchmark. Nonetheless, I am open to any tips regarding kernel versions
and configuration options.
At the software level, the servers are identical. They are both running
the same version of Asterisk Business Edition, and the Fedora Core 3
installation was performed from the bare metal using the same install
document and a local source for the update RPMs.
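To relate a per-call percentage to the hardware profiles above, one reading is to apply it to the aggregate clock rate of the machine (CPU count times clock speed). The sketch below makes that assumption explicit. The function names are hypothetical, and nominal clock speeds are used, so results can differ slightly from figures based on the measured CPU frequency.

```python
def aggregate_mhz(cpu_count, clock_mhz):
    """Total clock capacity of the server in MHz (assumes identical CPUs)."""
    return cpu_count * clock_mhz


def per_call_mhz(pct_per_call, cpu_count, clock_mhz):
    """Convert a per-call share of total CPU, given in percent, to MHz."""
    return pct_per_call / 100.0 * aggregate_mhz(cpu_count, clock_mhz)


# DC profile: 8 CPUs at a nominal 3000 MHz; 0.040% of total CPU per call.
print(aggregate_mhz(8, 3000))                  # prints 24000
print(f"{per_call_mhz(0.040, 8, 3000):.1f}")   # prints 9.6
```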
The Numbers
-----------
DC - Incoming SIP to the Playback() application
===============================================
calls   %user   %system   %iowait   %idle
    0    0.00      0.01      0.01   99.98
    1    0.02      0.04      0.00   99.94
    2    0.02      0.06      0.00   99.92
    3    0.03      0.11      0.00   99.86
    4    0.04      0.13      0.00   99.83
    5    0.05      0.16      0.00   99.80
    6    0.05      0.20      0.00   99.75
    7    0.07      0.24      0.00   99.70
    8    0.07      0.25      0.00   99.67
    9    0.08      0.27      0.00   99.65
   10    0.09      0.33      0.00   99.58
Average CPU utilization per call: 0.040% (~9.60 MHz)
SC - Incoming SIP to the Playback() application
===============================================
calls   %user   %system   %iowait   %idle
    0    0.01      0.02      0.00   99.98
    1    0.02      0.10      0.00   99.88
    2    0.03      0.17      0.00   99.80
    3    0.06      0.21      0.00   99.73
    4    0.08      0.28      0.00   99.63
    5    0.10      0.34      0.01   99.55
    6    0.11      0.48      0.00   99.41
    7    0.14      0.49      0.00   99.37
    8    0.16      0.57      0.00   99.28
    9    0.17      0.63      0.01   99.19
   10    0.18      0.75      0.00   99.07
Average CPU utilization per call: 0.091% (~11.52 MHz)
DC - Incoming SIP to the Queue() application - In queue
=======================================================
calls   %user   %system   %iowait   %idle
    0    0.00      0.01      0.00   99.99
    1    0.01      0.03      0.00   99.96
    2    0.01      0.05      0.00   99.94
    3    0.01      0.08      0.00   99.91
    4    0.02      0.10      0.00   99.88
    5    0.03      0.12      0.00   99.84
    6    0.04      0.16      0.00   99.80
    7    0.03      0.17      0.00   99.80
    8    0.04      0.20      0.00   99.76
    9    0.03      0.22      0.00   99.75
   10    0.05      0.27      0.00   99.68
Average CPU utilization per call: 0.031% (~7.44 MHz)
SC - Incoming SIP to the Queue() application - In queue
=======================================================
calls   %user   %system   %iowait   %idle
    0    0.02      0.02      0.00   99.96
    1    0.03      0.07      0.00   99.91
    2    0.03      0.13      0.00   99.83
    3    0.04      0.18      0.00   99.78
    4    0.05      0.23      0.00   99.72
    5    0.06      0.27      0.00   99.67
    6    0.07      0.33      0.00   99.60
    7    0.09      0.38      0.00   99.53
    8    0.09      0.40      0.00   99.51
    9    0.11      0.46      0.01   99.43
   10    0.11      0.48      0.00   99.41
Average CPU utilization per call: 0.055% (~6.97 MHz)
DC - Incoming SIP to the Queue() application - Bridged to an agent
==================================================================
calls   %user   %system   %iowait   %idle
    0    0.00      0.01      0.00   99.99
    1    0.01      0.06      0.00   99.93
    2    0.02      0.14      0.00   99.84
    3    0.03      0.16      0.00   99.81
Average CPU utilization per call: 0.060% (~14.40 MHz)
SC - Incoming SIP to the Queue() application - Bridged to an agent
==================================================================
calls   %user   %system   %iowait   %idle
    0    0.01      0.02      0.00   99.98
    1    0.02      0.16      0.00   99.82
    2    0.04      0.28      0.00   99.68
    3    0.07      0.36      0.00   99.57
Average CPU utilization per call: 0.137% (~17.35 MHz)
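The tables above can be cross-checked against the first and third conclusions with a short Python sketch. The per-call percentages and the (%user, %system) pairs are copied from the tables, using the highest call count in each. The names are hypothetical, and treating the "expected margin" as roughly 2x (8 CPUs versus 4) is my assumption rather than something stated in the post.

```python
# Average CPU utilization per call, in percent, from each table's summary line.
PER_CALL_PCT = {
    "Playback": {"DC": 0.040, "SC": 0.091},
    "In queue": {"DC": 0.031, "SC": 0.055},
    "Bridged": {"DC": 0.060, "SC": 0.137},
}

# (%user, %system) at the highest call count in each table.
TOP_ROWS = {
    ("DC", "Playback"): (0.09, 0.33),
    ("SC", "Playback"): (0.18, 0.75),
    ("DC", "In queue"): (0.05, 0.27),
    ("SC", "In queue"): (0.11, 0.48),
    ("DC", "Bridged"): (0.03, 0.16),
    ("SC", "Bridged"): (0.07, 0.36),
}


def sc_to_dc_ratio(call_type):
    """How many times more of its total CPU the SC server spends per call."""
    costs = PER_CALL_PCT[call_type]
    return costs["SC"] / costs["DC"]


def system_to_user_ratio(user_pct, system_pct):
    """Ratio of kernel time to user time for one table row."""
    return system_pct / user_pct


for call_type in PER_CALL_PCT:
    print(f"{call_type}: SC/DC per-call cost = {sc_to_dc_ratio(call_type):.2f}")

for (server, call_type), (user, system) in TOP_ROWS.items():
    ratio = system_to_user_ratio(user, system)
    print(f"{server} {call_type}: system/user = {ratio:.1f}")
```

Under these inputs the SC server spends roughly 1.8 to 2.3 times as much of its total CPU per call as the DC server, and %system exceeds %user by a factor of about 3 to 5 in every row used.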