[Asterisk-Dev] Profiling results: Asterisk processing 5000 concurrent IAX registrations

steve at daviesfam.org
Mon Oct 31 02:17:23 MST 2005


Hi,

For anyone who's interested, I've started doing some profiling using 
oprofile.

Here's a first result.  The test machine is a P4 3.0 GHz HT box, handling 
5000 concurrent incoming IAX2 registrations.  That's about 50 regreqs per 
second.

The box has 2 Sirrix quad-BRI boards, a TE405P and a TDM13B.  The various 
basic-rate and primary-rate spans are looped and up, but there are no 
active calls.

I'm using realtime against a MySQL database on another system.

Overall the box is using about 25% userspace CPU.  The incoming regreqs 
are 90 kbit/s on the Ethernet; outgoing traffic is 65 kbit/s.

I've got 157kb/s going out to the database, 192kb/s coming back.

I ran oprofile with "--separate=kernel", which also shows functions in 
libraries and the kernel that run "on behalf" of Asterisk.

Here are the top lines of the report:

CPU: P4 / Xeon with 2 hyper-threads, speed 2994.02 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000
samples  %        image name               symbol name
1361009  24.2598  /lib/libc-2.3.5.so       __GI___strcasecmp
379296    6.7609  /usr/lib/asterisk/modules/chan_iax2.so key
368921    6.5760  /usr/sbin/asterisk       ast_sched_add_variable
334993    5.9712  /lib/libpthread-0.10.so  __pthread_unlock
328170    5.8496  /lib/libpthread-0.10.so  __pthread_lock
238212    4.2461  /lib/libpthread-0.10.so  __pthread_internal_tsd_get
185520    3.3069  /lib/modules/2.6.11-gentoo-r3/misc/wct4xxp.ko t4_receiveprep
121226    2.1608  /lib/libpthread-0.10.so  __GI___pthread_mutex_unlock
110476    1.9692  /lib/modules/2.6.11-gentoo-r3/misc/wctdm.ko wctdm_interrupt
100797    1.7967  /lib/modules/2.6.11-gentoo-r3/misc/zaptel.ko zt_receive
99389     1.7716  /usr/src/linux-2.6.11-gentoo-r3/vmlinux _spin_lock_irqsave
94178     1.6787  /lib/modules/2.6.11-gentoo-r3/misc/zaptel.ko zt_transmit
81347     1.4500  /usr/src/linux-2.6.11-gentoo-r3/vmlinux memcpy
74321     1.3248  /lib/libpthread-0.10.so  __i686.get_pc_thunk.bx
71473     1.2740  /lib/libpthread-0.10.so  __GI___pthread_mutex_lock
56162     1.0011  /usr/sbin/asterisk       ast_sched_del
51718     0.9219  /usr/lib/asterisk/modules/chan_iax2.so reload
51291     0.9143  /usr/src/linux-2.6.11-gentoo-r3/vmlinux flush_tlb_others
49764     0.8870  /lib/modules/2.6.11-gentoo-r3/misc/wct4xxp.ko __t4_check_sigbits
49316     0.8791  /usr/src/linux-2.6.11-gentoo-r3/vmlinux _spin_unlock_irqrestore
39432     0.7029  /lib/modules/2.6.11-gentoo-r3/misc/wct4xxp.ko t4_transmitprep
38644     0.6888  /lib/libc-2.3.5.so       __i686.get_pc_thunk.bx
36921     0.6581  /usr/src/linux-2.6.11-gentoo-r3/vmlinux _spin_lock
35640     0.6353  /usr/src/linux-2.6.11-gentoo-r3/vmlinux system_call
33248     0.5926  /usr/src/linux-2.6.11-gentoo-r3/vmlinux mark_offset_tsc
27958     0.4983  /usr/src/linux-2.6.11-gentoo-r3/vmlinux schedule
26375     0.4701  /usr/src/linux-2.6.11-gentoo-r3/vmlinux sub_preempt_count
25864     0.4610  /lib/libc-2.3.5.so       _IO_vfprintf_internal
23440     0.4178  /usr/lib/asterisk/modules/chan_iax2.so anonymous symbol from section .plt
21236     0.3785  /usr/lib/libmysqlclient.so.12.0.0 (no symbols)
18389     0.3278  /usr/src/linux-2.6.11-gentoo-r3/vmlinux __d_lookup
17091     0.3046  /usr/src/linux-2.6.11-gentoo-r3/vmlinux link_path_walk
16173     0.2883  /usr/src/linux-2.6.11-gentoo-r3/vmlinux add_preempt_count
16111     0.2872  /usr/src/linux-2.6.11-gentoo-r3/vmlinux irq_entries_start
15577     0.2777  /usr/src/linux-2.6.11-gentoo-r3/vmlinux smp_processor_id
15099     0.2691  /usr/src/linux-2.6.11-gentoo-r3/vmlinux __copy_to_user_ll
12532     0.2234  /usr/src/linux-2.6.11-gentoo-r3/vmlinux __copy_from_user_ll

If there is interest, I'd like to do similar profiles for other workloads.
Email me if there are particular tests you'd like me to do.

In this example, it looks like we'd probably gain significantly by 
folding peer names to a single case (all lower or all upper) once, and 
then using a standard str(n)cmp rather than strcasecmp.
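
A minimal sketch of that idea, assuming a flat array of peers.  The names 
normalize_name, find_peer and struct peer are hypothetical and not taken 
from chan_iax2; the point is only that case folding happens once per 
lookup (on the key) and once per peer (at store time), so the inner loop 
can use plain strcmp.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not an Asterisk API: copy src into dst as lower case.
 * Always NUL-terminates dst when dstlen > 0; truncates if src is too long. */
static void normalize_name(char *dst, size_t dstlen, const char *src)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    if (dstlen == 0)
        return;
    for (i = 0; i + 1 < dstlen && src[i] != '\0'; i++)
        dst[i] = (char)tolower((unsigned char)src[i]);
    dst[i] = '\0';
}

/* Hypothetical peer record: the name is stored already normalized. */
struct peer {
    char name[80];
};

/* Linear search that folds the case of the key once, then compares
 * against the pre-normalized peer names with plain strcmp. */
static const struct peer *find_peer(const struct peer *peers, size_t count,
                                    const char *wanted)
{
    char key[80];
    size_t i;

    normalize_name(key, sizeof(key), wanted);
    for (i = 0; i < count; i++) {
        if (strcmp(peers[i].name, key) == 0)
            return &peers[i];
    }
    return NULL;
}
```

This only removes the per-comparison case folding; the search itself is 
still linear in the number of peers.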

I haven't looked, but I'm sure the strcasecmp calls come from searching 
the rtcache'd peers to find the right one.  Obviously a better data 
structure for storing those peers would gain even more.
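
One possible "better data structure" is a chained hash table keyed on the 
peer name.  The sketch below is an assumption about what that could look 
like, not a description of existing Asterisk code: struct peer_entry, 
struct peer_table, PEER_BUCKETS and the function names are all made up 
for illustration.  The hash folds case itself, so strcasecmp is only 
called on the few entries that share a bucket.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

#define PEER_BUCKETS 1024  /* hypothetical size; power of two for masking */

struct peer_entry {
    char name[80];
    struct peer_entry *next;   /* next peer in the same bucket */
};

struct peer_table {
    struct peer_entry *buckets[PEER_BUCKETS];
};

/* Case-insensitive djb2-style string hash, reduced to a bucket index. */
static unsigned int peer_hash(const char *name)
{
    unsigned int h = 5381;

    for (; *name != '\0'; name++)
        h = h * 33 + (unsigned int)tolower((unsigned char)*name);
    return h & (PEER_BUCKETS - 1);
}

/* Push a peer onto the front of its bucket's chain. */
static void peer_table_insert(struct peer_table *t, struct peer_entry *p)
{
    unsigned int b;

    assert(t != NULL && p != NULL);
    b = peer_hash(p->name);
    p->next = t->buckets[b];
    t->buckets[b] = p;
}

/* Walk only the chain for the name's bucket instead of every peer. */
static struct peer_entry *peer_table_find(const struct peer_table *t,
                                          const char *name)
{
    struct peer_entry *p;

    assert(t != NULL && name != NULL);
    for (p = t->buckets[peer_hash(name)]; p != NULL; p = p->next) {
        if (strcasecmp(p->name, name) == 0)
            return p;
    }
    return NULL;
}
```

With 5000 peers in 1024 buckets, and assuming the hash spreads names 
evenly, each chain holds about five entries, so a lookup touches a 
handful of peers rather than scanning the whole list.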

Steve


