<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="Generator" CONTENT="MS Exchange Server version 5.5.2654.45">
<TITLE>RE: [Asterisk-Users] Dual T400P, SMP, performance issues</TITLE>
</HEAD>
<BODY>
<P><FONT SIZE=2>Mark,</FONT>
</P>
<P><FONT SIZE=2>As for pings: in some cases we could still ping the box on both</FONT>
<BR><FONT SIZE=2>interfaces, and in other cases we could not (we tried 3-4 sets of</FONT>
<BR><FONT SIZE=2>NICs and drivers). Telnet, X, ssh, etc. are all definitely dead.</FONT>
<BR><FONT SIZE=2>No core dumps (asterisk was started with the -g option), no kernel panics.</FONT>
<BR><FONT SIZE=2>The console is black and the Alt-SysRq combinations don't work.</FONT>
<BR><FONT SIZE=2>There is pretty much no option but to reboot the box.</FONT>
</P>
<P><FONT SIZE=2>As for SMP with a single T400P: we'll try it and report the results,</FONT>
<BR><FONT SIZE=2>but the idea was to go with as high a density as possible ...</FONT>
</P>
<P><FONT SIZE=2>What do you think of using hyperthreading - should we enable or disable it</FONT>
<BR><FONT SIZE=2>for the box running asterisk?</FONT>
</P>
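<P><FONT SIZE=2>Before deciding, it may help to confirm whether hyperthreading is currently active. The /proc/interrupts listings quoted further down show CPU0 through CPU3 on a dual-Xeon box, which suggests that it is. The sketch below is only an illustration: it counts logical processors and physical packages in /proc/cpuinfo-style text. It assumes the running kernel reports a "physical id" field, which may not be true of every 2.4 build, and the sample text is made up for the example, not taken from the box in question.</FONT>
</P>
<PRE>
```python
def cpu_topology(cpuinfo_text):
    """Return (logical_cpus, physical_packages) from /proc/cpuinfo-style text."""
    logical = 0
    packages = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processor stanzas
        key = key.strip()
        if key == "processor":
            logical += 1
        elif key == "physical id":
            packages.add(value.strip())
    # If the kernel does not report "physical id", the package count stays 0.
    return logical, len(packages)


# Hypothetical excerpt for illustration only (not from the thread).
SAMPLE_CPUINFO = """\
processor   : 0
physical id : 0

processor   : 1
physical id : 0

processor   : 2
physical id : 3

processor   : 3
physical id : 3
"""

if __name__ == "__main__":
    logical, packages = cpu_topology(SAMPLE_CPUINFO)
    print("logical CPUs:", logical, "physical packages:", packages)
    if packages and logical != packages:
        print("More logical CPUs than packages: hyperthreading is likely enabled.")
```
</PRE>
<P><FONT SIZE=2>On a live system the text would come from reading /proc/cpuinfo. More logical CPUs than packages only hints at hyperthreading, so the BIOS setting remains the authoritative check.</FONT>
</P>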
<P><FONT SIZE=2>What about -DCONFIG_ZAPTEL_WATCHDOG? Can it help, and how do we use it?</FONT>
</P>
<P><FONT SIZE=2>Thank you.</FONT>
<BR><FONT SIZE=2>Alex Zarubin</FONT>
</P>
<P><FONT SIZE=2>-----Original Message-----</FONT>
<BR><FONT SIZE=2>From: Mark Spencer [<A HREF="mailto:markster@digium.com">mailto:markster@digium.com</A>]</FONT>
<BR><FONT SIZE=2>Sent: Saturday, June 14, 2003 10:23 AM</FONT>
<BR><FONT SIZE=2>To: 'asterisk-users@lists.digium.com'</FONT>
<BR><FONT SIZE=2>Subject: RE: [Asterisk-Users] Dual T400P, SMP, performance issues</FONT>
</P>
<BR>
<P><FONT SIZE=2>When you say "stops responding" do you mean no more pings, telnet dead,</FONT>
<BR><FONT SIZE=2>etc? Or do you mean asterisk stops responding? Is there a segfault or</FONT>
<BR><FONT SIZE=2>kernel panic, or any other failure diagnostic?</FONT>
</P>
<P><FONT SIZE=2>Mark</FONT>
</P>
<P><FONT SIZE=2>On Thu, 12 Jun 2003, Alex Zarubin wrote:</FONT>
</P>
<P><FONT SIZE=2>> Zaptel was compiled with -D__SMP__</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>> We've installed irqbalance and the picture improved a lot</FONT>
<BR><FONT SIZE=2>> (thanks to Jared Smith). Do you still see problems in our /proc/interrupts?</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>> The big issue for us now is that after 24+ hours of the PRI->SIP test load,</FONT>
<BR><FONT SIZE=2>> our Dell PE2650 (dual 2.6 GHz Xeon, 2 GB RAM, 2 T400P, 2.4.20-18.7smp #1 SMP)</FONT>
<BR><FONT SIZE=2>> stops responding to anything.</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>> So the questions are:</FONT>
<BR><FONT SIZE=2>> - are there known issues with PE2650 and ways to fix them?</FONT>
<BR><FONT SIZE=2>> - can someone recommend the 'stable' 2.4 SMP kernel for this</FONT>
<BR><FONT SIZE=2>> kind of load?</FONT>
<BR><FONT SIZE=2>> - any expertise in this area will be appreciated</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>> CPU0 CPU1 CPU2 CPU3</FONT>
<BR><FONT SIZE=2>> 0: 230710 30030 50050 0 IO-APIC-edge timer</FONT>
<BR><FONT SIZE=2>> 1: 5 0 0 233 IO-APIC-edge keyboard</FONT>
<BR><FONT SIZE=2>> 2: 0 0 0 0 XT-PIC cascade</FONT>
<BR><FONT SIZE=2>> 5: 0 0 0 0 IO-APIC-level usb-ohci</FONT>
<BR><FONT SIZE=2>> 8: 1 0 0 0 IO-APIC-edge rtc</FONT>
<BR><FONT SIZE=2>> 14: 27 0 2 0 IO-APIC-edge ide0</FONT>
<BR><FONT SIZE=2>> 20: 2085442 400221 0 230232 IO-APIC-level tor2</FONT>
<BR><FONT SIZE=2>> 24: 293848 1841658 10010 570568 IO-APIC-level tor2</FONT>
<BR><FONT SIZE=2>> 28: 5 25643 0 0 IO-APIC-level eth0</FONT>
<BR><FONT SIZE=2>> 29: 5 0 5165040 0 IO-APIC-level eth1</FONT>
<BR><FONT SIZE=2>> 30: 43720 35467 1291 3296 IO-APIC-level aacraid</FONT>
<BR><FONT SIZE=2>> NMI: 0 0 0 0</FONT>
<BR><FONT SIZE=2>> LOC: 310618 310616 310616 310616</FONT>
<BR><FONT SIZE=2>> ERR: 0</FONT>
<BR><FONT SIZE=2>> MIS: 0</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>> Thank you.</FONT>
<BR><FONT SIZE=2>> Alex Zarubin</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>> -----Original Message-----</FONT>
<BR><FONT SIZE=2>> From: Martin Pycko [<A HREF="mailto:martinp@digium.com">mailto:martinp@digium.com</A>]</FONT>
<BR><FONT SIZE=2>> Sent: Tuesday, June 10, 2003 9:48 AM</FONT>
<BR><FONT SIZE=2>> To: 'asterisk-users@lists.digium.com'</FONT>
<BR><FONT SIZE=2>> Subject: Re: [Asterisk-Users] Dual T400P, SMP, performance issues</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>> Are you sure that you compiled zaptel for __SMP__ ?</FONT>
<BR><FONT SIZE=2>> Edit your zaptel/Makefile.</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>> 0: 75283844 75241320 75286285 75247088 IO-APIC-edge timer</FONT>
<BR><FONT SIZE=2>> 1: 1 0 1 1 IO-APIC-edge keyboard</FONT>
<BR><FONT SIZE=2>> 2: 0 0 0 0 XT-PIC cascade</FONT>
<BR><FONT SIZE=2>> 3: 0 0 0 0 IO-APIC-level usb-ohci</FONT>
<BR><FONT SIZE=2>> 8: 1 0 0 0 IO-APIC-edge rtc</FONT>
<BR><FONT SIZE=2>> 15: 1 0 0 1 IO-APIC-edge ide1</FONT>
<BR><FONT SIZE=2>> 16: 22134870 22120997 22135905 22122829 IO-APIC-level eth0</FONT>
<BR><FONT SIZE=2>> 25: 4670 4548 4614 4518 IO-APIC-level tor2</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>> All four CPUs should be handling IRQs, as in the example above.</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>> Martin</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>> On Mon, 9 Jun 2003, Alex Zarubin wrote:</FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>> > Hi,</FONT>
<BR><FONT SIZE=2>> ></FONT>
<BR><FONT SIZE=2>> > We are trying to validate Asterisk as a PRI &lt;-&gt; SIP media gateway with two</FONT>
<BR><FONT SIZE=2>> > T400P boards (8 T1s) per box. The first experience with BOX1 (Compaq,</FONT>
<BR><FONT SIZE=2>> > 2.53 GHz, 1 GB RAM) and just one T400P was encouraging: on the load</FONT>
<BR><FONT SIZE=2>> > test with 3 T1s' worth of calls we had 75% idle CPU on average.</FONT>
<BR><FONT SIZE=2>> ></FONT>
<BR><FONT SIZE=2>> > Not so with BOX2 (Dell, single 2.6 GHz Xeon, 1 GB RAM, 2 T400P) and BOX3</FONT>
<BR><FONT SIZE=2>> > (Dell, dual 2.6 GHz Xeon, 2 GB RAM, 2 T400P, asterisk/zaptel built with</FONT>
<BR><FONT SIZE=2>> > SMP support).</FONT>
<BR><FONT SIZE=2>> ></FONT>
<BR><FONT SIZE=2>> > On a similar load test (as with BOX1), BOX2 was showing 0% idle CPU 70%</FONT>
<BR><FONT SIZE=2>> > of the time, with just 3 T1s out of 8 loaded.</FONT>
<BR><FONT SIZE=2>> ></FONT>
<BR><FONT SIZE=2>> > On the load test with just 2 T1s, BOX3 was very close to 0% idle on CPU0,</FONT>
<BR><FONT SIZE=2>> > while CPU1 was at 95% idle.</FONT>
<BR><FONT SIZE=2>> > The process ksoftirqd_CPU0 was close to the top of the 'top' output, with</FONT>
<BR><FONT SIZE=2>> > /proc/interrupts showing the tor2-related numbers growing very fast. We had</FONT>
<BR><FONT SIZE=2>> > 2 T1s plugged into the first T400P board, with nothing going into the second,</FONT>
<BR><FONT SIZE=2>> > but the number of interrupts for both boards was growing at the same</FONT>
<BR><FONT SIZE=2>> > pace. Here are the interrupts (taken after the box was rebooted, so they are</FONT>
<BR><FONT SIZE=2>> > not as big as they were) - do they look OK?</FONT>
<BR><FONT SIZE=2>> ></FONT>
<BR><FONT SIZE=2>> ></FONT>
<BR><FONT SIZE=2>> > CPU0 CPU1 CPU2 CPU3</FONT>
<BR><FONT SIZE=2>> > 0: 122556 0 0 0 IO-APIC-edge timer</FONT>
<BR><FONT SIZE=2>> > 1: 4 0 0 0 IO-APIC-edge keyboard</FONT>
<BR><FONT SIZE=2>> > 2: 0 0 0 0 XT-PIC cascade</FONT>
<BR><FONT SIZE=2>> > 5: 0 0 0 0 IO-APIC-level usb-ohci</FONT>
<BR><FONT SIZE=2>> > 8: 1 0 0 0 IO-APIC-edge rtc</FONT>
<BR><FONT SIZE=2>> > 12: 20 0 0 0 IO-APIC-edge PS/2 Mouse</FONT>
<BR><FONT SIZE=2>> > 14: 23 0 2 0 IO-APIC-edge ide0</FONT>
<BR><FONT SIZE=2>> > 20: 516930 0 0 0 IO-APIC-level tor2</FONT>
<BR><FONT SIZE=2>> > 24: 516524 0 0 0 IO-APIC-level tor2</FONT>
<BR><FONT SIZE=2>> > 28: 10600 0 0 0 IO-APIC-level eth0</FONT>
<BR><FONT SIZE=2>> > 29: 4837 0 0 0 IO-APIC-level eth1</FONT>
<BR><FONT SIZE=2>> > 30: 24831 0 0 0 IO-APIC-level aacraid</FONT>
<BR><FONT SIZE=2>> > NMI: 0 0 0 0</FONT>
<BR><FONT SIZE=2>> > LOC: 122430 122429 122429 122428</FONT>
<BR><FONT SIZE=2>> > ERR: 0</FONT>
<BR><FONT SIZE=2>> > MIS: 0</FONT>
<BR><FONT SIZE=2>> ></FONT>
<BR><FONT SIZE=2>> > Not sure what went wrong. Any suggestions on how to work with 2 T400P boards</FONT>
<BR><FONT SIZE=2>> > in a box (without hurting performance)</FONT>
<BR><FONT SIZE=2>> > and how to take advantage of SMP for Asterisk would be appreciated.</FONT>
<BR><FONT SIZE=2>> ></FONT>
<BR><FONT SIZE=2>> > Any known Linux kernel related issues (2.4.20-13.7smp #1 SMP for BOX3 )?</FONT>
<BR><FONT SIZE=2>> ></FONT>
<BR><FONT SIZE=2>> > Thank you.</FONT>
<BR><FONT SIZE=2>> ></FONT>
<BR><FONT SIZE=2>> > Alex Zarubin</FONT>
<BR><FONT SIZE=2>> ></FONT>
<BR><FONT SIZE=2>> ></FONT>
<BR><FONT SIZE=2>> ></FONT>
<BR><FONT SIZE=2>></FONT>
<BR><FONT SIZE=2>> _______________________________________________</FONT>
<BR><FONT SIZE=2>> Asterisk-Users mailing list</FONT>
<BR><FONT SIZE=2>> Asterisk-Users@lists.digium.com</FONT>
<BR><FONT SIZE=2>> <A HREF="http://lists.digium.com/mailman/listinfo/asterisk-users" TARGET="_blank">http://lists.digium.com/mailman/listinfo/asterisk-users</A></FONT>
<BR><FONT SIZE=2>></FONT>
</P>
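<P><FONT SIZE=2>The /proc/interrupts listings quoted above can also be compared programmatically rather than by eye. The following is a minimal sketch, not part of the original exchange: it parses /proc/interrupts-style text and reports what fraction of each IRQ's interrupts landed on its busiest CPU. The function names are made up for the example, and the sample reuses the two tor2 rows from the first listing in the thread (the one where everything lands on CPU0).</FONT>
</P>
<PRE>
```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {irq: (counts, description)}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()
        counts = fields[:ncpu]
        # Skip summary rows such as ERR/MIS that lack per-CPU columns.
        if len(counts) != ncpu or not all(c.isdigit() for c in counts):
            continue
        table[label.strip()] = ([int(c) for c in counts], " ".join(fields[ncpu:]))
    return table


def busiest_cpu_share(counts):
    """Fraction of an IRQ's interrupts handled by its busiest CPU."""
    total = sum(counts)
    return max(counts) / total if total else 0.0


# Two tor2 rows taken from the first listing in the thread.
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
 20:     516930          0          0          0   IO-APIC-level  tor2
 24:     516524          0          0          0   IO-APIC-level  tor2
ERR:          0
"""

if __name__ == "__main__":
    for irq, (counts, desc) in parse_interrupts(SAMPLE).items():
        share = busiest_cpu_share(counts)
        print(irq, desc, f"{share:.0%} of interrupts on the busiest CPU")
```
</PRE>
<P><FONT SIZE=2>A share close to 100% for both tor2 lines means one CPU is servicing both boards, which matches the ksoftirqd_CPU0 load described in the first message. On a live box the text would come from reading /proc/interrupts directly.</FONT>
</P>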
</BODY>
</HTML>