[Asterisk-Users] Dual T400P, SMP, performance issues
The Traveller
traveler at xs4all.nl
Mon Jun 16 12:45:54 MST 2003
Yo,
I've seen very similar Zaptel-related freezes on a wide variety of
mainboards (SMP as well as non-SMP), with X100P's as well as with an E100P.
At some point, almost always at the moment a call through one of those cards
connects or disconnects, the machine completely stops responding and needs
a reset to come back to life. A very nice way to trigger it with the E100P
seems to be to put around 10-20 channels of it into a meetme-conference and
then issue the "stop now"-command on the Asterisk-console. A high volume
of connects / disconnects seems to trigger the freezes. I'm still
investigating the issue and am going to try different kernels and
some custom kernel-patches.
One of my boxes (dual PIII-750, Intel L440GX+-board) with an X100P and
a TDM400P in it hasn't frozen since I installed kernel 2.4.21-rc2 with
the ACPI-patch (http://sourceforge.net/projects/acpi/). I'll probably try
that on the box with the E100P first. Be sure to enable "Power Management
support" in your kernel-config, disable APM, enable ACPI and check all
ACPI-options, except for "CPU Enumeration Only". Note that this ACPI-patch
also handles IRQ-routing and might help in cases where the BIOS assigns the
same IRQ to several devices (or, as was the case for me, none at all).
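
To check whether a box has the IRQ problem described above (several devices
on one IRQ, or a driver with no IRQ at all), something like the following
rough Python sketch could be run against /proc/interrupts. It assumes the
2.4-style layout seen in this thread (IRQ number, one count per CPU, a single
controller-type token, then a comma-separated device list). The function
names and the sample table are made up for illustration and are not part of
Zaptel or Asterisk.

```python
"""Sketch: spot shared or missing IRQ assignments in /proc/interrupts."""
import sys

# Illustrative sample only (not taken from a real machine), in the
# 2.4-style layout: IRQ, one count per CPU, controller type, device list.
SAMPLE = """\
           CPU0       CPU1
  0:     122556          0    IO-APIC-edge  timer
 20:     516930          0   IO-APIC-level  tor2
 24:     516524          0   IO-APIC-level  tor2, eth2
NMI:          0          0
"""


def parse_interrupts(text):
    """Return {irq: (per_cpu_counts, [device, ...])} for numbered IRQs."""
    lines = text.splitlines()
    ncpu = len(lines[0].split())  # header row names one column per CPU
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():  # skip NMI, LOC, ERR, MIS and blank lines
            continue
        fields = rest.split()
        counts = [int(n) for n in fields[:ncpu]]
        # Assumption: exactly one controller-type token follows the counts.
        devices = " ".join(fields[ncpu + 1:]).split(",")
        table[int(label)] = (counts, [d.strip() for d in devices if d.strip()])
    return table


def shared_irqs(table):
    """Return {irq: devices} for every IRQ claimed by more than one device."""
    return {irq: devs for irq, (_, devs) in table.items() if len(devs) > 1}


def irqs_for_driver(table, name):
    """Return the sorted IRQs a driver appears on; empty means none assigned."""
    return sorted(irq for irq, (_, devs) in table.items() if name in devs)


if __name__ == "__main__":
    # Pass /proc/interrupts as the first argument; without one the
    # built-in sample is used so the script runs anywhere.
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as fh:
            text = fh.read()
    else:
        text = SAMPLE
    table = parse_interrupts(text)
    for irq, devs in sorted(shared_irqs(table).items()):
        print("IRQ %d is shared by: %s" % (irq, ", ".join(devs)))
    print("tor2 IRQs:", irqs_for_driver(table, "tor2") or "none assigned")
```

An empty list from irqs_for_driver() would correspond to the "none at all"
case; an entry in shared_irqs() would correspond to the BIOS putting more
than one device on the same line. Newer kernels print extra columns, so the
parsing would need adjusting there.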
Grtz,
Oliver
On Mon, Jun 16, 2003 at 13:03:20 -0500, Alex Zarubin wrote:
> Mark,
>
> As far as pings - we have cases when we could ping the box on both
> interfaces and there are cases when we could not (we tried 3-4 sets of
> NICs and drivers). All telnets, X, ssh etc. are definitely dead.
> No coredumps (asterisk was started with -g option), no kernel panics.
> Black console, Alt-SysRq combinations don't work.
> Pretty much no options but rebooting the box.
>
> As far as SMP and single T400P - we'll try and report the results
> but the idea was to go with as high density as possible ...
>
> What do you think of using hyperthreading - should we enable or disable it
> for the box running asterisk?
>
> What about -DCONFIG_ZAPTEL_WATCHDOG ? Can it help and how to use it?
>
> Thank you.
> Alex Zarubin
>
> -----Original Message-----
> From: Mark Spencer [mailto:markster at digium.com]
> Sent: Saturday, June 14, 2003 10:23 AM
> To: 'asterisk-users at lists.digium.com'
> Subject: RE: [Asterisk-Users] Dual T400P, SMP, performance issues
>
>
> When you say "stops responding" do you mean no more pings, telnet dead,
> etc? Or do you mean asterisk stops responding? Is there a segfault or
> kernel panic, or any other failure diagnostic?
>
> Mark
>
> On Thu, 12 Jun 2003, Alex Zarubin wrote:
>
> > Zaptel was compiled with -D__SMP__
> >
> > We've installed irqbalance and the picture improved a lot
> > (thanks to Jared Smith). Do you still see problems in our
> > /proc/interrupts?
> >
> > The big issue for us now is that after 24+ hours of the test load PRI->SIP
> > our Dell PE2650, dual 2.6 GHz Xeon, 2 Gb RAM, 2 T400P, 2.4.20-18.7smp #1 SMP
> > stops responding to anything.
> >
> > So the questions are:
> > - are there known issues with PE2650 and ways to fix them?
> > - can someone recommend the 'stable' 2.4 SMP kernel for this
> > kind of load?
> > - any expertise in this area will be appreciated
> >
> >            CPU0       CPU1       CPU2       CPU3
> >   0:     230710      30030      50050          0    IO-APIC-edge  timer
> >   1:          5          0          0        233    IO-APIC-edge  keyboard
> >   2:          0          0          0          0          XT-PIC  cascade
> >   5:          0          0          0          0   IO-APIC-level  usb-ohci
> >   8:          1          0          0          0    IO-APIC-edge  rtc
> >  14:         27          0          2          0    IO-APIC-edge  ide0
> >  20:    2085442     400221          0     230232   IO-APIC-level  tor2
> >  24:     293848    1841658      10010     570568   IO-APIC-level  tor2
> >  28:          5      25643          0          0   IO-APIC-level  eth0
> >  29:          5          0    5165040          0   IO-APIC-level  eth1
> >  30:      43720      35467       1291       3296   IO-APIC-level  aacraid
> > NMI:          0          0          0          0
> > LOC:     310618     310616     310616     310616
> > ERR:          0
> > MIS:          0
> >
> > Thank you.
> > Alex Zarubin
> >
> > -----Original Message-----
> > From: Martin Pycko [mailto:martinp at digium.com]
> > Sent: Tuesday, June 10, 2003 9:48 AM
> > To: 'asterisk-users at lists.digium.com'
> > Subject: Re: [Asterisk-Users] Dual T400P, SMP, performance issues
> >
> >
> > Are you sure that you compiled zaptel for __SMP__ ?
> > Edit your zaptel/Makefile.
> >
> >   0:   75283844   75241320   75286285   75247088    IO-APIC-edge  timer
> >   1:          1          0          1          1    IO-APIC-edge  keyboard
> >   2:          0          0          0          0          XT-PIC  cascade
> >   3:          0          0          0          0   IO-APIC-level  usb-ohci
> >   8:          1          0          0          0    IO-APIC-edge  rtc
> >  15:          1          0          0          1    IO-APIC-edge  ide1
> >  16:   22134870   22120997   22135905   22122829   IO-APIC-level  eth0
> >  25:       4670       4548       4614       4518   IO-APIC-level  tor2
> >
> > All the four CPU's should have IRQ's like in the example above.
> >
> > Martin
> >
> > On Mon, 9 Jun 2003, Alex Zarubin wrote:
> >
> > > Hi,
> > >
> > > We are trying to validate Asterisk as a media gateway PRI <-> SIP with
> > > two T400P (8 T1s) per box. The first experience with BOX1 (Compaq,
> > > 2.53 GHz, 1 Gb RAM) and just one T400P was encouraging - on the load
> > > test with 3 T1s worth of calls we had on average 75% idle CPU.
> > >
> > > Not so with BOX2 (Dell, single 2.6 GHz Xeon, 1 Gb RAM, 2 T400P) and BOX3
> > > (Dell, dual 2.6 GHz Xeon, 2 Gb RAM, 2 T400P, asterisk/zaptel is built
> > > with SMP support).
> > >
> > > On the similar load test (as with the BOX1) BOX2 was showing 0% idle CPU
> > > 70% of the time. Just 3 T1s out of 8.
> > >
> > > On the load test with just 2 T1s BOX3 was very close to 0% idle on CPU0,
> > > CPU1 was at 95% idle. The process ksoftirqd_CPU0 was close to the top of
> > > the 'top', with /proc/interrupts showing tor2 related numbers growing
> > > very fast. We had 2 T1s plugged into the first T400P board, with nothing
> > > going into the second, but the number of interrupts for the both boards
> > > was growing at the same pace. Here are the interrupts (after the box
> > > reboot, so they are not that big as they were) - do they look OK?
> > >
> > >
> > >            CPU0       CPU1       CPU2       CPU3
> > >   0:     122556          0          0          0    IO-APIC-edge  timer
> > >   1:          4          0          0          0    IO-APIC-edge  keyboard
> > >   2:          0          0          0          0          XT-PIC  cascade
> > >   5:          0          0          0          0   IO-APIC-level  usb-ohci
> > >   8:          1          0          0          0    IO-APIC-edge  rtc
> > >  12:         20          0          0          0    IO-APIC-edge  PS/2 Mouse
> > >  14:         23          0          2          0    IO-APIC-edge  ide0
> > >  20:     516930          0          0          0   IO-APIC-level  tor2
> > >  24:     516524          0          0          0   IO-APIC-level  tor2
> > >  28:      10600          0          0          0   IO-APIC-level  eth0
> > >  29:       4837          0          0          0   IO-APIC-level  eth1
> > >  30:      24831          0          0          0   IO-APIC-level  aacraid
> > > NMI:          0          0          0          0
> > > LOC:     122430     122429     122429     122428
> > > ERR:          0
> > > MIS:          0
> > >
> > > Not sure what went wrong. Any suggestions on how to work with 2 T400P in
> > > a box (without hurting performance) and how to get advantage of SMP for
> > > Asterisk would be appreciated.
> > >
> > > Any known Linux kernel related issues (2.4.20-13.7smp #1 SMP for BOX3 )?
> > >
> > > Thank you.
> > >
> > > Alex Zarubin
> > >
> > >
> > >
> >
> > _______________________________________________
> > Asterisk-Users mailing list
> > Asterisk-Users at lists.digium.com
> > http://lists.digium.com/mailman/listinfo/asterisk-users
> >