[Asterisk-Users] Dual T400P, SMP, performance issues
Matthias Granberry
matthias at utdallas.edu
Thu Jun 26 10:27:12 MST 2003
Also, make sure that the kernel and all the modules are compiled with
the same gcc version. I had to manually change some zaptel makefile
targets that used ${CC} or a hardcoded gcc so that they call gcc-2.95 instead.
The entire asterisk build system is somewhat weak, but it seems to
work if you tweak it all just right. It's obvious what things the
developers are interested in, though. The boring parts are all
half-done, and the interesting parts are all fairly high-quality.
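
For anyone who wants to check the compiler match mentioned at the top of
this message, here is a rough sketch in Python. It assumes the kernel
banner in /proc/version and the output of `gcc -v` both contain the
phrase "gcc version X.Y", which is how 2.4-era kernels and gcc report
it; other setups may word it differently. The function names are made
up for illustration.

```python
import re


def gcc_version(text):
    """Return the first 'gcc version X.Y[.Z]' number found in text, or None."""
    match = re.search(r"gcc version (\d+(?:\.\d+)*)", text)
    return match.group(1) if match else None


def same_compiler(kernel_banner, compiler_banner):
    """True only if both banners name a gcc version and the two are identical."""
    kernel = gcc_version(kernel_banner)
    compiler = gcc_version(compiler_banner)
    return kernel is not None and kernel == compiler
```

To use it, pass the contents of /proc/version as kernel_banner and the
text printed by `gcc -v` (it goes to stderr) as compiler_banner. The
comparison is an exact string match, so 2.95.3 and 2.95.4 count as
different.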
Matthias
The Traveller <traveler at xs4all.nl> writes:
> Hi Alex,
>
> The problem is most likely to occur with high volumes of call-setups and
> disconnects. This could be reproduced by putting 2 of your T-1 ports
> back to back and then using the auto-dialer to generate a large amount of
> very short calls between the ports.
>
> I'm currently attempting to figure out what's causing the problem,
> by trying different kernels with different options. Trying a different
> version of GCC is a good idea; I hadn't thought of that yet.
>
> So far, I've had limited success. The panics popped up in all the kernels
> I tested with, although some things, like some other hardware / drivers, seem
> to make them more likely to appear. See the other thread I started about
> this problem.
>
> Grtz,
>
> Oliver
>
> On Tue, Jun 24, 2003 at 19:10:08 -0500, Alex Zarubin wrote:
>
>> Mark & Oliver,
>>
>> It is too early to say, but the picture is different now. Our dual-CPU,
>> dual-T400P box has been up for 4 days under a load of 10 - 100 simultaneous
>> PRI -> SIP calls. We installed 2.4.21 #2 SMP (it was still freezing after
>> that) and, what I think made the difference, recompiled
>> zaptel-libpri-asterisk with gcc 3.3.
>>
>> One problem along the way was that asterisk wouldn't start after that. It was
>> crashing while loading the mp3 and lpc10 codecs. We put 'noload' for these two
>> into modules.conf - a temporary solution, of course.
>>
>> There are still problems with multiple connections at the same time:
>> windows to the box freeze for a second, and there are D-channel error
>> messages. The following messages are dumped into /var/log/messages. What
>> do you think?
>>
>> Jun 24 18:23:25 mspgate03 kernel:
>> Jun 24 18:23:25 mspgate03 kernel: wait_on_irq, CPU 1:
>> Jun 24 18:23:25 mspgate03 kernel: irq: 1 [ 0 0 1 0 ]
>> Jun 24 18:23:25 mspgate03 kernel: bh: 0 [ 0 0 0 0 ]
>> Jun 24 18:23:25 mspgate03 kernel: Stack dumps:
>> Jun 24 18:23:25 mspgate03 kernel: CPU 0:02000000 0000036f 00e14603
>> 18020000 03000010 00006647 008e0200 48030000
>> Jun 24 18:23:25 mspgate03 kernel: 00000078 001ffa02 5b490300
>> 06000000 000001c7 074e0308 00001afe 01c74d03
>> Jun 24 18:23:25 mspgate03 kernel: 23020000 d7080000 e1000001
>> 09000000 000001d7 f5030001 04000023 09300207
>> Jun 24 18:23:25 mspgate03 kernel: Call Trace: [<f89bd281>]
>> [<f89bb132>] [<f89bbb47>] [<f89bd281>] [<f89bd281>]
>> Jun 24 18:23:25 mspgate03 kernel: [<f89bb132>] [<f89bd281>]
>> [<f89bd281>] [<f89bb132>] [<f89bbb47>] [<f89e7737>]
>> Jun 24 18:23:25 mspgate03 kernel: [<f89aa80a>] [<f89aa80a>]
>> [<c01feee4>] [<f89e7737>] [<c01f4eae>] [<c010a98e>]
>> Jun 24 18:23:25 mspgate03 kernel: [<c020d122>] [<c010abe3>]
>> [<c020d122>] [<c020d550>] [<c010a98e>] [<c020d550>]
>> Jun 24 18:23:25 mspgate03 kernel: [<c010abfe>] [<c01f0919>]
>> [<c01f0919>] [<c022a1ef>] [<c022a1ef>] [<c022a5f5>]
>> Jun 24 18:23:25 mspgate03 kernel: [<f89bd281>] [<f89bd281>]
>> [<f89bd281>] [<f89bb132>] [<f89bd510>] [<f89e7737>]
>> Jun 24 18:23:25 mspgate03 kernel: [<c022a5f5>] [<c01f0ffd>]
>> [<c01f112e>] [<c01f53c2>] [<c012005b>] [<c010abfe>]
>> Jun 24 18:23:25 mspgate03 kernel: [<c015147a>] [<c01509dc>]
>> [<c0147460>] [<c0147fb8>] [<f89e7737>] [<f89e7737>]
>> Jun 24 18:23:25 mspgate03 kernel: [<c01f0998>] [<c01f0fac>]
>> [<c01f112e>] [<c01f53c2>] [<c0117fce>] [<c0117ef0>]
>> Jun 24 18:23:25 mspgate03 kernel: [<c0144a64>] [<c01246db>]
>> [<c0109023>]
>> Jun 24 18:23:25 mspgate03 kernel:
>> Jun 24 18:23:25 mspgate03 kernel: CPU 2:00000000 00000000 00000000
>> 00000000 00000000 00000000 00000000 00000000
>> Jun 24 18:23:25 mspgate03 kernel: 00000000 00000000 00000000
>> 00000000 00000000 00000000 00000000 00000000
>> Jun 24 18:23:25 mspgate03 kernel: 00000000 00000000 00000000
>> 00000000 00000000 00000000 00000000 00000000
>> Jun 24 18:23:25 mspgate03 kernel: Call Trace:
>> Jun 24 18:23:25 mspgate03 kernel:
>> Jun 24 18:23:25 mspgate03 kernel: CPU 3:00000070 cce30002 0cd80000
>> 08fa0000 69530000 656c706d 6c616e41 73697379
>> Jun 24 18:23:25 mspgate03 kernel: 0009a700 46534c00 65746e69
>> 6c6f7072 32657461 6e655f61 0a810063 69530000
>> Jun 24 18:23:25 mspgate03 kernel: 656c706d 65746e49 6c6f7072
>> 4c657461 39004653 5300000b 6c706d69 66736c65
>> Jun 24 18:23:25 mspgate03 kernel: Call Trace:
>> Jun 24 18:23:25 mspgate03 kernel:
>> Jun 24 18:23:25 mspgate03 kernel: CPU 1:e14d5eac c025c896 00000001
>> 00000001 ffffffff 00000001 c010a7c2 c025c8ab
>> Jun 24 18:23:25 mspgate03 kernel: 00000000 f2d92124 e14d5f00
>> c0191104 00000500 00001805 000000bf 00008a01
>> Jun 24 18:23:25 mspgate03 kernel: 7f1c0300 01000415 1a131100
>> 170f1200 00000000 e14d4000 00000000 00000000
>> Jun 24 18:23:25 mspgate03 kernel: Call Trace: [<c010a7c2>]
>> [<c0191104>] [<c01913d4>] [<c018e1e2>] [<c014c2c7>]
>> Jun 24 18:23:25 mspgate03 kernel: [<c0109023>]
>> Jun 24 18:23:25 mspgate03 kernel:
>>
>> Thank you.
>> Alex Zarubin
>>
>> -----Original Message-----
>> From: The Traveller [mailto:traveler at xs4all.nl]
>> Sent: Tuesday, June 17, 2003 3:10 PM
>> To: asterisk-users at lists.digium.com
>> Subject: Re: [Asterisk-Users] Dual T400P, SMP, performance issues
>>
>>
>> On Tue, Jun 17, 2003 at 20:54:39 +0200, The Traveller wrote:
>> >
>> > BTW: As I reported in my previous mail to the list, I've now installed
>> > kernel 2.4.21-rc2 with ACPI-patch on the box with the E100P. I've been
>> > trying very hard to reproduce a freeze with this kernel, but haven't
>> > succeeded yet.
>> [...]
>>
>> Ok, it crashed again, so that wasn't it either. What I did to trigger
>> it was using the auto-dialer to loop as many calls to app_datetime out
>> and then back over the same E-1 as it would take, queueing the calls
>> to "/var/spool/asterisk/outgoing/" 14 at a time. It froze at the first
>> attempt. The "good" news is that this time it produced a visible
>> kernel panic. My guess is that you only miss it when the console
>> screensaver has already kicked in by the time it happens.
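
The batching described above can be sketched roughly as follows, in
Python. This is not Oliver's auto-dialer, only an illustration of
dropping call files into the spool directory 14 at a time. The channel
string and the application name in the usage note are placeholders, and
the helper names are invented. Each file is written in a staging
directory first and then renamed into the spool, so Asterisk never
picks up a half-written file; the rename only works if both
directories are on the same filesystem.

```python
import os
import tempfile


def call_file_text(channel, application, data=""):
    """Build the body of one call file that runs an application on answer."""
    lines = [
        f"Channel: {channel}",
        "MaxRetries: 0",  # very short test calls, no retries
        f"Application: {application}",
    ]
    if data:
        lines.append(f"Data: {data}")
    return "\n".join(lines) + "\n"


def queue_batch(spool_dir, staging_dir, channel, application, count=14):
    """Write count call files in staging_dir, then rename each into spool_dir."""
    queued = []
    for _ in range(count):
        fd, tmp_path = tempfile.mkstemp(suffix=".call", dir=staging_dir)
        with os.fdopen(fd, "w") as handle:
            handle.write(call_file_text(channel, application))
        final_path = os.path.join(spool_dir, os.path.basename(tmp_path))
        os.rename(tmp_path, final_path)  # must be on the same filesystem
        queued.append(final_path)
    return queued
```

A call might look like queue_batch("/var/spool/asterisk/outgoing",
"/var/spool/asterisk/tmp", "Zap/g1/5551234", "DateTime"), where the
staging path, channel and application name are all assumptions to be
replaced with whatever matches the box under test.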
>>
>> It read something like "Unable to handle kernel paging request" and
>> happened in the swapper-task. As usual, it dumped a lot of numbers on the
>> screen, which I didn't want to write down.
>>
>> Mark: If you want my help in debugging this, I'll hook it up to a
>> serial console, trigger the crash and provide you with the exact
>> panic, together with the ksyms and modules-info to trace it.
>>
>>
>>
>> Grtz,
>>
>> Oliver
>> _______________________________________________
>> Asterisk-Users mailing list
>> Asterisk-Users at lists.digium.com
>> http://lists.digium.com/mailman/listinfo/asterisk-users
--
Matthias Granberry
matthias at utdallas.edu
(469) 371-0596