[Asterisk-Users] Centos 4.3 Issues
Greg Oliver
goliver at cistera.com
Mon May 22 09:45:45 MST 2006
On Mon, 2006-05-22 at 12:16 -0400, Greg Boehnlein wrote:
> Hello,
> I was wondering if anyone out there is successfully running
> Asterisk 1.2 svn w/ Centos 4.3. I had an experience over the last two
> weeks that has me scratching my head and muttering strange things in the
> wee hours of the morning. I am going to try and be as descriptive as my
> brain will allow right now, but if there is something that I do not cover,
> please do not hesitate to ask and I'll be happy to answer.
>
> For the last 2 years, I have been running a mixture of Tao Linux
> and Centos (both RHEL derivatives) on our production boxes. Asterisk has
> run flawlessly on all installations. Last week, I updated one of our
> gateway boxes from Centos 4.2 (under which it ran for 6 months without
> issue) to the new 4.3 code. Almost immediately, we began to experience
> problems. Asterisk would core w/ the following:
>
> #0 0x004878ab in test_err () from
> /usr/lib/asterisk/modules/codec_g729a.so
>
> The segfaults would happen under very light loads, in some cases
> with just a single call. Kevin was able to log in to the box, and put a
> debugging version of codec_g729 on the box. He determined that the problem
> was that the values being returned in that routine were
> incorrect, i.e., something in the system was returning a non-zero value
> when multiplying a number by 0. Barring any other explanation, we
> assumed that there was a hardware issue somewhere, either in the memory,
> or the FPU on the CPU.
> So, we replaced the box w/ a brand new system running a dual-core
> Pentium D 920. We loaded the 32-bit version of CentOS 4.3 onto
> the box and proceeded to start testing. BAM, same problem: the backtrace
> showed the failure in the same routine.
> We scratched our heads, and after many hours of trying various
> things (backing the kernel off to 2.6.9-22, and even moving to the new
> development kernel 2.6.9-34.19 from the testing tree), we could do nothing
> to solve the issue.
> Mind you, this is the exact same behavior on two different
> hardware platforms running the exact same distribution. We even loaded up
> a third box and could reproduce the behavior on it as well. Three
> different boxes, one common distribution.
>
> As a test, we installed Fedora Core 5 x86_64 on the new Dual Core
> box and ran extensive tests overnight, simulating 96 channels doing G729
> to Ulaw transcoding. The box ran completely stable. No hiccups.
>
> So, this morning, we put it back into the cluster, and it's now
> taking about 200 concurrent calls, doing an insane amount of transcoding
> and it is working just fine. Before, it would have cored in the first
> couple of minutes.
>
> I'm scratching my head here, because I generally have had excellent
> experiences with Centos. However, I have NO idea what might be the issue
> here. Could it be the kernel? (We tried three different ones!). Could it
> be the libc? Maybe it is the compiler?
>
> In any case, if anyone is having success with Centos 4.3 (32 bit), please
> speak up. I'd like to get to the bottom of it. I generally do not like to
> run Fedora on production equipment as it is generally bleeding edge. In
> this case, FC5 is running 2.6.16 something..
>
Have you tried compiling statically on CentOS 4.2 and running on 4.3?
I am assuming you have made sure the distribution is up to date with patches.
We do not use G.729, so I cannot try it out for you, but we do use CentOS.
Is it only w/ SVN, or all releases of Asterisk?
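
A quick sanity check for the symptom described above (a multiply by zero
coming back non-zero) could be sketched roughly as follows. This is only an
illustration in Python, not anything taken from the codec: the function name
count_bad_zero_products and its parameters are made up for this example, and
it exercises the interpreter's arithmetic rather than codec_g729a.so. A clean
run therefore proves little, but any non-zero count on an affected box would
suggest the problem is not specific to the codec.

```python
import random


def count_bad_zero_products(trials=100000, seed=0):
    """Return how many times x * 0 came back non-zero (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable across boxes
    bad = 0
    for _ in range(trials):
        i = rng.randint(-32768, 32767)    # integer operand, 16-bit range
        f = rng.uniform(-1.0e6, 1.0e6)    # finite floating-point operand
        if i * 0 != 0:
            bad += 1
        # -0.0 compares equal to 0.0, so a negative operand is not a false alarm
        if f * 0.0 != 0.0:
            bad += 1
    return bad


if __name__ == "__main__":
    print("non-zero products:", count_bad_zero_products())
```

Running the same script on the 4.2 and 4.3 boxes and comparing the counts
would be one cheap data point before digging into the libc or the compiler.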
-Greg