[Asterisk-Users] Centos 4.3 Issues

Greg Oliver goliver at cistera.com
Mon May 22 09:45:45 MST 2006


On Mon, 2006-05-22 at 12:16 -0400, Greg Boehnlein wrote:
> Hello,
> 	I was wondering if anyone out there is successfully running 
> Asterisk 1.2 svn w/ Centos 4.3. I had an experience over the last two 
> weeks that has me scratching my head and muttering strange things in the 
> wee hours of the morning. I am going to try and be as descriptive as my 
> brain will allow right now, but if there is something that I do not cover, 
> please do not hesitate to ask and I'll be happy to answer.
> 
> 	For the last 2 years, I have been running a mixture of Tao Linux 
> and Centos (both RHEL derivatives) on our production boxes. Asterisk has 
> run flawlessly on all installations. Last week, I updated one of our 
> gateway boxes from Centos 4.2 (under which it ran for 6 months without 
> issue) to the new 4.3 code. Almost immediately, we began to experience 
> problems. Asterisk would core w/ the following:
> 
> #0  0x004878ab in test_err () from 
> /usr/lib/asterisk/modules/codec_g729a.so
> 
> 	The segfaults would happen under very light loads, in some cases 
> with just a single call. Kevin was able to log in to the box, and put a 
> debugging version of codec_g729 on the box. He determined that the problem 
> was that the values that were being returned in that routine were 
> incorrect, i.e. something in the system was returning a non-zero value 
> when multiplying a number by "0". Barring any other explanations, we 
> assumed that there was a hardware issue somewhere, either in the memory, 
> or the FPU on the CPU.
> 	So, we replaced the box w/ a brand new system running a 
> dual-core Pentium D 920. We loaded the 32 bit version of Centos 4.3 onto 
> the box and proceeded to start testing. BAM.. same problem.. the backtrace 
> showed the failure in the same routine.
> 	We scratched our heads, and after many hours of trying various 
> things (backing off the kernel to 2.6.9-22) and even moving to the new 
> development kernel 2.6.9-34.19 (from the testing tree) we could do nothing 
> to solve the issue.
> 	Mind you, this is the exact same behavior on two different 
> hardware platforms running the exact same distribution. We even loaded up 
> a third box and could reproduce the behavior on it as well. Three 
> different boxes, one common distribution.
> 
> 	As a test, we installed Fedora Core 5 x86_64 on the new Dual Core 
> box and ran extensive tests overnight, simulating 96 channels doing G729 
> to Ulaw transcoding. The box ran completely stable. No hiccups.
> 
> 	So, this morning, we put it back into the cluster, and it's now 
> taking about 200 concurrent calls, doing an insane amount of transcoding 
> and it is working just fine. Before, it would have cored in the first 
> couple of minutes.
> 
> I'm scratching my head here, because I generally have had excellent 
> experiences with Centos. However, I have NO idea what might be the issue 
> here. Could it be the kernel? (We tried three different ones!). Could it 
> be the libc? Maybe it is the compiler?
> 
> In any case, if anyone is having success with Centos 4.3 (32 bit), please 
> speak up. I'd like to get to the bottom of it. I generally do not like to 
> run Fedora on production equipment as it is generally bleeding edge. In 
> this case, FC5 is running 2.6.16 something..
> 

Have you tried compiling statically on CentOS 4.2 and running on 4.3?

I am assuming you have made sure the dist is up to date with patches.
We do not use G.729, so I cannot try it out for you, but we do use CentOS.
Is it only w/ SVN, or all releases of Asterisk (*)?
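
One way to check the "non-zero result when multiplying by 0" symptom
outside of Asterisk would be a small standalone test built with the same
compiler and libc on the 4.3 box. The following is only a rough sketch:
the function name count_bad_zero_products is made up for illustration,
and it exercises plain int and double multiplies, which may or may not
be the same operations the codec uses in test_err().

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of Asterisk or the codec): multiplies a
 * range of values by zero and returns how many times the product came
 * back non-zero. On a healthy system the result should be 0.
 */
long count_bad_zero_products(long iterations)
{
    /* volatile keeps the compiler from folding the multiplies away */
    volatile int izero = 0;
    volatile double dzero = 0.0;
    long bad = 0;
    long i;

    for (i = 1; i <= iterations; i++) {
        volatile int ival = (int)(i % 32767);
        volatile double dval = (double)i * 1.5;

        if (ival * izero != 0) {
            printf("int: %d * 0 was non-zero\n", ival);
            bad++;
        }
        if (dval * dzero != 0.0) {
            printf("double: %f * 0 was non-zero\n", dval);
            bad++;
        }
    }
    return bad;
}
```

Calling count_bad_zero_products(1000000L) from a trivial main() and
printing the return value should be enough. If it reports anything other
than 0 on 4.3 but not on 4.2 or FC5, that would point away from the
codec itself; if it stays at 0, it does not rule anything out.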

-Greg
