[asterisk-users] Asterisk 1.4 reliability problems

Ben Willcox ben.willcox at british-gymnastics.org
Tue Mar 18 10:40:53 CDT 2008


Hi All,

Thanks for all the replies. My responses to each are below:

On Tue, 2008-03-18 at 06:13 -0400, Al Baker wrote:
> Curious, you mention "a number of problems" that have "gone on for months"
> Question:  Have you reported ANY or ALL of them to DIGIUM and if so
>                   what has been their response on each of these problems ?

We have been working very closely with the reseller that supplied us
with the system, and although we have made progress over this time and
they have given us a lot of technical support, I now feel that it will
be quicker to progress the current issues independently. I don't know if
the issues were escalated as far as Digium though.

Tzafrir Cohen wrote:
> The symptoms you mention suggest some sort of deadlock. Please enable
> debug and the full log. Maybe this will provide some hints. But please
> check that the full log is rotated in /etc/logrotate.d/asterisk .
> 
> Can you reproduce this situation? e.g.: by extensive usage of the
> manager interface? If so, it might help for testing.

I will enable full debug logging. I suspect that we could reproduce the
original problem with the manager interface by stress testing it with
multiple connections, but I'm not sure if this is the same problem that
we are currently experiencing.
I also want to avoid causing problems on our production system, as it
is rather 'delicate' as far as the users are concerned at the moment.
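
For anyone wanting to try the reproduction on a test box (not on a
production system), here is a minimal sketch of what "stress testing
with multiple connections" could look like. It assumes the manager
interface is listening on the default port 5038 and that a manager user
exists in manager.conf; the function names, the host, and the
credentials are all hypothetical, and it only logs in and out rather
than replaying whatever our real clients do.

```python
import socket
import threading


def ami_action(name, **headers):
    """Format one manager action: 'Key: Value' lines ended by a blank line."""
    lines = ["Action: %s" % name]
    lines += ["%s: %s" % (key, value) for key, value in headers.items()]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def one_session(host, port, username, secret, results, index):
    """Open one manager connection, log in, log off, and record the outcome."""
    try:
        with socket.create_connection((host, port), timeout=5) as sock:
            sock.recv(1024)  # greeting banner sent by the manager interface
            sock.sendall(ami_action("Login", Username=username, Secret=secret))
            reply = sock.recv(4096).decode("ascii", "replace")
            sock.sendall(ami_action("Logoff"))
            results[index] = "Success" in reply
    except OSError as exc:  # refused, timed out, reset, ...
        results[index] = exc


def stress(host, port, username, secret, connections=20):
    """Run `connections` sessions at the same time; return one result each."""
    results = [None] * connections
    threads = [
        threading.Thread(
            target=one_session,
            args=(host, port, username, secret, results, i),
        )
        for i in range(connections)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# Hypothetical usage against a TEST machine only:
# print(stress("192.0.2.10", 5038, "testuser", "testsecret", connections=50))
```

Each entry in the returned list is True, False, or the exception that
was raised, so a hang or refusal on some connections should show up as
timeouts or errors rather than being silently lost.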

Steve Totaro wrote:
> Why not try a different OS such as CentOS for now?  That would be my
> next step.

I have considered this, at least to establish whether it is a
Debian-specific problem, either with the asterisk packages themselves, or some
other configuration or package issue. I am umming and ahhing between
this and Gordon's suggestion below:

Gordon Henderson wrote:
> Personally, I'd go back to Debian, but stick to stable (Etch) and
> then compile and install a custom kernel tailored exactly to your
> hardware, then compile and install your own asterisk from source.

I'm thinking that this may be the way I should go, then I will have the
freedom to install any version of asterisk that I need, whilst also
keeping my favourite distro.

Doug Lytle wrote:
> Two things,
> 
> 1.)  On your queue setup, avoid using AgentCallbackLogin, it's known
>      to cause deadlocked channels.
> 2.)  Restart the Asterisk service once a week.  I do this via a CRON
>      job at 3am on Sundays.

We're actually not using Agents on our queues, just SIP channels, so
hopefully this is not the problem. We simulate 'agents' logging in and
out by pausing and unpausing queue members.
I am now going to add a cron job to restart asterisk daily, in the hope
that until the problem is resolved properly, at least it will help
relieve some of the pain by making it stable for a full 24hrs at a time.
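
To illustrate the pause/unpause approach, here is a rough sketch of the
manager message involved, assuming it is done with the QueuePause
action over the manager interface. The helper name, the interface
"SIP/1001" and the queue name "support" are made up for the example; it
only builds the text of the action and leaves sending it over an
authenticated manager connection to the caller.

```python
def queue_pause_action(interface, paused, queue=None):
    """Build the text of a QueuePause manager action.

    interface: the queue member, e.g. "SIP/1001" (hypothetical)
    paused:    True to pause ('log out'), False to unpause ('log in')
    queue:     optional queue name; if omitted, no Queue header is sent
    """
    lines = [
        "Action: QueuePause",
        "Interface: %s" % interface,
        "Paused: %s" % ("true" if paused else "false"),
    ]
    if queue is not None:
        lines.append("Queue: %s" % queue)
    # Manager messages are CRLF-separated and end with an empty line.
    return "\r\n".join(lines) + "\r\n\r\n"


# 'Agent' on SIP/1001 goes on a break from the (hypothetical) support queue:
print(queue_pause_action("SIP/1001", True, queue="support"))
```

Sending the same action with paused=False would put the member back
into rotation, which is the 'log in' half of the simulation.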

Matt Florell wrote:
> I would suggest upgrading to at least 1.4.18. I was able to run it for
> about 2 weeks and almost one million calls before I could get it to
> crash, and the 1.4.19RC2 seems to fix even more of the locking issues
> as well. I know a lot of these problems still existed under 1.4.17.

A million calls sounds good, but 2 weeks, not so good. It's a bit
disappointing to me that crashing /ever/ is acceptable; I had always
understood that asterisk was supposed to be rock-solid. I suppose
it's some consolation that it's not just me who has problems!

Thanks for all the input. In the short term I will restart asterisk
daily; the action plan after that is to revert to Debian Etch and
install asterisk 1.4.18 from source, which will hopefully improve
things.

Thanks,
Ben


