[asterisk-bugs] [Asterisk 0009788]: Deadlock problem with agents, queues and libpri (stop accepting incoming calls in PRI lines)

noreply at bugs.digium.com noreply at bugs.digium.com
Tue Sep 11 05:39:06 CDT 2007


The following issue has been REOPENED. 
====================================================================== 
http://bugs.digium.com/view.php?id=9788 
====================================================================== 
Reported By:                Ted Brown
Assigned To:                russell
====================================================================== 
Project:                    Asterisk
Issue ID:                   9788
Category:                   Addons/General
Reproducibility:            sometimes
Severity:                   crash
Priority:                   normal
Status:                     feedback
Asterisk Version:           1.4.10.1  
SVN Branch (only for SVN checkouts, not tarball releases): N/A  
SVN Revision (number only!):  
Disclaimer on File?:        No 
Request Review:              
====================================================================== 
Date Submitted:             05-23-2007 18:18 CDT
Last Modified:              09-11-2007 05:39 CDT
====================================================================== 
Summary:                    Deadlock problem with agents, queues and libpri
(stop accepting incoming calls in PRI lines)
Description: 
I have an Asterisk-based call center deployment with around 40 SIP users,
handling incoming calls from two PRI lines (2xE1) using agents and
queues.

The problem is that Asterisk stops accepting new incoming calls on the PRI
lines for no apparent reason. There should be free channels available for
new incoming calls, but Asterisk thinks these channels are in use.
SIP calls between internal users can still be placed without problems.

PRI lines shouldn't be the origin of the problem, as an old legacy PBX
works perfectly with the same lines, so the problem seems to be related
to agents or queues.

After the crash, running "zap show channels" shows that all channels
are busy, and calls appear to have been queued for a long time in
different queues (they are not really there - users don't usually wait
90 minutes to be attended while listening to the music on hold).

There are no other services running on the server, CDR is being stored to
disk and we are not using any kind of AGIs or reporting tools. Currently
the only solution is to reboot the machine, as restarting Asterisk is not
enough. Running any command on the CLI produces no output at all.

The crash is not easily reproducible, as it doesn't follow a clear
pattern. Asterisk just seems to get blocked when it is handling around 30-40
calls in the queues. During the last week, we had 2-3 crashes each day.

Based on posts to the users mailing list, it seems that other users have
had a similar problem in the same scenario, at least with 1.2.x. More
precisely, we have observed the same problem in bug ID 0006147, but it has
been closed without a clear answer.

Hardware and software specs:

 Platform: Suse Linux Enterprise Server 10
 Machine: IBM xSeries 226, 1 GB RAM, Intel CPU
 PRI card: Digium TE212 with echo cancellation module
 Asterisk version: 1.2.18

Below is a list of the most relevant messages before and after the crash:

DEBUG[28519] chan_sip.c: Stopping retransmission on
'NzNmZWM0ZDc0OTYyNWI5YWM2ZTBhZjY3NDM4N2RjNmQ.' of Response 12: Match Found 
(lots of messages like that)

DEBUG[28511] chan_zap.c: Ring requested on channel 0/13 already in use or
previously requested on span 1.  Attempting to renegotiating channel.

DEBUG[28511] chan_zap.c: Found empty available channel 0/9

DEBUG[29939] app_dial.c: Exiting with DIALSTATUS=CONGESTION.

I would very much appreciate any help on this. I can provide a backtrace
if needed.

Best regards,
====================================================================== 

---------------------------------------------------------------------- 
 Ted Brown - 09-11-07 05:39  
---------------------------------------------------------------------- 
Hi again,

I regret to report that the problem has appeared again, using the
following versions:

 - Asterisk 1.4.11
 - Libpri 1.4.1
 - Zaptel 1.4.5.1

The scenario remains the same as before:

 - 2 PRI E1 links to the telco (using TE212P)
 - 40 SIP users in the LAN (using Eyebeam)
 - Using queues and agents to dispatch calls to SIP users

CPU and RAM levels are OK, and there are no other processes or services
running on the machine (no MySQL server, no Apache, etc...). For no
apparent reason, Asterisk crashes with a segmentation fault error. Below is
the content of the /var/log/messages file regarding the crashes. Sometimes
it takes 2-3 hours to crash, but yesterday we got 3 crashes in less than 10
minutes:

Sep 10 16:32:19 pbx kernel: asterisk[20940] general protection rip:433aff
rsp:40bf2128 error:0
Sep 10 18:46:14 pbx kernel: asterisk[17391] general protection rip:433aff
rsp:40a12128 error:0
Sep 10 19:01:09 pbx kernel: asterisk[21651]: segfault at 00002ae254525f50
rip 00002ae254525f50 rsp 0000000040d1e128 error 15
Sep 10 19:03:07 pbx kernel: asterisk[21699]: segfault at 00000000000000b8
rip 0000000000433aff rsp 00000000406c6958 error 4
Sep 10 19:06:53 pbx kernel: asterisk[22497]: segfault at 00000000000000b8
rip 0000000000433aff rsp 00000000406c6958 error 4
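
As a rough triage aid for the entries above, the following Python sketch
groups the faults by instruction pointer and decodes the page-fault error
codes. It is only an illustration, not part of the original report: the
function names (parse_faults, decode_page_fault) are hypothetical, the
regular expression was written against these five entries only, and it
assumes the kernel prints the error code in hexadecimal. The bit meanings
are the standard x86 page-fault error-code bits.

```python
import re
from collections import Counter

# The five kernel log entries from the report, wrapped as in the mail.
LOG = """\
Sep 10 16:32:19 pbx kernel: asterisk[20940] general protection rip:433aff
rsp:40bf2128 error:0
Sep 10 18:46:14 pbx kernel: asterisk[17391] general protection rip:433aff
rsp:40a12128 error:0
Sep 10 19:01:09 pbx kernel: asterisk[21651]: segfault at 00002ae254525f50
rip 00002ae254525f50 rsp 0000000040d1e128 error 15
Sep 10 19:03:07 pbx kernel: asterisk[21699]: segfault at 00000000000000b8
rip 0000000000433aff rsp 00000000406c6958 error 4
Sep 10 19:06:53 pbx kernel: asterisk[22497]: segfault at 00000000000000b8
rip 0000000000433aff rsp 00000000406c6958 error 4
"""

# Matches both the "general protection" and the "segfault at" variants.
PATTERN = re.compile(
    r"asterisk\[(?P<pid>\d+)\]:?\s+"
    r"(?P<kind>general protection|segfault)"
    r"(?:\s+at\s+(?P<addr>[0-9a-f]+))?"
    r"\s+rip[: ](?P<rip>[0-9a-f]+)"
    r"\s+rsp[: ](?P<rsp>[0-9a-f]+)"
    r"\s+error[: ](?P<err>[0-9a-f]+)"
)


def parse_faults(text):
    """Return one dict per fault entry, tolerating mail line wrapping."""
    flat = " ".join(text.split())  # undo the line wrapping
    return [
        {
            "pid": int(m["pid"]),
            "kind": m["kind"],
            "rip": int(m["rip"], 16),
            # Assumption: the kernel prints the error code in hex.
            "err": int(m["err"], 16),
        }
        for m in PATTERN.finditer(flat)
    ]


def decode_page_fault(err):
    """Decode x86 page-fault error-code bits (meaningful for segfaults only)."""
    flags = [
        "protection violation" if err & 0x1 else "page not present",
        "write" if err & 0x2 else "read",
        "user mode" if err & 0x4 else "kernel mode",
    ]
    if err & 0x10:
        flags.append("instruction fetch")
    return flags


faults = parse_faults(LOG)
by_rip = Counter(hex(f["rip"]) for f in faults)

for rip, count in by_rip.most_common():
    print(rip, count)
for f in faults:
    if f["kind"] == "segfault":
        print(f["pid"], hex(f["err"]), decode_page_fault(f["err"]))
```

Grouping by rip makes it easy to see whether the crashes share a faulting
instruction; that address can then be looked up in a non-stripped asterisk
binary or in the attached GDB output.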

Do not hesitate to contact us in case you need further information. I
have attached several GDB outputs.

Issue History 
Date Modified   Username       Field                    Change               
====================================================================== 
09-11-07 05:39  Ted Brown      Note Added: 0070305                          
======================================================================



