[asterisk-bugs] [Asterisk 0010605]: 'Unknown' member status in app_queue
noreply at bugs.digium.com
Thu Oct 11 10:18:55 CDT 2007
A NOTE has been added to this issue.
======================================================================
http://bugs.digium.com/view.php?id=10605
======================================================================
Reported By: jfitzgibbon
Assigned To:
======================================================================
Project: Asterisk
Issue ID: 10605
Category: Applications/app_queue
Reproducibility: random
Severity: major
Priority: normal
Status: new
Asterisk Version: 1.4.10.1
SVN Branch (only for SVN checkouts, not tarball releases): N/A
SVN Revision (number only!):
Disclaimer on File?: N/A
Request Review:
======================================================================
Date Submitted: 08-30-2007 08:23 CDT
Last Modified: 10-11-2007 10:18 CDT
======================================================================
Summary: 'Unknown' member status in app_queue
Description:
Without any obvious trigger, a large number of the members in all of my
queues changed to a status of 'Unknown':
cs_billing has 11 calls (max unlimited) in 'rrmemory' strategy (114s holdtime), W:0, C:1771, A:58, SL:93.8% within 60s
Members:
SIP/1405 (dynamic) (Unknown) has taken no calls yet
SIP/1420 (dynamic) (paused) (Not in use) has taken no calls yet
SIP/1442 (dynamic) (paused) (Unknown) has taken 3 calls (last was 1523 secs ago)
SIP/1440 (dynamic) (In use) has taken 4 calls (last was 1262 secs ago)
SIP/1428 (dynamic) (paused) (Not in use) has taken 7 calls (last was 2969 secs ago)
SIP/1404 (dynamic) (paused) (Not in use) has taken 7 calls (last was 2918 secs ago)
SIP/1429 (dynamic) (paused) (Unknown) has taken 17 calls (last was 5 secs ago)
SIP/1432 (dynamic) (Unavailable) has taken 17 calls (last was 965 secs ago)
SIP/1430 (dynamic) (In use) has taken 15 calls (last was 3506 secs ago)
SIP/1435 (dynamic) (In use) has taken 17 calls (last was 1808 secs ago)
SIP/1434 (dynamic) (Unavailable) has taken 19 calls (last was 827 secs ago)
SIP/1424 (dynamic) (In use) has taken 24 calls (last was 1277 secs ago)
SIP/1408 (dynamic) (paused) (Not in use) has taken 22 calls (last was 2770 secs ago)
SIP/1203 (dynamic) (In use) has taken 16 calls (last was 1730 secs ago)
SIP/1410 (dynamic) (Unknown) has taken 20 calls (last was 292 secs ago)
Callers:
1. Zap/60-1 (wait: 8:51, prio: 0)
2. Zap/65-1 (wait: 6:04, prio: 0)
3. Zap/71-1 (wait: 5:50, prio: 0)
4. Zap/69-1 (wait: 5:22, prio: 0)
5. Zap/26-1 (wait: 4:51, prio: 0)
6. Zap/28-1 (wait: 4:14, prio: 0)
7. Zap/27-1 (wait: 3:33, prio: 0)
8. Zap/30-1 (wait: 2:45, prio: 0)
9. Zap/33-1 (wait: 1:58, prio: 0)
10. Zap/34-1 (wait: 1:48, prio: 0)
11. Zap/35-1 (wait: 1:21, prio: 0)
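Since the condition appears without warning, a small script that tallies member states from a listing like the one above may help catch it early. The following is only a rough sketch: the function name member_states is made up for illustration, and it assumes that the last parenthesised flag before "has taken" is the device state, which matches the output shown here but has not been checked against other Asterisk versions.

```python
import re
from collections import Counter

# Matches the start of a member line such as:
#   SIP/1442 (dynamic) (paused) (Unknown) has taken 3 calls ...
# Caller lines and wrapped continuation lines do not match.
MEMBER_RE = re.compile(
    r"^\s*(?P<iface>\S+/\S+)\s+(?P<flags>(?:\([^)]*\)\s*)+)has taken"
)

def member_states(listing):
    """Map each member interface to its reported device state.

    Assumption: the device state is the last parenthesised flag
    before 'has taken', as in the listing in this report.
    """
    states = {}
    for line in listing.splitlines():
        m = MEMBER_RE.match(line)
        if m:
            flags = re.findall(r"\(([^)]*)\)", m.group("flags"))
            states[m.group("iface")] = flags[-1]
    return states

# A few member lines copied from the listing in this report.
sample = """\
SIP/1405 (dynamic) (Unknown) has taken no calls yet
SIP/1420 (dynamic) (paused) (Not in use) has taken no calls yet
SIP/1442 (dynamic) (paused) (Unknown) has taken 3 calls (last was 1523 secs ago)
SIP/1440 (dynamic) (In use) has taken 4 calls (last was 1262 secs ago)
"""

states = member_states(sample)
unknown = sorted(i for i, s in states.items() if s == "Unknown")
print(Counter(states.values()))
print("Unknown members:", unknown)
```

Feeding it the captured CLI output at intervals and alerting when the 'Unknown' count rises would be one way to use it; how the output is captured is left open here.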
This has happened once before (when we were running 1.4.9) just over a
month ago. I was unable to reproduce the behaviour in a lab environment.
When this happens, ringinuse=no stops being effective, because 'Unknown'
members are considered available to take a call. app_queue starts to
send calls to agents who are already on a call. The SIP channels of the
agents have a call-limit of 2, so when this happens the log fills up with:
pbxtel-01*CLI>
[Aug 29 16:43:08] ERROR[22762]: chan_sip.c:3169 update_call_counter: Call to peer '1410' rejected due to usage limit of 2
-- Couldn't call SIP/1410
pbxtel-01*CLI>
[Aug 29 16:43:08] ERROR[22762]: chan_sip.c:3169 update_call_counter: Call to peer '1429' rejected due to usage limit of 2
-- Couldn't call SIP/1429
pbxtel-01*CLI>
[Aug 29 16:43:09] ERROR[22851]: chan_sip.c:3169 update_call_counter: Call to peer '1429' rejected due to usage limit of 2
-- Couldn't call SIP/1429
pbxtel-01*CLI>
[Aug 29 16:43:09] ERROR[22851]: chan_sip.c:3169 update_call_counter: Call to peer '1410' rejected due to usage limit of 2
-- Couldn't call SIP/1410
pbxtel-01*CLI>
[Aug 29 16:43:09] ERROR[22712]: chan_sip.c:3169 update_call_counter: Call to peer '1429' rejected due to usage limit of 2
-- Couldn't call SIP/1429
pbxtel-01*CLI>
[Aug 29 16:43:09] ERROR[22712]: chan_sip.c:3169 update_call_counter: Call to peer '1410' rejected due to usage limit of 2
-- Couldn't call SIP/1410
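The interaction described above can be modelled roughly as follows. This is a simplified sketch of the behaviour as I observe it, not the app_queue or chan_sip source: the Member class, would_ring and accepts_call are hypothetical names, and the rules (with ringinuse=no only 'Not in use' and 'Unknown' members are rung; a peer refuses a call once it has reached its call-limit) are assumptions drawn from the symptoms in this report.

```python
from dataclasses import dataclass

@dataclass
class Member:
    interface: str
    reported_state: str  # what the queue believes about the device
    active_calls: int    # what is actually happening on the device

def would_ring(member, ringinuse=False):
    """Assumed queue-side check: with ringinuse off, only members the
    queue believes are idle, or whose state is Unknown, get rung."""
    if ringinuse:
        return True  # simplification: state is not used as a filter
    return member.reported_state in ("Not in use", "Unknown")

def accepts_call(member, call_limit=2):
    """Assumed channel-side check: refuse once the limit is reached."""
    return member.active_calls < call_limit

# Hypothetical snapshot: 1410 is busy but the queue reports it as Unknown.
busy_unknown = Member("SIP/1410", "Unknown", active_calls=2)
busy_known = Member("SIP/1440", "In use", active_calls=1)

for m in (busy_unknown, busy_known):
    if would_ring(m):
        outcome = "accepted" if accepts_call(m) else "rejected (usage limit)"
        print(m.interface, "rung:", outcome)
    else:
        print(m.interface, "skipped by queue")
```

Under these assumptions a busy member whose state reads 'Unknown' passes the queue-side check and is only stopped by the call-limit, which would account for the repeated rejections in the log.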
Removing and re-adding agents does not fix things - they go back
into an Unknown state as soon as they have completed a call.
The only way I could resolve the issue was to restart Asterisk. I killed
the running process to generate a core file, which is attached. The
tarball also contains a full backtrace and a copy of the asterisk binary,
which is from a 'Linux pbxtel-01.comwave 2.6.9-55.ELsmp #1 SMP Wed May 2
14:04:42 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux' system running CentOS
4.5.
There is nothing obvious in the logs preceding the trouble to indicate
why agents were marked as 'Unknown'.
======================================================================
----------------------------------------------------------------------
jfitzgibbon - 10-11-07 10:18
----------------------------------------------------------------------
Management has me locked to 1.4.7.1 (which has not exhibited *any* problems
since we rolled back), so I can't tell if things have been fixed in
1.4.12/.13/SVN.
The plan right now is to wait for 1.4 ABE. I know that doesn't help
squash this in any way, but since the bug (and others I have with
app_queue) only manifest under production load, my hands are tied.
This should probably be closed; if I at some point in the future get
permission to try later revisions I'll re-open it or re-file, whatever
Mantis allows.
Thanks
Issue History
Date Modified Username Field Change
======================================================================
10-11-07 10:18 jfitzgibbon Note Added: 0071817
======================================================================