[asterisk-bugs] [Asterisk 0010199]: Random replacement of channel name with other text in queue log entries
noreply at bugs.digium.com
noreply at bugs.digium.com
Fri Aug 10 09:04:05 CDT 2007
A NOTE has been added to this issue.
======================================================================
http://bugs.digium.com/view.php?id=10199
======================================================================
Reported By: jfitzgibbon
Assigned To: putnopvut
======================================================================
Project: Asterisk
Issue ID: 10199
Category: Applications/app_queue
Reproducibility: random
Severity: major
Priority: normal
Status: feedback
Asterisk Version: 1.4.7.1
SVN Branch (only for SVN checkouts, not tarball releases): N/A
SVN Revision (number only!):
Disclaimer on File?: No
Request Review:
======================================================================
Date Submitted: 07-13-2007 08:44 CDT
Last Modified: 08-10-2007 09:04 CDT
======================================================================
Summary: Random replacement of channel name with other text
in queue log entries
Description:
On 1.4.5, 1.4.6 and 1.4.7.1, I have observed random replacement of the
channel name with other text in my queue log. Typically the other text is
part of a manager event, suggesting that a pointer is getting corrupted
somewhere. It happens on a very small percentage of calls, but there are
consistent elements to the corruption from my observations:
- it only happens on one of my eleven queues. Of the eleven, four others
are configured almost identically to the queue on which the corruption is
observed (the only differences are queue name and wrapuptime length). The
problem queue is listed second in queues.conf, and does not sort lexically
to the top or bottom of a list of queue names.
- the corruption first appears on the CONNECT event
- the channel name remains corrupted for the duration of the call (i.e.
you never see a good ENTERQUEUE, bad CONNECT, then a good COMPLETECALLER)
- the replacement text is not consistent. It can differ between the
CONNECT and TRANSFER events (see example 1)
- if the call is transferred to an extension which enqueues the caller to
another queue, the corruption is cleared, and does not typically re-appear
(see example 1)
I cannot reproduce the bug on demand in my lab environment. On a call
center fielding about 5000 calls per day, I see this on 5-10 of those
calls. The main impact of this is that queue analysis programs never see
the CONNECT or COMPLETEXXXXXX events, so they think that any corrupted call
is waiting forever. Having such a large wait time for several calls on a
queue knocks statistics out of whack.
======================================================================
----------------------------------------------------------------------
jfitzgibbon - 08-10-07 09:04
----------------------------------------------------------------------
Having upgraded a few times since I last posted on this ticket, we're still
seeing it, but some days we don't see it at all (which I can't really
explain). Here's the distribution for the last 28 days:
[root at pbxtel-01 james]# zgrep -c '^[|]' /var/log/asterisk/queue_log.*.gz |
sort
/var/log/asterisk/queue_log.1.gz:6
/var/log/asterisk/queue_log.2.gz:0
/var/log/asterisk/queue_log.3.gz:20
/var/log/asterisk/queue_log.4.gz:7
/var/log/asterisk/queue_log.5.gz:0
/var/log/asterisk/queue_log.6.gz:0
/var/log/asterisk/queue_log.7.gz:0
/var/log/asterisk/queue_log.8.gz:4
/var/log/asterisk/queue_log.9.gz:7
/var/log/asterisk/queue_log.10.gz:6
/var/log/asterisk/queue_log.11.gz:5
/var/log/asterisk/queue_log.12.gz:0
/var/log/asterisk/queue_log.13.gz:1
/var/log/asterisk/queue_log.14.gz:3
/var/log/asterisk/queue_log.15.gz:2
/var/log/asterisk/queue_log.16.gz:0
/var/log/asterisk/queue_log.17.gz:2
/var/log/asterisk/queue_log.18.gz:2
/var/log/asterisk/queue_log.19.gz:0
/var/log/asterisk/queue_log.20.gz:3
/var/log/asterisk/queue_log.21.gz:6
/var/log/asterisk/queue_log.22.gz:5
/var/log/asterisk/queue_log.23.gz:9
/var/log/asterisk/queue_log.24.gz:6
/var/log/asterisk/queue_log.25.gz:46
/var/log/asterisk/queue_log.26.gz:3
/var/log/asterisk/queue_log.27.gz:15
/var/log/asterisk/queue_log.28.gz:3
[root at pbxtel-01 james]#
And here's the detail for yesterday:
[root at pbxtel-01 james]# zgrep '^[^0-9]' /var/log/asterisk/queue_log.1.gz
|CONNECT|49|1186701799.8650
|TRANSFER|7202|from-internal-salesin|49|17
|CONNECT|47|1186702301.8751
|TRANSFER|7202|from-internal-salesout-manual|47|29
Context: macro-addtovar
Extension: s
Priority: 1
Application: GotoIf
AppData: 1?first
Uniqueid: 1186703272.8947
|TRANSFER|7202|from-internal-salesout-manual|14|32
Context: queuemetrics
Extension: pause
Priority: 3
Application: AGI
AppData: cw-agentpause.agi|--mode=pause|--agent=1402|--pausereason=32
Uniqueid: 1186709573.10249
|COMPLETEAGENT|18|349|1
[root at pbxtel-01 james]#
We're now running 1.4.10 (with app_queue from the 1.4 SVN branch to pick
up the fix for bug 10320)
Any idea what direction to take this in?
Issue History
Date Modified Username Field Change
======================================================================
08-10-07 09:04 jfitzgibbon Note Added: 0068714
======================================================================
More information about the asterisk-bugs
mailing list