[asterisk-bugs] [Asterisk 0006176]: [patch] Segfault in pbx_builtin_setvar_helper. Looks like the channel is being closed down underneath us
Asterisk Bug Tracker
noreply at bugs.digium.com
Wed Feb 11 14:55:51 CST 2009
A NOTE has been added to this issue.
======================================================================
http://bugs.digium.com/view.php?id=6176
======================================================================
Reported By: stevedavies
Assigned To: Corydon76
======================================================================
Project: Asterisk
Issue ID: 6176
Category: Applications/app_macro
Reproducibility: random
Severity: crash
Priority: normal
Status: closed
Asterisk Version: SVN
Regression: No
SVN Branch (only for SVN checkouts, not tarball releases): trunk
SVN Revision (number only!): 9156
Request Review:
Resolution: fixed
Fixed in Version:
======================================================================
Date Submitted: 2006-01-09 04:09 CST
Last Modified: 2009-02-11 14:55 CST
======================================================================
Summary: [patch] Segfault in pbx_builtin_setvar_helper.
Looks like the channel is being closed down underneath us
Description:
Site had a segfault. gdb backtraces attached.
I had a look at what happened...
Segfault was in pbx_builtin_setvar_helper, whilst executing in the
AST_LIST_REMOVE macro.
pbx_builtin_setvar_helper was called by macro_exec, which was trying to
set MACRO_DEPTH on the channel.
Guilty code at pbx.c around line 5751 (SVN trunk, r7839):
    AST_LIST_TRAVERSE (headp, newvariable, entries) {
        if (strcasecmp(ast_var_name(newvariable), nametail) == 0) {
            /* there is already such a variable, delete it */
            AST_LIST_REMOVE(headp, newvariable, entries);
            ast_var_delete(newvariable);
            break;
        }
    }
headp here points at varshead in the channel structure.
Apparently at the time the list was traversed, there must have been
variables on the channel - given that an existing MACRO_DEPTH was found to
delete.
However, by the time we got into the AST_LIST_REMOVE code, curelm was null
(which comes from (head)->first, i.e. varshead.first).
(gdb) frame 0
#0  0x080900a2 in pbx_builtin_setvar_helper (chan=0x88d8a30,
    name=0x72734e "MACRO_DEPTH", value=0xb79ee2b0 "1") at pbx.c:5941
5941            AST_LIST_REMOVE(headp, newvariable, entries);
(gdb) p curelm
$2 = (struct ast_var_t *) 0x0
Here's AST_LIST_REMOVE:
#define AST_LIST_REMOVE(head, elm, field) do {              \
    if ((head)->first == (elm)) {                           \
        (head)->first = (elm)->field.next;                  \
    }                                                       \
    else {                                                  \
        typeof(elm) curelm = (head)->first;                 \
        while (curelm->field.next != (elm))                 \
            curelm = curelm->field.next;                    \
        curelm->field.next = (elm)->field.next;             \
    }                                                       \
    if ((head)->last == elm)                                \
        (head)->last = NULL;                                \
} while (0)
So my theory is that the channel was busy being dismantled as we tried to
remove the variable.
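To make the failure concrete: in the else branch above, curelm starts at
(head)->first and is dereferenced without a check, so an emptied list
(first == NULL) crashes on the first curelm->field.next. Below is a
standalone sketch of the same kind of list with a remove that stops at the
end of the list instead. struct var, struct var_list and both functions are
simplified stand-ins I made up for illustration, not the Asterisk types or
macros. It also points last at the previous element rather than NULL, which
I am assuming is what a tail pointer should hold.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for ast_var_t and varshead. */
struct var {
    const char *name;
    struct var *next;
};

struct var_list {
    struct var *first;
    struct var *last;
};

/* Append elm to the tail of the list. */
static void var_list_append(struct var_list *head, struct var *elm)
{
    assert(head != NULL && elm != NULL);
    elm->next = NULL;
    if (head->last)
        head->last->next = elm;
    else
        head->first = elm;
    head->last = elm;
}

/*
 * Unlink elm from the list.
 * Returns 1 if elm was found and unlinked, 0 if it is not on the list
 * (which includes the case where the list was emptied in the meantime).
 */
static int var_list_remove_safe(struct var_list *head, struct var *elm)
{
    struct var *prev = NULL;
    struct var *cur;

    assert(head != NULL && elm != NULL);

    /* The loop condition is the guard the original macro lacks. */
    for (cur = head->first; cur != NULL; prev = cur, cur = cur->next) {
        if (cur != elm)
            continue;
        if (prev)
            prev->next = elm->next;
        else
            head->first = elm->next;
        if (head->last == elm)
            head->last = prev;  /* new tail, or NULL if the list is now empty */
        elm->next = NULL;
        return 1;
    }
    return 0;
}
```

A guard like this only turns the crash into a "not found" result. If another
thread has already freed the element, the caller is still holding a dangling
pointer, so it does not remove the race itself.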
Here's the channel at the time of the segfault:
(gdb) frame 1
#1  0x00726701 in macro_exec (chan=0x88d8a30, data=0xb79f27e0) at
app_macro.c:253
253             pbx_builtin_setvar_helper(chan, "MACRO_DEPTH", depthc);
(gdb) p *chan
$3 = {name = "Parking/Local/91001@from-internal-74b8,2<ZOMBIE>", '\0'
<repeats 31 times>, tech = 0x80f4600, tech_pvt = 0x0,
language = "en", '\0' <repeats 17 times>, type = 0x7ee627 "Local", fds =
{-1, -1, -1, -1, -1, -1, 172, -1},
musicclass = '\0' <repeats 19 times>, music_state = 0xb6602dd8,
generatordata = 0x0, generator = 0x0, _bridge = 0x0, masq = 0x0,
masqr = 0x0, cdrflags = 0, _softhangup = 1, whentohangup = 0, blocker =
3080698800, lock = {__m_reserved = 0, __m_count = 0,
__m_owner = 0x0, __m_kind = 1, __m_lock = {__status = 0, __spinlock =
0}}, blockproc = 0x80f4a24 "ast_waitfor_nandfds",
appl = 0xb7b437b0 "Macro", data = 0xb79f27e0 "dial|30|tTrh|91001", fdno
= 6, sched = 0x8a3f2c8, streamid = -1, stream = 0x0,
vstreamid = 0, vstream = 0x0, oldwriteformat = 64, timingfd = 172,
timingfunc = 0, timingdata = 0x0, _state = 0, rings = 0,
nativeformats = 8, readformat = 8, writeformat = 8, cid = {cid_dnid =
0x0, cid_num = 0x0, cid_name = 0x0, cid_ani = 0x0,
cid_rdnis = 0x0, cid_pres = 0, cid_ani2 = 0, cid_ton = 0, cid_tns =
0},
context = "macro-dial\000-vm\000able", '\0' <repeats 60 times>,
macrocontext = "from-internal", '\0' <repeats 66 times>,
macroexten = "91001", '\0' <repeats 74 times>, macropriority = 1, exten
= "s\000001", '\0' <repeats 74 times>, priority = 10,
dtmfq = '\0' <repeats 79 times>, dtmff = {frametype = 0, subclass = 0,
datalen = 0, samples = 0, mallocd = 0, offset = 0, src = 0x0,
data = 0x0, delivery = {tv_sec = 0, tv_usec = 0}, prev = 0x0, next =
0x0}, pbx = 0x890bcf8, amaflags = 3,
accountcode = '\0' <repeats 19 times>, cdr = 0x88f6bc0, adsicpe = 0,
call_forward = '\0' <repeats 79 times>, zone = 0x0,
monitor = 0x0, insmpl = 0, outsmpl = 0, fin = 2969, fout = 1174,
uniqueid = "1136789367.968", '\0' <repeats 17 times>,
hangupcause = 16, varshead = {first = 0x0, last = 0x0}, callgroup = 0,
pickupgroup = 0, flags = 528, transfercapability = 0,
readq = 0x88a7a58, alertpipe = {-1, -1}, writetrans = 0x0, readtrans =
0x0, rawreadformat = 0, rawwriteformat = 0, spies = 0x88d9560,
next = 0x88ad3c0}
Notice the name, which carries the <ZOMBIE> suffix. Notice also that
varshead.first is NULL, and that _softhangup == 1.
My proposed fix would be two-phase.
1) Make AST_LIST_REMOVE (and friends?) more defensive.
2) Investigate adding locking to avoid this potential race.
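For point 2, here is a rough sketch of the idea: the find-and-replace of a
variable and the teardown of the variable list both take the same
per-channel lock, so neither can see the list half modified by the other.
Everything in it (struct chan, chan_setvar, chan_clear_vars, the plain
pthread mutex) is a hypothetical stand-in for illustration. I have not
checked which lock the real channel code would need to hold here.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/* Hypothetical stand-ins, not the Asterisk structures. */
struct var {
    char *name;
    char *value;
    struct var *next;
};

struct chan {
    pthread_mutex_t lock;   /* protects vars */
    struct var *vars;
};

static void chan_init(struct chan *c)
{
    pthread_mutex_init(&c->lock, NULL);
    c->vars = NULL;
}

/* Heap copy of a string, NULL on allocation failure. */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p)
        memcpy(p, s, n);
    return p;
}

static void var_free(struct var *v)
{
    free(v->name);
    free(v->value);
    free(v);
}

/* Set name=value, replacing any existing variable of that name.
 * Returns 0 on success, -1 on allocation failure. */
static int chan_setvar(struct chan *c, const char *name, const char *value)
{
    struct var **pp;
    struct var *nv = malloc(sizeof(*nv));

    if (!nv)
        return -1;
    nv->name = dup_string(name);
    nv->value = dup_string(value);
    if (!nv->name || !nv->value) {
        var_free(nv);
        return -1;
    }

    /* Search, unlink and insert all happen under one lock hold. */
    pthread_mutex_lock(&c->lock);
    for (pp = &c->vars; *pp; pp = &(*pp)->next) {
        if (strcasecmp((*pp)->name, name) == 0) {
            struct var *old = *pp;
            *pp = old->next;
            var_free(old);
            break;
        }
    }
    nv->next = c->vars;
    c->vars = nv;
    pthread_mutex_unlock(&c->lock);
    return 0;
}

/* Teardown path: drop every variable, under the same lock. */
static void chan_clear_vars(struct chan *c)
{
    pthread_mutex_lock(&c->lock);
    while (c->vars) {
        struct var *v = c->vars;
        c->vars = v->next;
        var_free(v);
    }
    pthread_mutex_unlock(&c->lock);
}
```

With this shape, a setvar that loses the race to teardown simply finds an
empty list and adds its variable. Whether setting a variable on a channel
that is already being hung up should be allowed at all is a separate
question that the sketch does not answer.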
Regards,
Steve Davies
======================================================================
----------------------------------------------------------------------
(0099937) svnbot (reporter) - 2009-02-11 14:55
http://bugs.digium.com/view.php?id=6176#c99937
----------------------------------------------------------------------
Repository: asterisk
Revision: 174886
_U trunk/
------------------------------------------------------------------------
r174886 | tilghman | 2009-02-11 14:55:47 -0600 (Wed, 11 Feb 2009) | 19 lines
Blocked revisions 174885 via svnmerge
........
r174885 | tilghman | 2009-02-11 14:54:18 -0600 (Wed, 11 Feb 2009) | 13 lines
Restore a behavior that was recently changed, when we fixed issue
http://bugs.digium.com/view.php?id=13962 and issue
http://bugs.digium.com/view.php?id=13363 (related to issue
http://bugs.digium.com/view.php?id=6176). When a hangup occurs during a Macro
execution in earlier 1.4, the h extension would execute within the Macro
context, whereas it was always supposed to execute only within the main
context (where Macro was called). So this fix checks for an "h" extension in
the deepest macro context where a hangup occurred; if it exists, that "h"
extension executes, otherwise the main context "h" is executed.
(closes issue http://bugs.digium.com/view.php?id=14122)
Reported by: wetwired
Patches:
20090210__bug14122.diff.txt uploaded by Corydon76 (license 14)
Tested by: andrew
........
------------------------------------------------------------------------
http://svn.digium.com/view/asterisk?view=rev&revision=174886
Issue History
Date Modified Username Field Change
======================================================================
2009-02-11 14:55 svnbot Checkin
2009-02-11 14:55 svnbot Note Added: 0099937
======================================================================