[asterisk-bugs] [Asterisk 0009520]: realtime prune (and others) may segfault asterisk (timing issue)
noreply at bugs.digium.com
Wed Mar 19 18:14:01 CDT 2008
A NOTE has been added to this issue.
======================================================================
http://bugs.digium.com/view.php?id=9520
======================================================================
Reported By: kryptolus
Assigned To: russell
======================================================================
Project: Asterisk
Issue ID: 9520
Category: Channels/chan_sip/Registration
Reproducibility: always
Severity: crash
Priority: normal
Status: assigned
Asterisk Version: SVN
SVN Branch (only for SVN checkouts, not tarball releases): 1.2
SVN Revision (number only!): 61305
Disclaimer on File?: Yes
Request Review:
======================================================================
Date Submitted: 04-11-2007 07:31 CDT
Last Modified: 03-19-2008 18:13 CDT
======================================================================
Summary: realtime prune (and others) may segfault asterisk (timing issue)
Description:
The function expire_register doesn't do any locking before mucking with the
peer.
If it gets pre-empted, there's a chance that the peer will be destroyed
before control returns to expire_register.
If you execute a "prune realtime peer" at the right time, Asterisk will
segfault. It is not limited to just this, however: I have experienced
several segfaults with this signature without any intervention. Looking at
the code, though, I can only see a problem with the pruning code; I don't
see any possible issues anywhere else in sip.
My patch queues up peers to be destroyed, and they are ultimately destroyed
from the monitor thread, which should guarantee that expire_register cannot
be running at the same time. The alternative is to have expire_register
check whether the peer is still in the peer list. However, that could hurt
performance, because the check would block a lot of things on every
expire.
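
The queueing idea described above could look roughly like the sketch below.
This is not the attached patch: struct peer, peer_queue_destroy and
peer_reap_doomed are hypothetical names, and it assumes that scheduler
callbacks such as expire_register run in the same monitor thread that calls
the reap function, so the two can never overlap.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the peer structure in chan_sip. */
struct peer {
    char name[80];
    struct peer *next_doomed;   /* link in the pending-destroy queue */
};

static pthread_mutex_t doomed_lock = PTHREAD_MUTEX_INITIALIZER;
static struct peer *doomed_head = NULL;

/* May be called from any thread (for example the CLI thread handling a
 * prune). The peer is only queued here, never freed. */
void peer_queue_destroy(struct peer *p)
{
    assert(p != NULL);
    pthread_mutex_lock(&doomed_lock);
    p->next_doomed = doomed_head;
    doomed_head = p;
    pthread_mutex_unlock(&doomed_lock);
}

/* Must be called only from the monitor thread, between scheduler runs.
 * Detaches the whole queue under the lock, then frees the peers outside
 * the lock. Returns the number of peers freed. */
int peer_reap_doomed(void)
{
    struct peer *list;
    int freed = 0;

    pthread_mutex_lock(&doomed_lock);
    list = doomed_head;
    doomed_head = NULL;
    pthread_mutex_unlock(&doomed_lock);

    while (list) {
        struct peer *next = list->next_doomed;
        free(list);
        freed++;
        list = next;
    }
    return freed;
}
```

The lock only protects the queue itself, so it is held for a few pointer
assignments rather than across every expire. A real version would also have
to unlink the peer from the peer list when it is queued and deal with any
expire callback still scheduled for it.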
======================================================================
----------------------------------------------------------------------
murf - 03-19-08 18:13
----------------------------------------------------------------------
I see your point, but the PAIN, oh, the pain!
My main concerns are these:
(1) An overhaul of the code in 1.4 will be pretty intrusive. The chance
of introducing more bugs than we solve is extremely high.
(2) Next, the reason for death in this case: expire_register. In the
current code, asterisk has a concept of when to destroy objects (at least,
to a degree); why expire_register callbacks are left scheduled is probably
the true bug. Let me put it another way: the program is crashing because we
destroyed a peer, and some callback is left around that refs that peer?
Huh? I don't quite grok how or why we would 'partially' destroy an object,
and think it'd be OK for the struct to live a while longer because
something else is referencing it... especially a callback.... I have to
analyze this issue, and see how/why this is happening... The real answer
might not be the patch, but rather to re-arrange the code in
expire_register() so it isn't in the path to peer destruction...
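
One way to read the concern above is that a peer should never be freed
while a callback pointing at it is still scheduled. The toy sketch below
illustrates that ordering only. It does not use the Asterisk scheduler API;
every toy_* name is hypothetical, and it is single-threaded, so it says
nothing about the locking that the cancel and the run would need in the
real code.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_SCHED 8

typedef int (*sched_cb)(void *data);

/* Toy scheduler: a fixed table of pending one-shot callbacks. */
struct sched_entry { sched_cb cb; void *data; };
static struct sched_entry sched_table[MAX_SCHED];

/* Schedule a callback; returns its id, or -1 if the table is full. */
int toy_sched_add(sched_cb cb, void *data)
{
    for (int i = 0; i < MAX_SCHED; i++) {
        if (!sched_table[i].cb) {
            sched_table[i].cb = cb;
            sched_table[i].data = data;
            return i;
        }
    }
    return -1;
}

/* Cancel a pending callback so it can no longer fire. */
void toy_sched_del(int id)
{
    if (id >= 0 && id < MAX_SCHED) {
        sched_table[id].cb = NULL;
        sched_table[id].data = NULL;
    }
}

/* Run and clear every pending callback; returns how many ran. */
int toy_sched_runq(void)
{
    int ran = 0;
    for (int i = 0; i < MAX_SCHED; i++) {
        if (sched_table[i].cb) {
            sched_cb cb = sched_table[i].cb;
            void *data = sched_table[i].data;
            sched_table[i].cb = NULL;
            sched_table[i].data = NULL;
            cb(data);
            ran++;
        }
    }
    return ran;
}

struct toy_peer { int expire_id; int expired; };

/* Stand-in for expire_register: dereferences the peer it was given. */
int toy_expire_register(void *data)
{
    struct toy_peer *p = data;
    p->expire_id = -1;
    p->expired = 1;
    return 0;
}

struct toy_peer *toy_peer_new(void)
{
    struct toy_peer *p = calloc(1, sizeof(*p));
    if (p)
        p->expire_id = toy_sched_add(toy_expire_register, p);
    return p;
}

/* Cancel the pending expire before freeing, so no callback is left
 * holding a pointer to freed memory. */
void toy_peer_destroy(struct toy_peer *p)
{
    if (p->expire_id > -1) {
        toy_sched_del(p->expire_id);
        p->expire_id = -1;
    }
    free(p);
}
```

If toy_peer_destroy skipped the toy_sched_del call, the next
toy_sched_runq would call toy_expire_register on freed memory, which is the
kind of crash this issue describes.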
Issue History
Date Modified   Username       Field                    Change
======================================================================
03-19-08 18:13  murf           Note Added: 0084300
======================================================================