[Asterisk-code-review] res pjsip pubsub: Address SEGV when attempting to terminate ... (asterisk[13])

George Joseph asteriskteam at digium.com
Fri Jun 17 09:32:42 CDT 2016


Hello Mark Michelson, Richard Mudgett, Anonymous Coward #1000019,

I'd like you to reexamine a change.  Please visit

    https://gerrit.asterisk.org/3018

to look at the new patch set (#5).

Change subject: res_pjsip_pubsub: Address SEGV when attempting to terminate a subscription
......................................................................

res_pjsip_pubsub: Address SEGV when attempting to terminate a subscription

Occasionally under load we'll attempt to send a final NOTIFY on a
subscription that's already been terminated and a SEGV will occur
down in pjproject's evsub_destroy function.  This is a result of a
race condition between all the paths that can generate a notify
and/or destroy the underlying pjproject evsub object:

 * The client can send a SUBSCRIBE with Expires: 0.
 * The client can send a SUBSCRIBE/refresh.
 * The subscription timer can expire.
 * An extension state can change.
 * An MWI event can be generated.
 * The pjproject transaction timer (timer_b) can expire.

Normally when our pubsub_on_evsub_state is called with a terminate,
we push a task to the serializer and return at which point the dialog
is unlocked.  This is usually not a problem because the task runs
immediately and locks the dialog again.  When the system is heavily
loaded though, there may be a delay between the unlock and relock
during which another event may occur such as the subscription timer
or timer_b expiring, an extension state change, etc.  These may also
cause a terminate to be processed and if so, we could cause pjproject
to try to destroy the evsub structure twice.  There's no way for us to
tell that the evsub was already destroyed and the evsub's group lock
can't tolerate this and SEGVs.

The remedy is twofold.

 * A patch has been submitted to Teluu and added to the bundled
   pjproject which sets a new state PJSIP_EVSUB_STATE_DESTROYED
   when the evsub structure has been destroyed.  "Destroyed" in this
   case only means that the last reference to the group lock has been
   released and the evsub structure has been removed from it's parent
   dialog.  No memory is actually freed.
 * res_pjsip_pubsub now tests for the new state before performing any
   actions.  Since this is a new pjproject feature that we can't depend
   on res_pjsip_pubsub also keeps better track of whether it's
   already processing a terminate.  While the better tracking can't
   prevent all the SEGVs by itself, it greatly reduces the chances
   if the PJSIP_EVSUB_STATE_DESTROYED capability isn't available.

ASTERISK-26099 #close
Reported-by: Ross Beer.

Change-Id: I779d11802cf672a51392e62a74a1216596075ba1
---
M configure
M configure.ac
M include/asterisk/autoconfig.h.in
M res/res_pjsip_pubsub.c
M third-party/pjproject/configure.m4
A third-party/pjproject/patches/0001-evsub-Add-state-PJSIP_EVSUB_STATE_DESTROYED.patch
6 files changed, 328 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.asterisk.org:29418/asterisk refs/changes/18/3018/5
-- 
To view, visit https://gerrit.asterisk.org/3018
To unsubscribe, visit https://gerrit.asterisk.org/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I779d11802cf672a51392e62a74a1216596075ba1
Gerrit-PatchSet: 5
Gerrit-Project: asterisk
Gerrit-Branch: 13
Gerrit-Owner: George Joseph <gjoseph at digium.com>
Gerrit-Reviewer: Anonymous Coward #1000019
Gerrit-Reviewer: George Joseph <gjoseph at digium.com>
Gerrit-Reviewer: Mark Michelson <mmichelson at digium.com>
Gerrit-Reviewer: Richard Mudgett <rmudgett at digium.com>



More information about the asterisk-code-review mailing list