[Asterisk-code-review] res pjsip pubsub: Add more locking and checking around NOTIFY (asterisk[13])

George Joseph asteriskteam at digium.com
Tue Jun 14 13:38:16 CDT 2016


Hello Richard Mudgett, Anonymous Coward #1000019,

I'd like you to reexamine a change.  Please visit

    https://gerrit.asterisk.org/3018

to look at the new patch set (#3).

Change subject: res_pjsip_pubsub: Add more locking and checking around NOTIFY
......................................................................

res_pjsip_pubsub: Add more locking and checking around NOTIFY

Occasionally under load we'll attempt to send a final NOTIFY on a
subscription that's already been terminated and a SEGV will occur
down in pjproject's evsub_destroy function.  This is a result of a
race condition between all the paths that can generate a notify
and/or destroy the underlying pjproject evsub object:

 * The client can send a SUBSCRIBE with Expires: 0.
 * The client can send a SUBSCRIBE/refresh.
 * The subscription timer can expire.
 * An extension state can change.
 * An MWI event can be generated.
 * The pjproject transaction timer (timer_b) can expire.

Unfortunately, we're not consistently locking the dialog or serializing
the sending of the NOTIFY and since the above paths can be taken by
any number of threads, occasionally 2 of the paths will interleave
and cause the destruction of the pjproject evsub object before its
time.  Understanding the exact path that causes the SEGV is not practical
given time constraints but the following changes are reasonable and do
prevent the SEGV.

Changes:
 * ast_sip_subscription_notify now always serializes the send_notify
   task instead of calling it directly in the callers thread.  This
   is the entry point for extension state and mwi changes.
 * Contrary to the comments, the dialog may NOT be locked when
   our pubsub_on_evsub_state callback is called.  We now lock it
   ourselves before checking the last_notify flag and we now set the
   last_notify flag directly after the test.  This handles client
   initiated termination.
 * The dialog is also not locked when our pubsub_on_rx_refresh is
   called so we now lock it ourselves.
 * The dialog IS locked when our pubsub_on_server_timeout is called
   but it wasn't checking or setting the last_notify flag, which it
   does now.

ASTERISK-26099 #close
Reported-by: Ross Beer.

Change-Id: I779d11802cf672a51392e62a74a1216596075ba1
---
M res/res_pjsip_pubsub.c
1 file changed, 80 insertions(+), 34 deletions(-)


  git pull ssh://gerrit.asterisk.org:29418/asterisk refs/changes/18/3018/3
-- 
To view, visit https://gerrit.asterisk.org/3018
To unsubscribe, visit https://gerrit.asterisk.org/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I779d11802cf672a51392e62a74a1216596075ba1
Gerrit-PatchSet: 3
Gerrit-Project: asterisk
Gerrit-Branch: 13
Gerrit-Owner: George Joseph <gjoseph at digium.com>
Gerrit-Reviewer: Anonymous Coward #1000019
Gerrit-Reviewer: George Joseph <gjoseph at digium.com>
Gerrit-Reviewer: Richard Mudgett <rmudgett at digium.com>



More information about the asterisk-code-review mailing list