[Asterisk-code-review] res pjsip pubsub: Add more locking and checking around termi... (asterisk[13])
George Joseph
asteriskteam at digium.com
Sun Jun 12 13:31:40 CDT 2016
George Joseph has uploaded a new patch set (#2).
Change subject: res_pjsip_pubsub: Add more locking and checking around terminates
......................................................................
res_pjsip_pubsub: Add more locking and checking around terminates
A subsciption can be terminated by either a SUBSCRIBE with Expires = 0
or by the subscription timer expiring. In heavily loaded systems
it's possible for the timer to fire at the same time we get a SUBSCRIBE
from the client either to terminate or refresh the subscription.
Client --> on_evsub_state push --|
serialized_pubsub_on_server_timeout
Timer --> on_server_timeout push --|
Both paths result in the pushing of a task to the subscription's
serializer, but on_server_timeout wasn't checking the last_notify flag
to see if a client-requested terminate was already in progress and
although on_evsub_state was checking it, it wasn't setting it.
This could result in 2 terminate tasks being placed on the serializer.
Since the last_notify flag was set only after the NOTIFY was sent
and the terminate task wasn't checking the flag either, both tasks
could run. Additionally, some of the other pubsub tasks weren't
locking the dialog or checking last_notify so they could also be
running while pjproject was running one of the callbacks in another
thread.
In the end, there just wasn't enough locking and state checking and
we were seeing SEGVs in the pjproject code usually related to the dialog
being invalid.
Some of the checking in this patch is probably overkill but they're
simple boolean or null pointer checks so they're cheap. Also some of
the debug levels in existing code were updated.
Changes:
* Debug levels were changed so 1 is infrequent possible errors and
3 is high-volume informational messages.
* The evsub_state and server_timeout pjproject callbacks now increment
last_notify while holding the dialog lock but before pushing the
terminate task on the serializer.
* The server_timeout callback now checks last_notify before pushing
the terminate task.
* The other callbacks and tasks now check that evsub, dlg and
last_notify are valid before doing anything and the tasks now lock
the dialog.
* The serialized_pubsub_on_server_timeout task was renamed to
serialized_pubsub_terminate and now checks the last_notify flag. If
it's > 1, then somehow 2 terminate tasks were pushed so this task
decrements the flag and returns. The second task will now see the
flag as 1 and will process.
ASTERISK-26099 #close
Reported-by: Ross Beer.
Change-Id: I779d11802cf672a51392e62a74a1216596075ba1
---
M res/res_pjsip_pubsub.c
1 file changed, 121 insertions(+), 46 deletions(-)
git pull ssh://gerrit.asterisk.org:29418/asterisk refs/changes/18/3018/2
--
To view, visit https://gerrit.asterisk.org/3018
To unsubscribe, visit https://gerrit.asterisk.org/settings
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I779d11802cf672a51392e62a74a1216596075ba1
Gerrit-PatchSet: 2
Gerrit-Project: asterisk
Gerrit-Branch: 13
Gerrit-Owner: George Joseph <gjoseph at digium.com>
More information about the asterisk-code-review
mailing list