[asterisk-dev] [Code Review] 4174: Fix race condition where identical SIP requests are processed by multiple threads.

Mark Michelson reviewboard at asterisk.org
Fri Nov 14 08:24:06 CST 2014


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/4174/
-----------------------------------------------------------

(Updated Nov. 14, 2014, 8:24 a.m.)


Status
------

This change has been marked as submitted.


Review request for Asterisk Developers.


Changes
-------

Committed in revision 427841


Repository: Asterisk


Description
-----------

During testing, an odd situation was encountered. In the test, a phone sends an INVITE to Asterisk. Half a second later, having received no response, the phone retransmits the INVITE. At about this time, Asterisk starts to process both incoming INVITEs simultaneously in separate threads.

In thread 1, a dialog is successfully created, a 100 Trying response is sent, and the call is sent into the dialplan.
In thread 2, the dialog cannot be created because thread 1 has already created a transaction in PJSIP with the same details. The collision results in thread 2 sending a 500 response to the phone.

At this point, the phone has received a final error response, so it assumes the call has failed. Asterisk, however, still has a successfully created dialog, so it continues on with the call. This results in some "fun" situations. Luckily, these situations have not proven fatal for Asterisk, but they are very confusing for the people involved in the calls.

The proposed fix is to not respond to an incoming request if attempting to create a transaction for it fails with the PJ_EEXISTS error. The logic is that PJ_EEXISTS means a transaction for this request has already been created successfully elsewhere, so this copy of the request can safely be ignored.
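
The rule described above can be sketched in a few lines of C. This is only an illustration, not the committed patch: the transaction table, the ST_* status codes, the ACTION_* values, and the function names are all hypothetical stand-ins (the real status codes come from PJLIB and the real transaction layer lives in PJSIP). The sketch is also single-threaded, so it models only the outcome of the collision, not the locking that makes it possible.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins; the real status codes are defined by PJLIB. */
enum status { ST_SUCCESS, ST_EEXISTS, ST_EFAILED };

/* What the request handler decides to do with an incoming request. */
enum action { ACTION_PROCESS, ACTION_IGNORE, ACTION_SEND_500 };

#define MAX_TSX 8
#define KEY_LEN 64

/* Toy transaction table keyed by a string such as "branch|method". */
struct tsx_table {
    char keys[MAX_TSX][KEY_LEN];
    int count;
};

/* Register a transaction key; report ST_EEXISTS if it is already present. */
static enum status tsx_create(struct tsx_table *table, const char *key)
{
    int i;

    assert(table != NULL && key != NULL);
    for (i = 0; i < table->count; i++) {
        if (strcmp(table->keys[i], key) == 0) {
            return ST_EEXISTS;
        }
    }
    if (table->count == MAX_TSX || strlen(key) >= KEY_LEN) {
        return ST_EFAILED;
    }
    strcpy(table->keys[table->count++], key);
    return ST_SUCCESS;
}

/* Apply the rule: a duplicate is dropped silently instead of answered. */
static enum action handle_request(struct tsx_table *table, const char *key)
{
    switch (tsx_create(table, key)) {
    case ST_SUCCESS:
        return ACTION_PROCESS;   /* first copy: carry on with the call */
    case ST_EEXISTS:
        return ACTION_IGNORE;    /* retransmission: send nothing */
    default:
        return ACTION_SEND_500;  /* genuine failure: report an error */
    }
}
```

With an empty table, the first INVITE for a given key yields ACTION_PROCESS and an identical retransmission yields ACTION_IGNORE, so the phone never sees a 500 for a call that Asterisk is actually handling.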

After auditing the code, the only places that required changes were those that create dialogs based on incoming requests. Places that create out-of-dialog stateful responses did not need changes, because they were not reacting to errors by sending stateless responses.

The actual change implemented here is to modify ast_sip_create_dialog_uas() to take an additional parameter that reports the status returned from PJSIP when attempting to create the dialog. This way, callers can react accordingly if the dialog cannot be created. The Asterisk 12 changes are presented here. The Asterisk 13 changes are on /r/4175.
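
As a rough illustration of the calling pattern this enables, here is a minimal sketch of a dialog-creating function that reports its failure reason through an extra status parameter. It does not reproduce the real ast_sip_create_dialog_uas() signature: sketch_create_dialog_uas(), sketch_new_invite(), the DLG_* codes, and the "simulated" argument are hypothetical and exist only so the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the status PJSIP would return. */
enum dlg_status { DLG_SUCCESS, DLG_EEXISTS, DLG_EFAILED };

struct sketch_dialog {
    int id;
};

/*
 * Hypothetical dialog creator: returns NULL on failure and always stores
 * the reason in *status. The "simulated" argument stands in for whatever
 * the underlying stack would actually return.
 */
static struct sketch_dialog *sketch_create_dialog_uas(
    struct sketch_dialog *storage,
    enum dlg_status simulated,
    enum dlg_status *status)
{
    assert(status != NULL);
    *status = simulated;
    return simulated == DLG_SUCCESS ? storage : NULL;
}

/*
 * Caller side: returns the SIP response code that would be sent,
 * or 0 when no response should be sent at all.
 */
static int sketch_new_invite(enum dlg_status simulated)
{
    struct sketch_dialog storage = { 1 };
    enum dlg_status status;
    struct sketch_dialog *dlg;

    dlg = sketch_create_dialog_uas(&storage, simulated, &status);
    if (dlg != NULL) {
        return 100;  /* dialog created: 100 Trying, continue the call */
    }
    if (status == DLG_EEXISTS) {
        return 0;    /* duplicate request: stay silent */
    }
    return 500;      /* any other failure: error response */
}
```

Without the status parameter, the caller would see only a NULL dialog and could not tell a harmless duplicate from a real failure.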


Diffs
-----

  /branches/12/res/res_pjsip_session.c 427735 
  /branches/12/res/res_pjsip_pubsub.c 427735 
  /branches/12/res/res_pjsip.c 427735 
  /branches/12/include/asterisk/res_pjsip.h 427735 

Diff: https://reviewboard.asterisk.org/r/4174/diff/


Testing
-------

I ran the testsuite nominal incoming calls tests and the presence subscription tests to be sure that this change did not adversely affect them. They still pass.


Thanks,

Mark Michelson
