[asterisk-dev] [Code Review] 2594: IAX2: fix race condition when transferrring.

Matt Jordan reviewboard at asterisk.org
Fri Jun 7 22:34:19 CDT 2013


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/2594/#review8864
-----------------------------------------------------------



branches/11/channels/chan_iax2.c
<https://reviewboard.asterisk.org/r/2594/#comment17410>

    I'm not sure I understand why this change was made.
    
    I can see the race condition without first obtaining the lock; however, why should the behavior of TRANSFER_READY be changed? With this code change, a TRANSFER_READY doesn't transition to TRANSFER_MEDIA. Additionally, we don't handle the 'any other state is a release' condition any longer.
    
    This change may be completely correct, but based on the problem description, I'm not sure why it's needed or what other side effects it may have.


- Matt Jordan


On June 5, 2013, 10:52 a.m., Alec Davis wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/2594/
> -----------------------------------------------------------
> 
> (Updated June 5, 2013, 10:52 a.m.)
> 
> 
> Review request for Asterisk Developers.
> 
> 
> Bugs: ASTERISK-21409
>     https://issues.asterisk.org/jira/browse/ASTERISK-21409
> 
> 
> Repository: Asterisk
> 
> 
> Description
> -------
> 
> 1). When touching the bridgecallno, we need to lock it.
> 
> 2). stop_stuff() which calls iax2_destroy_helper()
>     Assumes the lock on the pvt is already held, when iax2_destroy_helper() is called.
>     Thus we need to lock the bridgecallno pvt before we call stop_stuff(iaxs[fr->callno]->bridgecallno);
> 
> 3). When evaluating the state of 'callno->transferring' of the current leg, we can't change it to READY unless the bridgecallno is locked.
>     Why, if we are interrupted by the other call leg before 'transferring = TRANSFER_RELEASED', the interrupt will find that it is READY and that the bridgecallno is also READY so Releases the legs.
>     Then the first call leg in this example, finishes execution, and Releases the legs AGAIN!!!!!!!
>     Interleaving thread execution gets interesting as well - see timeline below from June 5. 
> 
> Debug captures when it went wrong.
> [May 31 14:44:01] VERBOSE[30820] chan_iax2.c:     -- Channel 'IAX2/auckland-13262' ready to transfer
> [May 31 14:44:01] VERBOSE[30824][C-00000536] chan_iax2.c:     -- Channel 'IAX2/auckland-20457' ready to transfer
> [May 31 14:44:01] VERBOSE[30824][C-00000536] chan_iax2.c:     -- Releasing IAX2/auckland-20457 and IAX2/auckland-13262
> [May 31 14:44:01] VERBOSE[30820] chan_iax2.c:     -- Releasing IAX2/auckland-13262 and IAX2/auckland-20457
> [May 31 14:44:01] DEBUG[30824][C-00000536] sched.c: Attempted to delete nonexistent schedule entry 209951!
> [May 31 14:44:01] ERROR[30824][C-00000536] lock.c: chan_iax2.c line 1918 (iax2_destroy_helper): mutex '&iaxsl[pvt->callno]' freed more times than we've locked!
> [May 31 14:44:01] ERROR[30824][C-00000536] lock.c: chan_iax2.c line 1918 (iax2_destroy_helper): Error releasing mutex: Operation not permitted
> 
> [Jun  5 19:53:43] VERBOSE[25606][C-00000000] chan_iax2.c:     -- Channel 'IAX2/auckland-19065' ready to transfer
> [Jun  5 19:53:43] VERBOSE[25604][C-00000000] chan_iax2.c:     -- Channel 'IAX2/auckland-20047' ready to transfer
> [Jun  5 19:53:43] VERBOSE[25606][C-00000000] chan_iax2.c:     -- Releasing IAX2/auckland-19065 and IAX2/auckland-20047
> [Jun  5 19:53:43] VERBOSE[25604][C-00000000] chan_iax2.c:     -- Releasing IAX2/auckland-20047 and IAX2/auckland-19065
> [Jun  5 19:53:43] VERBOSE[25606][C-00000000] chan_iax2.c:     -- Channel 'IAX2/auckland-19065' finished transfer
> [Jun  5 19:53:43] DEBUG[25604][C-00000000] sched.c: Attempted to delete nonexistent schedule entry 17!
> [Jun  5 19:53:43] VERBOSE[25604][C-00000000] chan_iax2.c:     -- Channel 'IAX2/auckland-20047' finished transfer
> 
> 
> A time line of 2 threads interleaving that shows how the Jun 5 capture may have come about.
> The execution path seems to switch threads as we print/log data.
> thread 25606                thread 25604
> ======================      =======================
> if US == BEGIN
>    US = READY
>    "ready to transfer"
>                             if US == BEGIN
>                                US = READY
>                                "ready to transfer"
>     if THEY == READY
>        "Releasing .."
>                                if THEY == READY
>                                   "Releasing .."
>        THEY = RELEASED
>        US = RELEASED
>        stopstuff(US)
>        stopstuff(THEM)
>        "finished transfer"
> 
>                                    THEY = RELEASED
>                                    US = RELEASED
>                                    stopstuff(US)
>                                    stopstuff(THEM) (sched.c: Attempted to delete nonexistent schedule entry 17!)
>                                    "finished transfer"
> 
> 
> Diffs
> -----
> 
>   branches/11/channels/chan_iax2.c 387927 
> 
> Diff: https://reviewboard.asterisk.org/r/2594/diff/
> 
> 
> Testing
> -------
> 
> Yes. Between sites. many test calls and real calls being transferred back down the same trunk
> 
> 
> Thanks,
> 
> Alec Davis
> 
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-dev/attachments/20130608/978c097a/attachment.htm>


More information about the asterisk-dev mailing list