[asterisk-dev] [Code Review] 3368: chan_sip segfault: INTERNAL_OBJ at astobj2.c:120

Mark Michelson reviewboard at asterisk.org
Thu Mar 20 10:51:36 CDT 2014


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/3368/#review11297
-----------------------------------------------------------

Ship it!


I agree that this is a workaround more than it is the "correct" fix to be applying, but after going through the scenarios in my head, this looks like a good way to get around the problem.

- Mark Michelson


On March 17, 2014, 2:21 p.m., one47 wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/3368/
> -----------------------------------------------------------
> 
> (Updated March 17, 2014, 2:21 p.m.)
> 
> 
> Review request for Asterisk Developers.
> 
> 
> Bugs: ASTERISK-22079
>     https://issues.asterisk.org/jira/browse/ASTERISK-22079
> 
> 
> Repository: Asterisk
> 
> 
> Description
> -------
> 
> As requested, here is a workaround (not a fix) for the SIP SEGV caused by a scheduler race condition.
> 
> If a provisional keepalive is simultaneously rescheduled/cancelled, and executed by 2 parallel threads, then a scheduled event can be leaked.
> 
> The correct fix will probably involve a re-factoring of the scheduler so that a scheduled-job-reference is held by the owner, rather than just a scheduled-job-id as at present. That is outside the scope of this fix, which simply re-checks that the scheduler-id is unchanged after the lock has been obtained when running the scheduled job.
> 
> There is probably scope for doing this in several other scheduled function calls.
> 
> 
> Diffs
> -----
> 
>   /tags/1.8.25.0/channels/chan_sip.c 408955 
> 
> Diff: https://reviewboard.asterisk.org/r/3368/diff/
> 
> 
> Testing
> -------
> 
> Run on live server for several weeks.
> 
> Tested on a load-test environment with the following dialplan, which previously caused a crash in < 30 mins at 1 call per second.
> 
> exten => 900,1,NoOp(Crash Generator)
> same  =>     n,Ringing
> same  =>     n,Wait(60)
> same  =>     n,Progress
> same  =>     n,Hangup
> 
> 
> Thanks,
> 
> one47
> 
>
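
For readers following along, the re-check described in the quoted Description could look roughly like the sketch below. This is not the diff under review: struct dialog, struct keepalive_job, dialog_init, reschedule_keepalive and run_keepalive are all hypothetical names, and passing the scheduler id to the callback alongside the dialog is an assumption made purely for illustration.

```c
#include <pthread.h>

/* Hypothetical stand-in for a SIP dialog; NOT the real chan_sip structure. */
struct dialog {
    pthread_mutex_t lock;
    int keepalive_sched_id;   /* id of the currently scheduled job, -1 if none */
    int keepalives_sent;
};

/* Hypothetical callback argument: the dialog plus the id this job was scheduled under. */
struct keepalive_job {
    struct dialog *dlg;
    int sched_id;
};

static void dialog_init(struct dialog *dlg, int sched_id)
{
    pthread_mutex_init(&dlg->lock, NULL);
    dlg->keepalive_sched_id = sched_id;
    dlg->keepalives_sent = 0;
}

/* Owner side: record a new scheduler id (or -1 to cancel) while holding the lock. */
static void reschedule_keepalive(struct dialog *dlg, int new_sched_id)
{
    pthread_mutex_lock(&dlg->lock);
    dlg->keepalive_sched_id = new_sched_id;
    pthread_mutex_unlock(&dlg->lock);
}

/*
 * Scheduler side: after taking the lock, re-check that the id this job was
 * scheduled under is still the one the dialog holds. If another thread has
 * rescheduled or cancelled in the meantime, the job is stale and does nothing.
 * Returns 1 if the keepalive ran, 0 if the job was stale.
 */
static int run_keepalive(struct keepalive_job *job)
{
    struct dialog *dlg = job->dlg;
    int ran = 0;

    pthread_mutex_lock(&dlg->lock);
    if (dlg->keepalive_sched_id == job->sched_id) {
        dlg->keepalives_sent++;   /* placeholder for sending the provisional response */
        ran = 1;
    }
    pthread_mutex_unlock(&dlg->lock);

    return ran;
}
```

In this sketch a job created under id 7 runs normally while the dialog still holds id 7, and becomes a no-op once reschedule_keepalive() has replaced the id. As the Description notes, this only narrows the race; having the owner hold a reference to the scheduled job itself would be the more complete fix.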
