[asterisk-users] Asterisk SIP deadlocks - update_provisional_keepalive

Duane Larson duane.larson at gmail.com
Sat Apr 6 15:25:45 CDT 2013


Looks like version 11.3 did not fix my issue.

http://pastebin.com/gd291Bqz


On Thu, Apr 4, 2013 at 1:23 PM, Duane Larson <duane.larson at gmail.com> wrote:

> Thanks Jim.  Searched through the change log for "deadlock" but nothing
> really stuck out.  I'll upgrade to 11.3 and see if that makes a difference.
>
>
> On Thu, Apr 4, 2013 at 10:59 AM, Jim Lucas <lists at cmsws.com> wrote:
>
>> On 04/03/2013 08:15 PM, Duane Larson wrote:
>>
>>> So it just happened again on both machines at the same time, and I was
>>> running debug on both servers.  I am running OpenSIPS and load balancing
>>> between both servers, so I am guessing that when the INVITE was sent to
>>> the first server, that server was frozen for some reason.  OpenSIPS then
>>> sent the INVITE to the second server, and that server was also
>>> frozen/deadlocked because of the SIP message.  On both servers, the last
>>> line logged while Asterisk was deadlocked was the following:
>>>
>>>
>>> Asterisk version 11.0.1
>>> [Apr  3 21:39:42] DEBUG[12984] res_timing_timerfd.c: Expected to acknowledge 1 ticks but got 11805 instead
>>>
>>> Asterisk version 11.2.1
>>> [Apr  3 21:39:50] DEBUG[1854] res_timing_timerfd.c: Expected to acknowledge 1 ticks but got 12423 instead
>>>
>>>
>>> In my last email I posted the debug from the Asterisk server running
>>> version 11.0.1.  Here is the debug from the Asterisk server running
>>> version 11.2.1:
>>>
>>> http://pastebin.com/mbjSSAWM
>>>
>>>
>>> This has to be a bug, right?  I am thinking of opening an issue on the
>>> Asterisk JIRA system.
>>>
>>>
>> A number of deadlocks were fixed in the current release of 11.3.  Please
>> read the change log to see if any fit your issue.
>>
>> http://downloads.asterisk.org/pub/telephony/asterisk/ChangeLog-11-current
>>
>>
>>
>>>
>>> On Wed, Apr 3, 2013 at 4:45 PM, Duane Larson <duane.larson at gmail.com> wrote:
>>>
>>>> It just happened again on the 11.0.1 box and I was able to grab a debug.
>>>> I am hoping someone can tell me if this is a bug or something wrong
>>>> with my config.
>>>>
>>>> gdb asterisk-bin/sbin/asterisk 29048
>>>>
>>>> Go here for the debug output
>>>> http://pastebin.com/DGXx0BSk
>>>>
>>>>
>>>> On Tue, Apr 2, 2013 at 7:42 PM, Duane Larson <duane.larson at gmail.com> wrote:
>>>>
>>>>> I am currently running two different versions of Asterisk:
>>>>>
>>>>> 11.0.1
>>>>> 11.2.1
>>>>>
>>>>> I have noticed the bug occur on both servers.
>>>>>
>>>>> The issue is that when I try to dial a phone number, sometimes the call
>>>>> never goes out.  When I check the Asterisk server with ngrep, I see
>>>>> that the SIP messages are reaching Asterisk but Asterisk isn't
>>>>> responding.
>>>>>
>>>>> When I run "netstat -nap |grep 5060", I see that Asterisk has a lot
>>>>> queued under the "Recv-Q" column.
>>>>>
>>>>> It usually takes about 10 minutes before Asterisk becomes responsive
>>>>> again.  If I restart Asterisk before the 10 minutes are up, everything
>>>>> goes back to normal.
>>>>>
>>>>> I see in the message logs the following errors
>>>>>
>>>>> On the 11.0.1 Asterisk server
>>>>> WARNING[23723][C-00000010] chan_sip.c: Unable to cancel schedule ID 11473.  This is probably a bug (chan_sip.c: update_provisional_keepalive, line 4406).
>>>>>
>>>>> On the 11.2.1 Asterisk server
>>>>> WARNING[3493][C-0000001f] chan_sip.c: Unable to cancel schedule ID 30810.  This is probably a bug (chan_sip.c: update_provisional_keepalive, line 4683).
>>>>>
>>>>>
>>>>> When I look in chan_sip.c on both servers, I see that both line
>>>>> numbers point to the same line of code:
>>>>>
>>>>> AST_SCHED_DEL_UNREF(sched, pvt->provisional_keepalive_sched_id,
>>>>> dialog_unref(pvt, "when you delete the provisional_keepalive_sched_id,
>>>>> you should dec the refcount for the stored dialog ptr"));
>>>>>
>>>>>
>>>>>
>>>>> What could be causing this?  It seems to happen at least once a day.
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> --
>>>> *--*--*--*--*--*
>>>> Duane
>>>> *--*--*--*--*--*
>>>> --
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>> --
>> Jim Lucas
>>
>> http://www.cmsws.com/
>> http://www.cmsws.com/examples/
>>
>
>
>
> --
> --
> *--*--*--*--*--*
> Duane
> *--*--*--*--*--*
> --
>



-- 
--
*--*--*--*--*--*
Duane
*--*--*--*--*--*
--
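
For anyone following this thread, the Recv-Q check described in the
original report ("netstat -nap |grep 5060") can be scripted so a stalled
SIP socket is easier to spot. The following is only a minimal sketch. It
assumes the usual Linux net-tools column layout (Proto, Recv-Q, Send-Q,
Local Address, ...). The function name backlog_on_port and the sample text
are made up for illustration and are not output from the servers discussed
above.

```python
def backlog_on_port(netstat_output, port=5060):
    """Return (proto, recv_q, local_address) for every socket bound to
    `port` whose Recv-Q is non-zero.

    Assumes `netstat -nap` style rows: Proto Recv-Q Send-Q Local-Address ...
    """
    hits = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Only tcp/udp rows are socket entries; this skips headers,
        # blank lines and the UNIX-domain socket section.
        if len(fields) < 5 or not fields[0].startswith(("tcp", "udp")):
            continue
        proto, recv_q, local = fields[0], fields[1], fields[3]
        if not recv_q.isdigit():
            continue
        # The local address ends in ":<port>" for both IPv4 and IPv6.
        if local.rsplit(":", 1)[-1] != str(port):
            continue
        if int(recv_q) > 0:
            hits.append((proto, int(recv_q), local))
    return hits


# Hypothetical sample text, NOT captured from the servers in this thread.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
udp   213504      0 0.0.0.0:5060            0.0.0.0:*                           1234/asterisk
udp        0      0 0.0.0.0:4569            0.0.0.0:*                           1234/asterisk
"""

if __name__ == "__main__":
    for proto, recv_q, local in backlog_on_port(SAMPLE):
        print(f"{proto} {local}: {recv_q} bytes waiting in Recv-Q")
```

A Recv-Q that stays non-zero on port 5060 means the kernel is holding
packets that the process has not read yet, which matches the symptom
described in the original report. It does not say why the process stopped
reading.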