[asterisk-bugs] [JIRA] (ASTERISK-25911) chan_iax2: IAX Max Retries - hung IAX channels in Ring state - cannot clear channels until Asterisk restart

Jeppe Ryskov Larsen (JIRA) noreply at issues.asterisk.org
Thu Aug 25 07:58:56 CDT 2016


    [ https://issues.asterisk.org/jira/browse/ASTERISK-25911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=231955#comment-231955 ] 

Jeppe Ryskov Larsen commented on ASTERISK-25911:
------------------------------------------------

Today we have been testing some changes on both our 13.9.0 and our 13.11.0-rc1 environments.

We noticed from our logs that, many of the times we hit the "max retries" stall, some or all of the active channels were stuck in the NoOp application. We also read somewhere that it can help to turn off or minimize logging in logger.conf. I am starting to think that something unrelated to iax2 is the root cause and is making iax2 behave oddly.
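
For anyone who wants to check for the same pattern on a live system, here is a rough sketch of how the channels sitting in NoOp could be counted. The function name is made up for illustration, and it assumes the application name shows up as plain text in the output of "core show channels verbose"; a dialplan argument that happens to contain the string "NoOp" would be counted too.

```shell
# Hypothetical helper (not part of Asterisk): count the lines that mention
# the NoOp application in "core show channels verbose" output read from stdin.
count_noop_channels() {
    # grep -c prints the number of matching lines; "|| true" keeps the
    # exit status at 0 when the count is zero.
    grep -c 'NoOp' || true
}

# Usage on a live system (needs access to the Asterisk CLI):
#   asterisk -rx "core show channels verbose" | count_noop_channels
```

Running this a few times while the stall is in progress would show whether the count stays constant, which is what stuck channels would look like.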

So we removed all of the NoOps from our dialplan, and we now log only verbose,error,warning instead of verbose,error,warning,debug,fax,dtmf.
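
As a sketch of how the logging change can be double-checked: the helper below prints the active (uncommented) log channel lines from a logger.conf-style file. The helper name, the path /etc/asterisk/logger.conf and the log file name "full" are assumptions for illustration, not details taken from our setup.

```shell
# Hypothetical helper (not part of Asterisk): print the active log channel
# definitions, i.e. uncommented lines of the form "name => level,level,...".
show_log_channels() {
    # Skip lines whose first non-blank character is ';' (a comment) and
    # keep only lines that contain "=>" before any ';'.
    grep -E '^[[:space:]]*[^;[:space:]][^;]*=>' "$1"
}

# Usage (path assumed):
#   show_log_channels /etc/asterisk/logger.conf
# After the change, a line such as this one would be expected:
#   full => verbose,error,warning
# To apply the edit and confirm what is active:
#   asterisk -rx "logger reload"
#   asterisk -rx "logger show channels"
```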

On a normal day we would by this point (14:53; clients start using the system at 08:00) have seen the error at least once or twice, but so far today we have not seen it.

I will report back if this is bogus and we see it again. After all, these are just random and desperate attempts at fixing this.

> chan_iax2: IAX Max Retries - hung IAX channels in Ring state - cannot clear channels until Asterisk restart
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: ASTERISK-25911
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-25911
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Channels/chan_iax2
>    Affects Versions: 13.7.2, 13.9.0, 13.9.1
>         Environment: Ubuntu server
>            Reporter: Andreas Krüger
>            Assignee: Unassigned
>         Attachments: 01-08-2016-backtrace-threads.txt, 01-08-2016-core-show-channels-infos.txt, 01-08-2016-core-show-channels.txt, 01-08-2016-core-show-locks.txt, 01-08-2016-iax2-show-channels.txt, 01-08-2016-iax2-show-netstats.txt, 2016-06-15-backtrace-threads.txt, 2016-06-15-core-show-channels-infos.txt, 2016-06-15-core-show-channels.txt, 2016-06-15-core-show-locks.txt, 2016-06-15-iax2-show-channels.txt, 2016-06-15-iax2-show-netstats.txt, 2016-06-16-backtrace-threads.txt, 2016-06-16-core-show-channels-infos.txt, 2016-06-16-core-show-channels.txt, 2016-06-16-core-show-locks.txt, 2016-06-16-iax2-show-channels.txt, 2016-06-16-iax2-show-netstats.txt, 2016-06-17-backtrace-threads.txt, 2016-06-17-core-show-channels-infos.txt, 2016-06-17-core-show-channels.txt, 2016-06-17-core-show-locks.txt, 2016-06-17-iax2-show-channels.txt, 2016-06-17-iax2-show-netstats.txt, 23-08-2016-backtrace-threads.txt, 23-08-2016-cli-output-full.txt, 23-08-2016-core-show-channels-infos.txt, 23-08-2016-core-show-channels.txt, 23-08-2016-core-show-locks.txt, 23-08-2016-iax2-show-channels.txt, 23-08-2016-iax2-show-netstats.txt, backtrace-threads.txt, core-show-channels-infos.txt, core-show-channels.txt, debug_log_25911_odn1-voip-cluster02-asterisk01, debug_log_25911_odn1-voip-cluster02-upstream01, iax2-show-channels.txt, iax2-show-netstats.txt, iax.conf, upload (1).png
>
>
> Hi there,
> We ran into a problem: when there is some, but not high, load on some of our Asterisk servers, we suddenly see an IAX max retries error in the console.
> When this happens, everything stops working and we cannot get Asterisk to work again unless we restart the service (not the server).
> I tried to start Asterisk through GDB, but since Asterisk never crashes, GDB has nothing to show about the problem.
> I've also set up a monitoring tool to check for network glitches, and none have occurred.
> I've also tried to increase the max retries in chan_iax2.c and recompile Asterisk, as I've read on some forums that this should resolve the issue, but it does not.
> {code}
> sed -i "s/static int max_retries = 4;/static int max_retries = 12;/" channels/chan_iax2.c
> {code}
> I've attached the console output we see. These messages just keep popping up and do not seem to end. To me this looks like some cleanup is not working in chan_iax2.c when the max retries limit is hit. The error we're facing happens on this line:
> https://github.com/asterisk/asterisk/blob/13.7/channels/chan_iax2.c#L3572
> I could use some advice on how to debug this problem further and resolve it, because when this error happens, Asterisk does not work at all until it is restarted.
> The problem is intermittent and I have a hard time reproducing it, but we see it when the load increases. Doing 10k calls within 7 hours seems to make it happen.
> I looked into the code and see that it uses a reference to a callno, which to me looks like a counter that increases. Could we be seeing some sort of race condition, or maybe the callno runs out of scope?





