[asterisk-bugs] [JIRA] (ASTERISK-25911) IAX Max Retries occasionally

Andreas Krüger (JIRA) noreply at issues.asterisk.org
Mon Apr 11 08:04:56 CDT 2016


Andreas Krüger created ASTERISK-25911:
-----------------------------------------

             Summary: IAX Max Retries occasionally
                 Key: ASTERISK-25911
                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-25911
             Project: Asterisk
          Issue Type: Bug
      Security Level: None
          Components: pjproject/pjsip
    Affects Versions: 13.7.2
         Environment: Ubuntu server
            Reporter: Andreas Krüger
            Severity: Critical


Hi there,

We have run into a problem: when there is some, but not high, load on some of our Asterisk servers, we suddenly see an IAX max retries error in the console.
When this happens, everything stops working and we cannot get Asterisk to work again unless we restart the service (not the server).

I tried to start Asterisk through GDB, but since Asterisk never crashes, GDB has nothing to show about the problem.
I've also set up a monitoring tool to check for network glitches, and none have occurred either.
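Since the process hangs rather than crashes, one option might be to attach GDB to the already running process and dump the backtraces of all threads while the error is looping. The following is only a minimal sketch, assuming that gdb is installed and that the process is named "asterisk"; the helper name bt_command, the RUN switch and the output path are made up for illustration.

```shell
#!/bin/sh
# Sketch: dump backtraces of all threads from a running (hung, not crashed) process.
# Assumptions: gdb is installed, the process is named "asterisk",
# and /tmp/asterisk-threads.txt is an arbitrary output path.

# bt_command PID  (hypothetical helper)
# Prints the gdb command line for the given PID so it can be reviewed first.
bt_command() {
    pid="$1"
    printf 'gdb -p %s -batch -ex "thread apply all bt"\n' "$pid"
}

# Only attach when explicitly asked to, e.g.: RUN=1 sh dump_threads.sh
if [ "${RUN:-0}" = "1" ]; then
    pid="$(pidof asterisk)" || { echo "asterisk is not running" >&2; exit 1; }
    eval "$(bt_command "$pid")" > /tmp/asterisk-threads.txt 2>&1
fi
```

Attaching pauses the process while the backtraces are collected, so this is best done once the problem has already occurred.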

I've also tried increasing the max retries in chan_iax2.c and recompiling Asterisk, as I've read on some forums that this should resolve the issue, but it does not.

{code}
sed -i "s/static int max_retries = 4;/static int max_retries = 12;/" channels/chan_iax2.c
{code}
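
Because sed exits successfully even when the pattern matches nothing, it may be worth confirming that the substitution really changed the file before recompiling. A minimal sketch, assuming GNU sed (as on Ubuntu) and that the declaration reads exactly as in the command above; patch_retries is a hypothetical helper, not part of Asterisk.

```shell
#!/bin/sh
# Sketch: apply the max_retries change and confirm that it actually matched.
# Assumption: the declaration in the source file reads exactly
# "static int max_retries = 4;" as in the sed command above.

# patch_retries FILE NEW_VALUE  (hypothetical helper)
# Returns 0 only if FILE contains the new declaration afterwards.
patch_retries() {
    file="$1"
    value="$2"
    sed -i "s/static int max_retries = 4;/static int max_retries = ${value};/" "$file"
    # sed exits 0 even when nothing matched, so check the result explicitly.
    grep -q "static int max_retries = ${value};" "$file"
}

# Example:
#   patch_retries channels/chan_iax2.c 12 || echo "pattern not found" >&2
```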

I've attached the console output we see. These messages just keep popping up and do not seem to end. To me this looks like some cleanup in chan_iax2.c is not working when the max retries limit is reached. The error we're facing happens on this line:

https://github.com/asterisk/asterisk/blob/13.7/channels/chan_iax2.c#L3572

I could use some advice on how to debug this problem further and resolve it, because when this error happens, Asterisk does not work at all until it is restarted.

The problem is not persistent and I have a hard time reproducing it, but we see it when the load increases. Doing 10k calls within 7 hours seems to make it happen.
I looked into the code and see that it uses a reference to a callno, which to me looks like a counter that increases. Could we be seeing some sort of race condition, or is the callno perhaps running out of scope?





