[asterisk-bugs] [JIRA] (ASTERISK-23719) Asterisk locks, UDP buffer overflow, 1000+ spawns of 'chan_iax2.c find_idle_thread()'

SteelPivot (JIRA) noreply at issues.asterisk.org
Tue May 6 11:13:43 CDT 2014


    [ https://issues.asterisk.org/jira/browse/ASTERISK-23719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=217950#comment-217950 ] 

SteelPivot commented on ASTERISK-23719:
---------------------------------------

I enabled DEBUG_THREADS because my previous issue was that all IAX2 peers would drop within the same minute (11.2cert, no response to QUALIFY. more specifically, not processing the packets it was receiving). The only fix for this was a restart of Asterisk (I've seen similar bugs reported in the past). I wanted to be able to provide debug info in the event that I needed to open an issue, or perhaps if I was missing something.

I first upgraded to 11.6cert2 (no DEBUG_THREADS) since there were many updates to chan_iax2 from 11.2 to 11.6, but the unreachable IAX2 peers issue persisted (note, I hadn't noticed but also wasn't looking for the large UDP queue issue).

I then recompiled with DEBUG_THREADS and began noticing the UDP queuing issue. In this case I decided that the unreachable peers were probably more an effect of other issues rather than the issue itself, thus my post here.

I will try disabling DEBUG_THREADS and see if the UDP/threading issue persists. Since I won't be able to post locks in that case, I'll do a log_rotate and post the debug log should the issue appear.

> Asterisk locks, UDP buffer overflow, 1000+ spawns of 'chan_iax2.c find_idle_thread()'
> -------------------------------------------------------------------------------------
>
>                 Key: ASTERISK-23719
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-23719
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Channels/chan_iax2
>    Affects Versions: 11.6.1
>         Environment: CentOS 6.4min
>            Reporter: SteelPivot
>            Assignee: SteelPivot
>            Severity: Critical
>         Attachments: 1399319401-core-show-locks.txt, 1399324201-backtrace-threads.txt, 1399324201-core-show-threads.txt, 1399324201-netstat.txt
>
>
> We've been experience an issue for a few months concerning IAX2 peers which has recently gotten more severe after upgrading from 11.2 to 11.6cert2.
> The initial symptom was all (100+) IAX2 peers going UNREACHABLE. However, after inspecting further it seems that what will happen is the UDP queues will sharply increase (seen by netstat -antup), the number of asterisk threads increases (to over 1000 threads in some cases), and Asterisk, of course, stops responding to inbound/outbound calls from any channel (SIP or IAX2).
> After recompiling with DEBUG_THREADS and BETTER_BACKTRACES, I discovered that issuing a "gdb -ex "thread apply all bt"...(etc) " to grab a backtrace will free up the UDP queues, and Asterisk will then become responsive again. Currently I have a script running each 5 minutes that pulls the UDP queues for asterisk processes, and upon seeing a queue above 300,000packets, I issue a "netstat -antup", "core show locks", "core show threads", and "gdb -ex "thread apply all bt" --batch asterisk `pidof asterisk` > $debugdir/$date-backtrace-threads.txt".
> I have previously increased the kernel UDP maximums in sysctl.conf, and added options for iaxthreadcount/iaxmaxthreadcount in iax.conf.
> I cannot repeat this issue at will, but it happens every hour or so (sometimes every few minutes). I have debug logs and backtraces for each occurrence.



--
This message was sent by Atlassian JIRA
(v6.2#6252)



More information about the asterisk-bugs mailing list