[asterisk-bugs] [JIRA] (ASTERISK-23719) Asterisk locks, UDP buffer overflow, 1000+ spawns of 'chan_iax2.c find_idle_thread()'

Matt Jordan (JIRA) noreply at issues.asterisk.org
Tue May 6 10:41:43 CDT 2014


    [ https://issues.asterisk.org/jira/browse/ASTERISK-23719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=217949#comment-217949 ] 

Matt Jordan commented on ASTERISK-23719:
----------------------------------------

The output of {{core show locks}} shows that everything is piled up waiting for this channel lock to be released:

{noformat}
=== Thread ID: 0x7fb6261d9700 (pbx_thread           started at [ 6865] pbx.c ast_pbx_start())
=== ---> Lock #0 (channel.c): MUTEX 3853 __ast_read chan 0x7fb60401f0d0 (1)
	main/logger.c:1599 ast_bt_get_addresses() (0x50a283+1D)
	main/lock.c:218 __ast_pthread_mutex_lock() (0x502454+C9)
	main/astobj2.c:192 __ao2_lock() (0x44c6fa+96)
	main/channel.c:3856 __ast_read()
	main/channel.c:4365 ast_read() (0x47e09a+1D)
	main/channel.c:7576 ast_generic_bridge()
	main/channel.c:8046 ast_channel_bridge() (0x4898a4+19D3)
	main/features.c:4483 ast_bridge_call() (0x4d3ea0+F80)
	apps/app_dial.c:3045 dial_exec_full()
	apps/app_dial.c:3129 dial_exec()
	main/pbx.c:1622 pbx_exec() (0x52bc19+214)
	main/pbx.c:4918 pbx_extension_helper()
	main/pbx.c:6035 ast_spawn_extension() (0x539a1f+65)
	main/pbx.c:6509 __ast_pbx_run()
	main/pbx.c:6840 pbx_thread()
	main/utils.c:1162 dummy_start()
	pthread_create.c:0 start_thread()
	:0 __clone() (0x7fb6982cab00+6D)
{noformat}

There isn't a code path in {{ast_read}} that would leave this lock held. This appears to be an issue with {{DEBUG_THREADS}} causing locks to not be released.

Does this issue persist if you disable {{DEBUG_THREADS}}?

> Asterisk locks, UDP buffer overflow, 1000+ spawns of 'chan_iax2.c find_idle_thread()'
> -------------------------------------------------------------------------------------
>
>                 Key: ASTERISK-23719
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-23719
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Channels/chan_iax2
>    Affects Versions: 11.6.1
>         Environment: CentOS 6.4min
>            Reporter: SteelPivot
>            Severity: Critical
>         Attachments: 1399319401-core-show-locks.txt, 1399324201-backtrace-threads.txt, 1399324201-core-show-threads.txt, 1399324201-netstat.txt
>
>
> We've been experiencing an issue for a few months concerning IAX2 peers, which has recently gotten more severe after upgrading from 11.2 to 11.6cert2.
> The initial symptom was all (100+) IAX2 peers going UNREACHABLE. However, on further inspection it seems that the UDP queues sharply increase (seen by netstat -antup), the number of asterisk threads increases (to over 1000 threads in some cases), and Asterisk, of course, stops responding to inbound/outbound calls from any channel (SIP or IAX2).
> After recompiling with DEBUG_THREADS and BETTER_BACKTRACES, I discovered that issuing a "gdb -ex "thread apply all bt"...(etc) " to grab a backtrace will free up the UDP queues, and Asterisk will then become responsive again. Currently I have a script running every 5 minutes that pulls the UDP queues for asterisk processes, and upon seeing a queue above 300,000 packets, I issue a "netstat -antup", "core show locks", "core show threads", and "gdb -ex "thread apply all bt" --batch asterisk `pidof asterisk` > $debugdir/$date-backtrace-threads.txt".
> I have previously increased the kernel UDP maximums in sysctl.conf, and added options for iaxthreadcount/iaxmaxthreadcount in iax.conf.
> I cannot repeat this issue at will, but it happens every hour or so (sometimes every few minutes). I have debug logs and backtraces for each occurrence.
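
For anyone trying to reproduce the reporter's monitoring, the queue check described above could be sketched roughly as follows. This is not the reporter's script, which is not attached to the issue; it is a minimal illustration that assumes the usual Linux net-tools column layout for "netstat -anup" (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, optional State, PID/Program name). The function names and the THRESHOLD constant are hypothetical. The threshold value is the one quoted in the report and is compared against whatever netstat prints in the Recv-Q column.

```python
# Hypothetical sketch: decide whether a debug capture should be triggered
# based on the Recv-Q column of UDP sockets owned by asterisk.

THRESHOLD = 300_000  # figure quoted in the report; compared against Recv-Q as printed


def asterisk_udp_queues(netstat_output, program="asterisk"):
    """Return [(local_address, recv_q)] for UDP sockets owned by `program`.

    Assumes lines shaped like:
    udp  <Recv-Q>  <Send-Q>  <Local Address>  <Foreign Address>  [State]  <PID>/<Program>
    """
    queues = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, TCP sockets, and anything too short to parse.
        if len(fields) < 6 or not fields[0].startswith("udp"):
            continue
        # The last column is PID/Program name; keep only the target program.
        if not fields[-1].endswith("/" + program):
            continue
        queues.append((fields[3], int(fields[1])))
    return queues


def needs_debug_capture(netstat_output, threshold=THRESHOLD):
    """True if any asterisk UDP socket has a receive queue above `threshold`."""
    return any(recv_q > threshold for _, recv_q in asterisk_udp_queues(netstat_output))
```

A wrapper run from cron could feed the output of "netstat -anup" to needs_debug_capture() and, when it returns True, run the capture commands listed in the description.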



--
This message was sent by Atlassian JIRA
(v6.2#6252)


