[asterisk-bugs] [JIRA] (ASTERISK-25046) Chan_sip deadlock persistent between Asterisk 1.8 versions, using Realtime.
Maciej Maurycy Bonin (JIRA)
noreply at issues.asterisk.org
Fri May 1 11:00:33 CDT 2015
Maciej Maurycy Bonin created ASTERISK-25046:
-----------------------------------------------
Summary: Chan_sip deadlock persistent between Asterisk 1.8 versions, using Realtime.
Key: ASTERISK-25046
URL: https://issues.asterisk.org/jira/browse/ASTERISK-25046
Project: Asterisk
Issue Type: Bug
Security Level: None
Components: Addons/res_config_mysql, Applications/app_meetme, Applications/app_mixmonitor, Channels/chan_dahdi, Channels/chan_iax2, Channels/chan_sip/General, Resources/res_timing_dahdi, Resources/res_timing_pthread, Resources/res_timing_timerfd
Affects Versions: 1.8.32.3, 1.8.23.1
Environment: Happens on both CentOS 5 and CentOS 6, across different Asterisk versions up to the latest 1.8 release.
Reporter: Maciej Maurycy Bonin
We have been having intermittent issues with chan_sip. It stops responding to CLI requests, and reloading chan_sip from the CLI does not seem to have any effect. Calls already in progress carry on for a short period, but no new SIP requests are processed: 'sip show channels' hangs forever, and the server stops responding to SIP OPTIONS or any other SIP message. We have updated the build from 1.8.23.1 to the latest Asterisk 1.8 release (1.8.32.3), but the problem persists. We have gathered debugging information from 'core show locks' and from gdb and attached it to this message (with phone numbers, extensions, and context names obscured).
We are running Realtime under CentOS 6.6, built from source and packaged using rpmbuild, with the following menuselect options (debugging version):
menuselect/menuselect --disable BUILD_NATIVE --enable DEBUG_THREADS --enable DONT_OPTIMIZE --disable CORE-SOUNDS-EN-GSM --disable-category MENUSELECT_EXTRA_SOUNDS --disable MOH-OPSOUND-WAV --enable-category MENUSELECT_ADDONS --disable format_mp3 --disable cdr_tds --disable cel_tds --disable cdr_pgsql --disable cel_pgsql --disable res_config_pgsql menuselect.makeopts
under kernel 2.6.32-504.el6.x86_64, and linked against the following library versions:
/usr/lib64/libssl.so.10: symbolic link to `libssl.so.1.0.1e'
/usr/lib64/libcrypto.so.10: symbolic link to `libcrypto.so.1.0.1e'
/lib64/libc.so.6: symbolic link to `libc-2.12.so'
/usr/lib64/libxml2.so.2: symbolic link to `libxml2.so.2.7.6'
/lib64/libz.so.1: symbolic link to `libz.so.1.2.3'
/lib64/libm.so.6: symbolic link to `libm-2.12.so'
/lib64/libdl.so.2: symbolic link to `libdl-2.12.so'
/lib64/libpthread.so.0: symbolic link to `libpthread-2.12.so'
/lib64/libtinfo.so.5: symbolic link to `libtinfo.so.5.7'
/lib64/libresolv.so.2: symbolic link to `libresolv-2.12.so'
/lib64/libgssapi_krb5.so.2: symbolic link to `libgssapi_krb5.so.2.2'
/lib64/libkrb5.so.3: symbolic link to `libkrb5.so.3.3'
/lib64/libcom_err.so.2: symbolic link to `libcom_err.so.2.1'
/lib64/libk5crypto.so.3: symbolic link to `libk5crypto.so.3.1'
/lib64/libkrb5support.so.0: symbolic link to `libkrb5support.so.0.1'
/lib64/libkeyutils.so.1: symbolic link to `libkeyutils.so.1.3'
Since my original message to the asterisk-users mailing list, we have also tried disabling res_timing_pthread and replacing it with res_timing_timerfd; that, too, has not stopped the deadlock from recurring. Finally, we have disabled TCP entirely, which seems to have made no difference either.
We still have not been able to find the exact steps to reproduce the problem. It seems to happen at different times of day and under varying load, but usually during the hours of heaviest usage.
We have now emptied the server, but if the attached backtrace is not sufficient, we have several more available, collected during several of the most recent deadlocks.
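Because the hang has no known trigger, one way to capture its state at the moment it occurs is a small watchdog that probes the server with SIP OPTIONS and, when no reply arrives, collects the same two sources of debugging information mentioned above. The following is a minimal Python sketch under stated assumptions, not a tested production tool: the function names, the `probe` user in the From header, the timeouts, and the assumption that the `asterisk` and `gdb` binaries are on the PATH are all illustrative. Note that 'core show locks' only produces output on a build with DEBUG_THREADS enabled.

```python
import socket
import subprocess
import uuid


def build_options_request(host, port, local_ip, local_port):
    """Build a minimal SIP OPTIONS request (RFC 3261) as bytes.

    The 'probe' user and the header set are illustrative assumptions.
    """
    branch = "z9hG4bK" + uuid.uuid4().hex  # RFC 3261 branch magic cookie
    lines = [
        f"OPTIONS sip:{host}:{port} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_ip}:{local_port};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:probe@{local_ip}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{host}:{port}>",
        f"Call-ID: {uuid.uuid4().hex}@{local_ip}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def sip_responds(host, port=5060, timeout=5.0):
    """Return True if the server answers a UDP SIP OPTIONS within timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.settimeout(timeout)
        # connect() on a UDP socket only fixes the peer; it also lets us
        # read back the local address to put in the Via header.
        sock.connect((host, port))
        local_ip, local_port = sock.getsockname()
        sock.send(build_options_request(host, port, local_ip, local_port))
        return sock.recv(4096).startswith(b"SIP/2.0")
    except OSError:  # covers timeouts and ICMP "port unreachable"
        return False
    finally:
        sock.close()


def diagnostic_commands(pid):
    """Commands that gather lock and thread state for the given Asterisk PID."""
    return [
        ["asterisk", "-rx", "core show locks"],
        ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt full"],
    ]


def capture(pid, timeout=60):
    """Run each diagnostic command and return its output keyed by program name."""
    outputs = {}
    for cmd in diagnostic_commands(pid):
        try:
            result = subprocess.run(
                cmd, capture_output=True, text=True, timeout=timeout
            )
            outputs[cmd[0]] = result.stdout + result.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # The CLI itself may hang during a deadlock, so record the failure.
            outputs[cmd[0]] = f"failed: {exc}"
    return outputs
```

A caller would poll `sip_responds("192.0.2.10")` (a documentation address, substitute the real server) every few seconds and call `capture(pid)` after a few consecutive failures, so that a single lost UDP packet does not trigger a backtrace.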
--
This message was sent by Atlassian JIRA
(v6.2#6252)