[asterisk-bugs] [JIRA] (ASTERISK-22563) Realtime database connections dropping

Walter Doekes (JIRA) noreply at issues.asterisk.org
Mon Sep 23 13:06:03 CDT 2013


    [ https://issues.asterisk.org/jira/browse/ASTERISK-22563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=210500#comment-210500 ] 

Walter Doekes commented on ASTERISK-22563:
------------------------------------------

I believe those 5 seconds might stem from this:

http://lists.mysql.com/commits/16150?f=plain
{noformat}
uint 		my_thread_end_wait_time= 5;
{noformat}
 
http://bazaar.launchpad.net/~mysql/myodbc/5.1/view/head:/driver/dll.c#L97

^-- libmyodbc is compiled with {{-DNONTHREADSAFE}}, so the seconds is not set to 0.

Finally, in {{my_end()}}, those seconds are slept.

Try this:
{noformat}
#!/bin/sh
set -e
[ -f "`basename "$0"`" ] || exit 2 # run this from current dir only

echo '[*] Getting dependencies'
if ! dpkg -l libmysqlclient-dev | grep -q ^ii; then
	( sleep 2; yes ) | apt-get install libmysqlclient-dev
fi

echo '[*] Getting myodbc source'
myodbc="`find . -maxdepth 1 -name 'myodbc-*' -type d`"
if [ ! -d "$myodbc" ]; then
	apt-get source libmyodbc
	myodbc="`find . -maxdepth 1 -name 'myodbc-*' -type d`"
	[ ! -d "$myodbc" ] && echo "Fail!" >&2
fi

echo '[*] Entering source'
cd "$myodbc"

echo '[*] Patching source'
if grep -q '=[^#]*-DNONTHREADSAFE' driver/Makefile.in; then
	if ! patch driver/Makefile.in << __EOF__
--- driver/Makefile.in.orig	2012-09-25 16:32:48.282174060 +0200
+++ driver/Makefile.in	2012-09-25 16:32:49.752408865 +0200
@@ -312,7 +312,7 @@
 # Linker and preprocessor options                                 #
 #                                                                 #
 ###################################################################
-libmyodbc3_la_CPPFLAGS = -DNONTHREADSAFE
+libmyodbc3_la_CPPFLAGS = # -DNONTHREADSAFE
 libmyodbc3_la_LDFLAGS = -release @myodbc_version@ -module 
 libmyodbc3_r_la_LDFLAGS = -release @myodbc_version@ -module 
 @WITH_DEBUG_FALSE at LTLIBS_DEPS = ../util/libmyodbc3u.la
__EOF__
	then patch driver/Makefile.in << __EOF__
--- driver/Makefile.in.orig	2009-11-05 22:35:15.000000000 +0100
+++ driver/Makefile.in	2013-09-06 09:20:53.000000000 +0200
@@ -290,7 +290,7 @@
 # Linker and preprocessor options                                 #
 #                                                                 #
 ###################################################################
-libmyodbc5_la_CPPFLAGS = -DNONTHREADSAFE
+libmyodbc5_la_CPPFLAGS = # -DNONTHREADSAFE
 libmyodbc5_la_LDFLAGS = -release @NUMERIC_VERSION@ -module 
 @WITH_DEBUG_FALSE at LTLIBS_DEPS = ../util/libmyodbc3u.la
 @WITH_DEBUG_TRUE at LTLIBS_DEPS = \$(LTLIBS_DEBUG_DEPS)
__EOF__
	fi
fi

echo '[*] Configuring source'
if [ ! -f Makefile ]; then
	./configure --disable-gui
fi

echo '[*] Building package'
make --silent

echo '[*] Installing package'
sudo make install --silent
sudo ldconfig # probably not needed, but doesn't hurt

echo '[*] Fetch shortest lib and repopulate odbcinst.ini (prefer _r version)'
lib="`find /usr/local/lib -name 'libmyodbc*_r.so' | perl -e 'push @x,$_ for(<>);@x=sort{length $a cmp length $b}@x;print "$x[0]";'`"
[ -z "$lib" ] && lib="`find /usr/local/lib -name 'libmyodbc*.so' | perl -e 'push @x,$_ for(<>);@x=sort{length $a cmp length $b}@x;print "$x[0]";'`"
echo '    Make sure your odbcinst.ini contains this:'
echo
printf '[mysql]\n'
printf 'Description\t= MySQL driver\n'
printf 'Driver\t\t= %s\n' "$lib"
printf 'UsageCount\t= 1\n'
printf 'FileCount\t= 1\n'
printf 'Threading\t= 0\n'
echo
{noformat}
                
> Realtime database connections dropping
> --------------------------------------
>
>                 Key: ASTERISK-22563
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-22563
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>    Affects Versions: 11.5.1
>            Reporter: Ben Smithurst
>            Severity: Minor
>         Attachments: odbc.ini, res_odbc.conf, SIGURG-strace.txt
>
>
> We are seeing a periodic problem with our database connections dropping, with a log message along these lines:
> {code}
> [2013-09-20 10:58:24] WARNING[35309] res_odbc.c: SQL Execute returned an error -1: 08S01: [MySQL][ODBC 5.1 Driver][mysqld-5.1.49-1~bpo50+1-log]Lost connection to MySQL server during query (97)
> {code}
> After finally catching it with strace (attached), it appears to be due to some SIGURG signals being received, interrupting the {{read}} system call.
> We are at a bit of a loss to know why this is happening, and are wondering if anyone has seen anything similar and/or can just shed any light on this.
> Our investigation so far has concluded that Asterisk sets up a no-op handler for SIGURG, with SA_RESTART, so the {{read}} should be restarted, and indeed it is, but only 3 times -- after that, the socket is closed with {{shutdown}} and then {{close}} (I believe this is inside libmysqlclient).  If the system call *wasn't* restarted, then it looks from the libmysqlclient source, that it should do so anyway, on an EINTR return value.
> A tcpdump also confirms that Asterisk is sending a query, and then closing the socket before data is even received back from the server - so it is not caused by a bad query.
> There appears to be no easy way to reproduce this, it is *very* intermittent - we are doing some testing of a new server and have various automated tests running 24/7, and on average it happens maybe 2 or 3 times per day.  The problem seems to be often, but not always, on an {{UPDATE sippeers}} query - I am not sure if that is relevant.  We are only using SIP, no other channel types (except Local).
> We are not averse to digging into the code to help solve this problem, however if anyone has any input on how the different threads interact in terms of what SIGURG is being used for, it might help us know where to look.  For example, would it be appropriate to block SIGURG using {{sigprocmask}} or similar while doing realtime database queries, or could that cause other problems?
> Finally, another oddity is that it takes Asterisk 5 seconds to reconnect - compare the timestamps on e.g. lines 19 and 48 of the strace, this appears to consistent, maybe due to the timeout on the futex call?
> Thanks for any input
> (internal ref AST-133)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.asterisk.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



More information about the asterisk-bugs mailing list