[asterisk-bugs] [JIRA] (ASTERISK-27615) Dialplan deadlock when connection to external SQL server is lost

Richard Mudgett (JIRA) noreply at issues.asterisk.org
Thu Feb 8 09:35:13 CST 2018


    [ https://issues.asterisk.org/jira/browse/ASTERISK-27615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=242080#comment-242080 ] 

Richard Mudgett commented on ASTERISK-27615:
--------------------------------------------

Batch mode only moves the database writes for CDR records to another thread.  The ODBC dialplan functions need their result immediately, so they would still have to block waiting for the database.
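
As a rough illustration of that difference, here is a minimal sketch. It is not Asterisk code: {{slow_query}}, {{write_cdr_batched}} and {{read_value}} are hypothetical names, {{time.sleep}} stands in for a stalled database call, and the optional timeout is an assumption added for illustration only.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def slow_query(delay):
    """Stand-in for a database call that stalls (no real ODBC involved)."""
    time.sleep(delay)
    return "row"


def write_cdr_batched(pool, delay):
    """Write path: hand the work to another thread and return at once."""
    pool.submit(slow_query, delay)  # the caller never needs the result
    return "queued"


def read_value(pool, delay, timeout=None):
    """Read path: the caller needs the row, so it has to wait for the worker."""
    future = pool.submit(slow_query, delay)
    try:
        return future.result(timeout=timeout)
    except TimeoutError:
        # The caller gives up, but the worker thread is still stuck in the call.
        return None


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as pool:
        print(write_cdr_batched(pool, 0.3))       # returns without waiting
        print(read_value(pool, 0.3))              # blocks until the call finishes
        print(read_value(pool, 0.5, timeout=0.05))  # bounded wait, no result
```

In this sketch the write returns immediately, while the read takes as long as the stalled call does.  A timeout only bounds how long the caller waits; the worker thread remains stuck in the underlying call.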

From what I have read about {{SQL_ATTR_CONNECTION_DEAD}}, the attribute is not supposed to query or ping the database but simply return the last known connection status.  I think this is a bug in the database connector library and is outside Asterisk's control.

> Dialplan deadlock when connection to external SQL server is lost
> ----------------------------------------------------------------
>
>                 Key: ASTERISK-27615
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-27615
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: CDR/cdr_adaptive_odbc, Functions/func_cdr
>    Affects Versions: 14.6.0
>            Reporter: Jared Hull
>            Assignee: Unassigned
>            Severity: Critical
>         Attachments: core-asterisk-running-2018-01-26T17-11-24+0000-brief.txt, core-asterisk-running-2018-01-26T17-11-24+0000-full.txt, core-asterisk-running-2018-01-26T17-11-24+0000-locks.txt, core-asterisk-running-2018-01-26T17-11-24+0000-thread1.txt, core-asterisk-running-2018-01-26T20-21-53+0000-brief.txt, core-asterisk-running-2018-01-26T20-21-53+0000-full.txt, core-asterisk-running-2018-01-26T20-21-53+0000-locks.txt, core-asterisk-running-2018-01-26T20-21-53+0000-thread1.txt
>
>
> We have a cluster of SQL servers for CDR and realtime states that are used in dialplan. Recently we had a single SQL server lose network connectivity, and all Asterisk instances which used this server as their primary started to hang in dialplan.
> If I stop the SQL service while leaving the server pingable, Asterisk will continue to work and simply return a few errors when the CDR is committed to the database.
> {code}
> res_odbc.c:962 odbc_obj_connect: res_odbc: Error SQLConnect=-1 errno=2003 [unixODBC][MySQL][ODBC 5.2(w) Driver]Can't connect to MySQL server on 'dev-dallas-sql1
> cdr_adaptive_odbc.c:436 odbc_log: cdr_adaptive_odbc: Unable to retrieve database handle for 'dev-dallas-sql1:cdr_event_log'.  CDR failed: INSERT INTO cdr_event_log (
> {code}
> If I run 'service network stop' on the SQL server to simulate a network failure, Asterisk stops executing dialplan related to func_odbc and cdr_adaptive_odbc. It is as if it still thinks the SQL connection is there, and in the case of func_odbc it refuses to fail over to another DSN. cdr_adaptive_odbc doesn't even have failover connections (this would be a very useful feature), so I don't know what can be done about that, other than to skip the CDR and throw an error.
> Dialplan to reproduce:
> {code}
> exten => 101,1,noop(${CDR(anything)})
> exten => 102,1,noop(${ODBC_blacklist_global(42)})
> {code}
> Example of cdr_adaptive_odbc.conf entry:
> {code}
> [default]
> connection=dev-dallas-sql1
> table=cdr_event_log
> {code}
> Example of func_odbc.conf entry:
> {code}
> [blacklist_global]
> dsn=dev-dallas-sql1,dev-dallas-sql2,dev-dallas-sql3
> readsql=SELECT COUNT(*) FROM blacklist_global WHERE cid_number='${SQL_ESC(${ARG1})}'
> {code}


