[asterisk-bugs] [JIRA] (ASTERISK-26310) Crash occurs with backtrace log showing fault related to pjsip hash every 24 - 48 hours

Gaston Mendez (JIRA) noreply at issues.asterisk.org
Sat Aug 20 13:41:56 CDT 2016


Gaston Mendez created ASTERISK-26310:
----------------------------------------

             Summary: Crash occurs with backtrace log showing fault related to pjsip hash every 24 - 48 hours
                 Key: ASTERISK-26310
                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-26310
             Project: Asterisk
          Issue Type: Bug
      Security Level: None
          Components: pjproject/pjsip
    Affects Versions: 13.10.0, 13.11.0
         Environment: Asterisk 13.10.0 running on fully updated 64-bit CentOS 7 Linux. A second backtrace, from a second server running Asterisk 13.11.0-rc1, shows the same ../src/pj/hash.c:181 frame in the (gdb) bt output, so we believe the two latest versions of Asterisk 13 are crashing the same way.
            Reporter: Gaston Mendez
            Severity: Critical


We are trying to put an Asterisk 13 server into production, and this is our first time using pjsip. Under a beta load of 20 active calls we are seeing crashes that are unpredictable, produce no visible error, and share no obvious commonality. The crashes are not load dependent: we have seen the server crash at low points in the day with only 1 - 2 active calls running. The only certainty is that, under the steady load of everyday use during a 2 week beta, it crashes at least once every 48 hours, and more often about every 24 hours. There is no error or complaint in the Asterisk messages or full logs, which are otherwise very clean and quiet. The coredump points at line 181 of ../src/pj/hash.c, and the only commonality we know of is that at least 2 backtraces from 2 different servers cite this same line of code in the gdb bt output, like this:

#0  find_entry (lower=0, entry_buf=0x0, hval=0x7f52cc5412cc, val=0x0, keylen=258, key=0x7f52cc541310, ht=<optimized out>, pool=0x0) at ../src/pj/hash.c:181

181		if (entry->hash==hash && entry->keylen==keylen &&
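
For readers unfamiliar with this kind of code: the comparison on line 181 has the shape of a test inside a bucket-chain walk of a hash table. The sketch below is a hypothetical, simplified chained hash table written only to illustrate that shape. It is not the pjlib source, and every name in it (demo_table, demo_entry, demo_find, demo_set, demo_hash) is invented. It shows that such a comparison is the first place each entry pointer is dereferenced, so an invalid pointer in the chain would fault exactly there. Whether that is what happened in this crash is not established by the backtrace alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified chained hash table. NOT the pjlib source;
 * every name here is invented for illustration. */
typedef struct demo_entry {
    struct demo_entry *next;    /* next entry in the same bucket */
    const void        *key;     /* caller-owned key bytes */
    unsigned           hash;    /* cached hash of the key */
    unsigned           keylen;  /* key length in bytes */
    void              *value;
} demo_entry;

#define DEMO_BUCKETS 16u

typedef struct demo_table {
    demo_entry *buckets[DEMO_BUCKETS];
} demo_table;

/* Simple multiplicative hash, chosen only for the demo. */
unsigned demo_hash(const void *key, unsigned keylen)
{
    const unsigned char *p = key;
    unsigned h = 0;
    for (unsigned i = 0; i < keylen; ++i)
        h = h * 33u + p[i];
    return h;
}

/* Walk one bucket chain. The comparison has the same shape as the line
 * gdb reported: it is the first dereference of each `entry`, so a stale
 * or corrupted pointer in the chain would fault on this line. */
demo_entry *demo_find(demo_table *ht, const void *key, unsigned keylen)
{
    unsigned hash = demo_hash(key, keylen);
    demo_entry *entry = ht->buckets[hash % DEMO_BUCKETS];

    for (; entry != NULL; entry = entry->next) {
        if (entry->hash == hash && entry->keylen == keylen &&
            memcmp(entry->key, key, keylen) == 0)
            return entry;
    }
    return NULL;
}

/* Insert at the head of the bucket. Returns 0 on success, -1 if the
 * allocation fails. The key is not copied; the caller keeps it alive. */
int demo_set(demo_table *ht, const void *key, unsigned keylen, void *value)
{
    assert(ht != NULL && key != NULL);

    demo_entry *e = malloc(sizeof *e);
    if (e == NULL)
        return -1;

    e->key = key;
    e->keylen = keylen;
    e->hash = demo_hash(key, keylen);
    e->value = value;

    unsigned idx = e->hash % DEMO_BUCKETS;
    e->next = ht->buckets[idx];
    ht->buckets[idx] = e;
    return 0;
}
```

In this sketch, storing a value with demo_set and then calling demo_find with the same key bytes returns the stored entry, and a different key of the same length returns NULL. If an entry were freed while still linked into its bucket, a later demo_find over that bucket would read freed memory at the comparison.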

It seems we are triggering some instability in pjsip/Asterisk. We are not doing anything outside the norm of what we did on older versions of Asterisk. Asterisk logs no errors at any time, and apart from this roughly once-a-day crash, Asterisk 13 runs very cleanly and performs well with no other complaint. We have reason to believe we have triggered an Asterisk/pjsip bug. There are no exact steps to trigger it: it seems it can happen as long as there is at least 1 active call, and it has happened about once every 24-48 hours over a span of 2 weeks, so the only way to 'reproduce' it is to wait up to 48 hours, as we have been doing. We have multiple backtraces and are attaching 2 that show the exact same source file and line number. As stated in the Environment section, the crash occurs on 2 servers; the second is an identical, fully yum-updated 64-bit CentOS 7 Linux machine running Asterisk 13.11.0-rc1. We will attach everything we have from both servers to this bug report, and we hope to stabilize the system as soon as possible.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


