[asterisk-users] 1.4.20.1 hang -- extra info + gdb hangs

Ex Vito ex.vitorino at gmail.com
Wed Jun 11 05:23:16 CDT 2008


  Here is an update,

  1. Reviewed 'core show locks' with the help of russellb @ #asterisk-devs
      last Friday

  2. He recommended recompiling asterisk with DONT_OPTIMIZE and
      getting a stack trace with:
      # gdb /usr/sbin/asterisk $(pidof asterisk)
      (gdb) set pagination off
      (gdb) thread apply all bt
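
  For reference, the same two gdb commands can be run without an
  interactive session, so the output lands in a file. This is only a
  sketch: capture_bt is a made-up helper name, and it assumes a gdb that
  supports the -batch and -ex options. It does not help if gdb cannot
  attach to the process in the first place.

```shell
#!/bin/sh
# Dump backtraces of all threads of a running asterisk to a file.
# capture_bt is a hypothetical helper, not part of asterisk or gdb.
capture_bt() {
    pid=$1
    out=$2
    # -batch makes gdb exit once the commands are done;
    # each -ex runs one gdb command, in order.
    gdb -batch \
        -ex 'set pagination off' \
        -ex 'thread apply all bt' \
        /usr/sbin/asterisk "$pid" > "$out" 2>&1
}
```

  Usage would be something like:
  capture_bt "$(pidof asterisk)" /tmp/asterisk-bt.txt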

  We did reinstall asterisk with the new compile flags back then and just
  experienced another hang now (the weekend, Monday and Tuesday
  were very low activity days).

  Unfortunately, gdb seems to hang on startup, after printing what seems to
  be a thread list. It never gets to the "reading symbols from..." steps.
  As such, no gdb prompt -> no stack trace ! :-/

  ps shows the gdb process as <defunct> and, as such, it responds to no
  signals; asterisk does not seem to respond to signals either... (maybe
  that's why gdb hangs... I really do not know how gdb works in regards to
  attaching itself to a running process)
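
  One way to check the "not responding to signals" guess is to look at the
  state of each asterisk thread under /proc. A thread in state D
  (uninterruptible sleep) does not act on signals until it leaves that
  state, which could explain both the unkillable asterisk and a debugger
  stuck while attaching. A minimal sketch, assuming a Linux /proc with
  per-thread task directories; thread_states and PROC_ROOT are made-up
  names for illustration:

```shell
#!/bin/sh
# Print "<thread id> <state letter>" for every thread of a process,
# read from /proc/<pid>/task/<tid>/status.
# PROC_ROOT is a hypothetical override, useful to point at a copy of /proc.
thread_states() {
    pid=$1
    root=${PROC_ROOT:-/proc}
    for t in "$root/$pid"/task/*; do
        [ -r "$t/status" ] || continue
        # The State: line looks like "State:  D (disk sleep)"; keep the letter.
        state=$(awk '/^State:/ { print $2; exit }' "$t/status")
        echo "${t##*/} $state"
    done
}
```

  Usage would be something like: thread_states "$(pidof asterisk)"
  Any thread reported as D is waiting inside the kernel.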

  Again, we have 'core show locks' + 'core show threads' output from
  asterisk, which we do not have the skills to read...

  Lastly, the asterisk log displays 12x...

[Jun 11 09:41:07] ERROR[4837] chan_sip.c: SIP transaction failed:
588233f5261d52ac621587ca327b5083@192.168.161.40
[Jun 11 09:41:07] ERROR[4837] chan_sip.c: We could NOT get the channel
lock for SIP/000e08de4cbe-097555c8!

  ...then...

[Jun 11 09:41:19] WARNING[4837] chan_sip.c: Maximum retries exceeded
on transmission 588233f5261d52ac621587ca327b5083@192.168.161.40 for
seqno 102 (Critical Request)

  ...and finally about 1200 of these:

[Jun 11 09:42:59] WARNING[4842] chan_iax2.c: Max retries exceeded to
host 192.168.166.40 on IAX2/private-13779 (type = 6, subclass = 11,
ts=40022, seqno=10)

  ...with several "combinations" of:
  - the number inside WARNING[xxx] -> 13 different
  - the host IP: 192.168.166.40 and 192.168.170.40
  - the iax channel -> 12 different
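
  For reference, counts like these can be pulled out of the log with a few
  lines of shell. This is a sketch under assumptions: each message sits on
  a single line in the log file (they are only wrapped in this mail), the
  field layout matches the sample line above, and the function names are
  made up for illustration.

```shell
#!/bin/sh
# Count distinct values of one field among the chan_iax2
# "Max retries exceeded" lines.
# $1 = log file, $2 = sed expression that prints the field of interest.
count_distinct() {
    grep 'chan_iax2.c: Max retries exceeded to host' "$1" |
        sed -n "$2" | sort -u | wc -l | tr -d ' '
}

# Summarise how many different thread ids, hosts and channels appear.
iax_retry_summary() {
    log=$1
    echo "threads:  $(count_distinct "$log" 's/.*WARNING\[\([0-9]*\)].*/\1/p')"
    echo "hosts:    $(count_distinct "$log" 's/.* to host \([0-9.]*\) on .*/\1/p')"
    echo "channels: $(count_distinct "$log" 's/.* on \(IAX2\/[^ ]*\) .*/\1/p')"
}
```

  Usage would be something like: iax_retry_summary /path/to/asterisk/log
  (the path is a placeholder for wherever the log lives on the system).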


  Till today, our gut feelings were:

  1. The TC400B installation / usage change
      (idea: asterisk responds to no signals because it is waiting in
       kernel space, maybe something's wrong with zaptel, wctc4xxp,
       our HW ?)

  2. The activation of a voicemail account with MWI

  We now have an extra possibility:

  - This system exchanges IAX calls with several other systems
  - The hanging one is running asterisk 1.4.20.1, but all the others
    are running 1.4.19
  - The changelog from 1.4.19 -> 1.4.20.1 includes several chan_iax
     fixes --> could the absence of such fixes in this system's iax peers
     be leading it to hang ?

  Possibility:
  3. Upgrade all peers to 1.4.20.1


  Again, if anyone can chime in with their contribution, thanks in advance.

  Question of the day: why on earth does gdb hang ?! (our guess: because
  asterisk does not respond to signals... now why ?!)


  Cheers,
--
 exvito


