[Asterisk-Dev] Deadlock ? app_queue / chan_agent

NRB nrb at dantronic-engineering.dk
Sat Feb 5 03:49:03 MST 2005


Hi All

We were planning to roll out * to 4-5 of our offices, all with single or quad E1 termination. Unfortunately, we have run into a real showstopper with our setup in the first office :-(

Setup:
ProLiant DL380 G4, Xeon 3.2 GHz, 1 GB RAM, 3 x 72 GB HD RAID5, White Box Linux 3.0 respin 1
Digium E100P
30-40 users in 5 different queues
* stable 17.11.04


The problem:
Incoming calls from the E1 interface are directed to the different queues according to the dialplan. Suddenly, calls entering the queues stop being forwarded to available agents, and new customers do not get MOH when entering the queues (but are still being queued).
Customers who hang up on us after getting tired of waiting in this state leave open Zap/E1 channels (until there are no more free E1 channels and * stops responding entirely). The only way to get things going again is a "killall".
Apparently * keeps all other services alive until all free Zap/E1 channels are "taken".
The problem occurs at random times of the day and on random days of the week, i.e. * runs for 2-3 days, or just a couple of hours, before we have to do a restart.
We have not been able to reproduce the problem on our test server, and running CVS HEAD is not an option.

Could this be related to the following thread: http://lists.digium.com/pipermail/asterisk-dev/2005-January/008368.html

What do you say?

/Nrb


From gdb we've got:

(gdb) info threads
  56 Thread -1222247504 (LWP 26892)  0xb74bef3d in poll () from /lib/tls/libc.so.6
  55 Thread -1224365136 (LWP 26893)  0xb75d6b4e in accept () from /lib/tls/libpthread.so.0
  54 Thread -1226507344 (LWP 26894)  0xb74c13e7 in ___newselect_nocancel () from /lib/tls/libc.so.6
  53 Thread -1228661840 (LWP 26895)  0xb74c13e7 in ___newselect_nocancel () from /lib/tls/libc.so.6
  52 Thread -1230804048 (LWP 26896)  0xb75d6951 in __read_nocancel () from /lib/tls/libpthread.so.0
  51 Thread -1233290320 (LWP 26899)  0xb74bef3d in poll () from /lib/tls/libc.so.6
  50 Thread -1235559504 (LWP 26900)  0xb74bef3d in poll () from /lib/tls/libc.so.6
  49 Thread -1238996048 (LWP 26901)  0xb74bef3d in poll () from /lib/tls/libc.so.6
  48 Thread -1241183312 (LWP 26902)  0xb74c13e7 in ___newselect_nocancel () from /lib/tls/libc.so.6
  47 Thread -1243800656 (LWP 26903)  0xb74bef3d in poll () from /lib/tls/libc.so.6
  46 Thread -1245901904 (LWP 26904)  0xb74bef3d in poll () from /lib/tls/libc.so.6
  45 Thread -1248052304 (LWP 26905)  0xb74bef3d in poll () from /lib/tls/libc.so.6
  44 Thread -1250169936 (LWP 26906)  0xb749550c in __nanosleep_nocancel () from /lib/tls/libc.so.6
  43 Thread -1252533328 (LWP 26907)  0xb749550c in __nanosleep_nocancel () from /lib/tls/libc.so.6
  42 Thread -1271407696 (LWP 26929)  0xb74bef3d in poll () from /lib/tls/libc.so.6
  41 Thread -1275688016 (LWP 26930)  0xb74bef3d in poll () from /lib/tls/libc.so.6
  40 Thread -1283466320 (LWP 26935)  0xb74bef3d in poll () from /lib/tls/libc.so.6
  39 Thread -1269306448 (LWP 27620)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  38 Thread -1302377552 (LWP 27630)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  37 Thread -1256698960 (LWP 27637)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  36 Thread -1291871312 (LWP 27639)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  35 Thread -1265103952 (LWP 27645)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  34 Thread -1285567568 (LWP 27664)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  33 Thread -1287668816 (LWP 27666)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  32 Thread -1298175056 (LWP 27669)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  31 Thread -1263002704 (LWP 27672)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  30 Thread -1260901456 (LWP 27675)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  29 Thread -1300276304 (LWP 27678)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  28 Thread -1289770064 (LWP 27681)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  27 Thread -1279263824 (LWP 27684)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  26 Thread -1293972560 (LWP 27687)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  25 Thread -1308681296 (LWP 27690)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  24 Thread -1314985040 (LWP 27693)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  23 Thread -1321288784 (LWP 27696)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  22 Thread -1327592528 (LWP 27699)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  21 Thread -1333896272 (LWP 27702)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  20 Thread -1340200016 (LWP 27705)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  19 Thread -1346503760 (LWP 27708)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  18 Thread -1352807504 (LWP 27711)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  17 Thread -1359111248 (LWP 27714)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  16 Thread -1365414992 (LWP 27717)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  15 Thread -1367516240 (LWP 27720)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  14 Thread -1296073808 (LWP 27721)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  13 Thread -1361212496 (LWP 27724)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  12 Thread -1350706256 (LWP 27727)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  11 Thread -1354908752 (LWP 27728)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  10 Thread -1258800208 (LWP 27731)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  9 Thread -1363313744 (LWP 27732)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  8 Thread -1344402512 (LWP 27735)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  7 Thread -1357010000 (LWP 27739)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  6 Thread -1342301264 (LWP 27740)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  5 Thread -1267205200 (LWP 27742)  0xb74bef3d in poll () from /lib/tls/libc.so.6
  4 Thread -1348605008 (LWP 27745)  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
  3 Thread -1338098768 (LWP 27746)  0xb74bef3d in poll () from /lib/tls/libc.so.6
  2 Thread -1335997520 (LWP 27747)  0xb74bef3d in poll () from /lib/tls/libc.so.6
  1 Thread -1222246272 (LWP 26890)  0xb75d6951 in __read_nocancel () from /lib/tls/libpthread.so.0
(gdb) thread apply all bt

Thread 56 (Thread -1222247504 (LWP 26892)):
#0  0xb74bef3d in poll () from /lib/tls/libc.so.6
#1  0x0809eeed in listener (unused=0x0) at asterisk.c:333
#2  0xb75d1dfc in start_thread () from /lib/tls/libpthread.so.0
#3  0xb74c7f2a in clone () from /lib/tls/libc.so.6

Thread 55 (Thread -1224365136 (LWP 26893)):
#0  0xb75d6b4e in accept () from /lib/tls/libpthread.so.0
#1  0x08097dc6 in accept_thread (ignore=0x0) at manager.c:1225
#2  0xb75d1dfc in start_thread () from /lib/tls/libpthread.so.0
#3  0xb74c7f2a in clone () from /lib/tls/libc.so.6

Thread 54 (Thread -1226507344 (LWP 26894)):
#0  0xb74c13e7 in ___newselect_nocancel () from /lib/tls/libc.so.6
#1  0xb6e571e9 in do_monitor (data=0x0) at channel.h:843
#2  0xb75d1dfc in start_thread () from /lib/tls/libpthread.so.0
#3  0xb74c7f2a in clone () from /lib/tls/libc.so.6

Thread 53 (Thread -1228661840 (LWP 26895)):
#0  0xb74c13e7 in ___newselect_nocancel () from /lib/tls/libc.so.6
#1  0xb6c45c26 in do_parking_thread (ignore=0x0) at channel.h:843
#2  0xb75d1dfc in start_thread () from /lib/tls/libpthread.so.0
#3  0xb74c7f2a in clone () from /lib/tls/libc.so.6

Thread 52 (Thread -1230804048 (LWP 26896)):
#0  0xb75d6951 in __read_nocancel () from /lib/tls/libpthread.so.0
#1  0xb6a3941d in monmp3thread (data=0x8105040) at res_musiconhold.c:282
#2  0xb75d1dfc in start_thread () from /lib/tls/libpthread.so.0
#3  0xb74c7f2a in clone () from /lib/tls/libc.so.6

Thread 51 (Thread -1233290320 (LWP 26899)):
#0  0xb74bef3d in poll () from /lib/tls/libc.so.6
#1  0x08052284 in ast_io_wait (ioc=0x81087d8, howlong=-4) at io.c:254
#2  0xb67ffb55 in do_monitor (data=0x0) at chan_sip.c:7838
#3  0xb75d1dfc in start_thread () from /lib/tls/libpthread.so.0
#4  0xb74c7f2a in clone () from /lib/tls/libc.so.6

Thread 50 (Thread -1235559504 (LWP 26900)):
#0  0xb74bef3d in poll () from /lib/tls/libc.so.6
#1  0x08052284 in ast_io_wait (ioc=0x810c4d8, howlong=-4) at io.c:254
#2  0xb65b656b in do_monitor (data=0x0) at chan_mgcp.c:3298
#3  0xb75d1dfc in start_thread () from /lib/tls/libpthread.so.0
#4  0xb74c7f2a in clone () from /lib/tls/libc.so.6

Thread 49 (Thread -1238996048 (LWP 26901)):
#0  0xb74bef3d in poll () from /lib/tls/libc.so.6
#1  0x08052284 in ast_io_wait (ioc=0x810cee0, howlong=-4) at io.c:254
#2  0xb6270312 in network_thread (ignore=0x0) at chan_iax2.c:6335
#3  0xb75d1dfc in start_thread () from /lib/tls/libpthread.so.0
#4  0xb74c7f2a in clone () from /lib/tls/libc.so.6

Thread 48 (Thread -1241183312 (LWP 26902)):
#0  0xb74c13e7 in ___newselect_nocancel () from /lib/tls/libc.so.6
#1  0xb60532c3 in do_monitor (data=0x0) at channel.h:843
#2  0xb75d1dfc in start_thread () from /lib/tls/libpthread.so.0
#3  0xb74c7f2a in clone () from /lib/tls/libc.so.6

Thread 47 (Thread -1243800656 (LWP 26903)):
#0  0xb74bef3d in poll () from /lib/tls/libc.so.6
#1  0xb5e28404 in pri_dchannel (vpri=0xb5e371a0) at chan_zap.c:7358
#2  0xb75d1dfc in start_thread () from /lib/tls/libpthread.so.0
#3  0xb74c7f2a in clone () from /lib/tls/libc.so.6

Thread 46 (Thread -1245901904 (LWP 26904)):
#0  0xb74bef3d in poll () from /lib/tls/libc.so.6
#1  0xb5e1a3ed in do_monitor (data=0x0) at chan_zap.c:5779
#2  0xb75d1dfc in start_thread () from /lib/tls/libpthread.so.0
#3  0xb74c7f2a in clone () from /lib/tls/libc.so.6

Thread 45 (Thread -1248052304 (LWP 26905)):
#0  0xb74bef3d in poll () from /lib/tls/libc.so.6
#1  0xb59c5a12 in autodial (ignore=0x0) at pbx_wilcalu.c:83
#2  0xb75d1dfc in start_thread () from /lib/tls/libpthread.so.0
#3  0xb74c7f2a in clone () from /lib/tls/libc.so.6

Thread 44 (Thread -1250169936 (LWP 26906)):
#0  0xb749550c in __nanosleep_nocancel () from /lib/tls/libc.so.6
#1  0xb749532f in sleep () from /lib/tls/libc.so.6
#2  0xb57c021c in scan_thread (unused=0x0) at pbx_spool.c:325
#3  0xb75d1dfc in start_thread () from /lib/tls/libpthread.so.0
#4  0xb74c7f2a in clone () from /lib/tls/libc.so.6

Thread 43 (Thread -1252533328 (LWP 26907)):
#0  0xb749550c in __nanosleep_nocancel () from /lib/tls/libc.so.6
#1  0xb749532f in sleep () from /lib/tls/libc.so.6
#2  0xb5580075 in qcall (ignore=0x0) at app_qcall.c:167
#3  0xb75d1dfc in start_thread () from /lib/tls/libpthread.so.0
#4  0xb74c7f2a in clone () from /lib/tls/libc.so.6

Thread 42 (Thread -1271407696 (LWP 26929)):
#0  0xb74bef3d in poll () from /lib/tls/libc.so.6
#1  0x08099a55 in get_input (s=0x80fb670, output=0xb4378a74 "Action: Login") at manager.c:1154
#2  0x080985d3 in session_do (data=0x80fb670) at manager.c:1181
#3  0xb75d1dfc in start_thread () from /lib/tls/libpthread.so.0
#4  0xb74c7f2a in clone () from /lib/tls/libc.so.6

Thread 41 (Thread -1275688016 (LWP 26930)):
#0  0xb74bef3d in poll () from /lib/tls/libc.so.6
#1  0x08099a55 in get_input (s=0x81cde10, output=0xb3f63a74 "Action: Originate") at manager.c:1154
#2  0x080985d3 in session_do (data=0x81cde10) at manager.c:1181
#3  0xb75d1dfc in start_thread () from /lib/tls/libpthread.so.0
#4  0xb74c7f2a in clone () from /lib/tls/libc.so.6

Thread 40 (Thread -1283466320 (LWP 26935)):
#0  0xb74bef3d in poll () from /lib/tls/libc.so.6
#1  0x0805a89f in ast_waitfor_nandfds (c=0xb37fd680, n=0, fds=0x0, nfds=0, exception=0x0, outfd=0x0, ms=0xb37fd67c)
    at channel.c:1006
#2  0x0806047f in ast_waitfor_n (c=0xfffffffc, n=-4, ms=0xfffffffc) at channel.c:1075
#3  0x080a6c70 in autoservice_run (ign=0x0) at autoservice.c:76
#4  0xb75d1dfc in start_thread () from /lib/tls/libpthread.so.0
#5  0xb74c7f2a in clone () from /lib/tls/libc.so.6

Thread 39 (Thread -1269306448 (LWP 27620)):
#0  0xb75d66f1 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
#1  0xb75d37b0 in _L_mutex_lock_78 () from /lib/tls/libpthread.so.0
#2  0xb65c988c in ?? () from /usr/lib/asterisk/modules/chan_agent.so
#3  0x0810fc38 in ?? ()
#4  0xb4579f18 in ?? ()
#5  0xb65c462e in agent_new (p=0x810fc38, state=-4) at chan_agent.c:764
Previous frame identical to this frame (corrupt stack?)
#0  0xb75d6951 in __read_nocancel () from /lib/tls/libpthread.so.0
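
Most of the threads in the "info threads" listing above are parked in __lll_mutex_lock_wait, so a quick tally per function makes the picture easier to read. The following is only a rough sketch for summarising such a listing: the function name tally_thread_states and the regular expression are assumptions made for illustration, not part of the original gdb session or of Asterisk.

```python
import re
from collections import Counter

# Matches one row of gdb's "info threads" output, for example:
#   39 Thread -1269306448 (LWP 27620)  0xb75d66f1 in __lll_mutex_lock_wait () from ...
# Rows from "thread apply all bt" start with "Thread" or "#", so they are skipped.
THREAD_ROW = re.compile(
    r"^\s*\*?\s*(\d+)\s+Thread\s+\S+\s+\(LWP\s+\d+\)\s+0x[0-9a-f]+\s+in\s+(\S+)\s+\("
)


def tally_thread_states(text):
    """Count how many threads are currently stopped in each function."""
    counts = Counter()
    for line in text.splitlines():
        match = THREAD_ROW.match(line)
        if match:
            counts[match.group(2)] += 1
    return counts
```

Feeding it the saved listing and printing counts.most_common() shows at a glance which call the bulk of the threads are stuck in. It does not say which thread holds the contended mutex; that still has to come from the backtraces.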



