[Asterisk-Dev] CVS-HEAD 08/13/2005 SIP DEADLOCKING
Sherwood McGowan
madprofzero at yahoo.com
Wed Aug 24 01:31:46 MST 2005
Brian mentions the problems almost exactly as we see them occurring on our
servers.
Sometimes it just locks up, stopping all traffic on SIP but still
taking *SOME* commands...
Here are backtraces of the processes in question....
Starting with highest CPU usage
Process Num: 32069
running info thread, thread apply all bt, if no output, just using bt
(gdb) bt
#0 0x0011a094 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
#1 0x00119a28 in __pthread_wait_for_restart_signal ()
from /lib/i686/libpthread.so.0
#2 0x0011b468 in __pthread_lock () from /lib/i686/libpthread.so.0
#3 0x001182d6 in pthread_mutex_lock () from /lib/i686/libpthread.so.0
#4 0x003942c2 in find_peer (peer=0xb4c2a0 "17048372334", sin=0x0,
realtime=1)
at chan_sip.c:1546
#5 0x003945c3 in create_addr (r=0xb745c7a0, opeer=0xb4dd30 "17048372334")
at chan_sip.c:1655
#6 0x003a635b in sip_send_mwi_to_peer (peer=0xb7422ad8) at chan_sip.c:9720
#7 0x0039e1f7 in do_monitor (data=0x0) at chan_sip.c:9851
#8 0x00117e21 in pthread_start_thread () from /lib/i686/libpthread.so.0
#9 0x004d8b1a in clone () from /lib/i686/libc.so.6
Then moving to second highest CPU usage:
Process Num: 24639
running info thread, thread apply all bt, if no output, just using bt
(gdb) bt
#0 0x004cfb7a in poll () from /lib/i686/libc.so.6
#1 0x0805ecde in ast_waitfor_nandfds (c=0x3582ea0, n=2, fds=0x0, nfds=0,
exception=0x0, outfd=0x0, ms=0x3582e94) at channel.c:1259
#2 0x080662d5 in ast_generic_bridge (playitagain=0x3582f30,
playit=0x3582f34,
start_time=0x3582f58, c0=0xb74dd688, c1=0x8566b80, config=0x3583600,
fo=0x3582fe0, rc=0x3582fe4) at channel.c:1334
#3 0x0806378c in ast_channel_bridge (c0=0xb74dd688, c1=0x8566b80,
config=0x3583600, fo=0x3582fe0, rc=0x3582fe4) at channel.c:3195
#4 0x007d1969 in ast_bridge_call (chan=0xb74dd688, peer=0x8566b80,
config=0x3583600) at res_features.c:1161
#5 0x0079cac9 in dial_exec_full (chan=0xb74dd688, data=0x3583600,
peerflags=0x3583958) at app_dial.c:1335
#6 0x0079aa57 in dial_exec (chan=0xfffffffc, data=0xfffffffc)
at app_dial.c:1375
#7 0x08088d4f in pbx_extension_helper (c=0xb74dd688, con=0xfffffffc,
context=0xb74dd7d8 "incoming", exten=0xb74dd8cc "_+1NXXNXXXXXX",
priority=30, label=0x0, callerid=0x83e3f30 "Dial", action=0) at
pbx.c:547
#8 0x08089954 in __ast_pbx_run (c=0xb74dd688) at pbx.c:2144
#9 0x0808a579 in pbx_thread (data=0xb74dd688) at pbx.c:2431
#10 0x00117e21 in pthread_start_thread () from /lib/i686/libpthread.so.0
#11 0x004d8b1a in clone () from /lib/i686/libc.so.6
Now just connecting to the process ID that is noted by asterisk in the CLI
as the main process num:
running info thread, thread apply all bt, if no output, just using bt
Process Num: 15377
#0 0x0011d30b in read () from /lib/i686/libpthread.so.0
#1 0x00000000 in ?? ()
Now just connecting to each process, hoping for more information to be given:
(The process ID is the number just before each block)
32062 No Output
32065
#0 0x0011d508 in accept () from /lib/i686/libpthread.so.0
#1 0x080aabfa in accept_thread (ignore=0x0) at manager.c:1345
#2 0x00117e21 in pthread_start_thread () from /lib/i686/libpthread.so.0
#3 0x004d8b1a in clone () from /lib/i686/libc.so.6
32066
#0 0x0011a094 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
#1 0x00119a28 in __pthread_wait_for_restart_signal ()
from /lib/i686/libpthread.so.0
#2 0x0011b468 in __pthread_lock () from /lib/i686/libpthread.so.0
#3 0x001182d6 in pthread_mutex_lock () from /lib/i686/libpthread.so.0
#4 0x0039433b in find_peer (peer=0x574d00 "17048372334", sin=0x0,
realtime=1)
at chan_sip.c:1551
#5 0x0038f0de in sip_devicestate (data=0x574e30) at chan_sip.c:10003
#6 0x080c0793 in ast_device_state (device=0xb741965c "SIP/17048372334")
at devicestate.c:93
#7 0x080c0cd1 in do_changes (data=0x0) at devicestate.c:146
#8 0x00117e21 in pthread_start_thread () from /lib/i686/libpthread.so.0
#9 0x004d8b1a in clone () from /lib/i686/libc.so.6
32067
#0 0x004d1ef1 in select () from /lib/i686/libc.so.6
#1 0x002ff59c in ?? () from /usr/lib/asterisk/modules/chan_modem.so
#2 0x006cce50 in ?? ()
#3 0x00000000 in ?? ()
32068
#0 0x004d1ef1 in select () from /lib/i686/libc.so.6
#1 0x007d8bf4 in ?? () from /usr/lib/asterisk/modules/res_features.so
#2 0x00be6c28 in ?? ()
#3 0x00be6d30 in ?? ()
#4 0x00000000 in ?? ()
32070
#0 0x004cfb7a in poll () from /lib/i686/libc.so.6
#1 0x080544c4 in ast_io_wait (ioc=0x83cd5f8, howlong=-4) at io.c:259
#2 0x00f5246a in do_monitor (data=0x0) at chan_mgcp.c:3445
#3 0x00117e21 in pthread_start_thread () from /lib/i686/libpthread.so.0
#4 0x004d8b1a in clone () from /lib/i686/libc.so.6
32073
#0 0x004cfb7a in poll () from /lib/i686/libc.so.6
#1 0x080544c4 in ast_io_wait (ioc=0x83d4330, howlong=-4) at io.c:259
#2 0x00bfe89a in network_thread (ignore=0x0) at chan_iax2.c:7794
#3 0x00117e21 in pthread_start_thread () from /lib/i686/libpthread.so.0
#4 0x004d8b1a in clone () from /lib/i686/libc.so.6
32074
#0 0x004cfb7a in poll () from /lib/i686/libc.so.6
#1 0x080544c4 in ast_io_wait (ioc=0x83dff08, howlong=-4) at io.c:259
#2 0x0031d3ff in do_monitor (data=0x0) at chan_skinny.c:3022
#3 0x00117e21 in pthread_start_thread () from /lib/i686/libpthread.so.0
#4 0x004d8b1a in clone () from /lib/i686/libc.so.6
Stopping debugging here: each time a process is debugged (the child
processes, not the higher-CPU and CLI-reported ones), it dies off and a new
one is spawned, so I'd be here forever following an infinite loop.
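For readers following the two find_peer backtraces above (processes 32069
and 32066): both stop in pthread_mutex_lock(), which is what a thread looks
like while it waits for a mutex that is never released. The traces do not
show who holds the lock, so the cause is not established here. As a minimal
sketch of one possible cause only, the C snippet below (not Asterisk code;
peer_lock, init_peer_lock and relock_same_thread are hypothetical names)
relocks a mutex from the thread that already holds it. With a default mutex
that second lock can hang forever; with the POSIX PTHREAD_MUTEX_ERRORCHECK
type it returns EDEADLK instead, which makes the mistake visible.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a peer-list lock; NOT the real chan_sip.c lock. */
static pthread_mutex_t peer_lock;

/* Initialise peer_lock as an error-checking mutex. */
static void init_peer_lock(void)
{
    pthread_mutexattr_t attr;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&peer_lock, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);
}

/*
 * Lock peer_lock twice from the same thread and return the result of the
 * second attempt. An error-checking mutex reports EDEADLK here rather
 * than blocking the caller.
 */
int relock_same_thread(void)
{
    int rc;
    int second;

    init_peer_lock();

    rc = pthread_mutex_lock(&peer_lock);      /* first lock succeeds */
    assert(rc == 0);

    second = pthread_mutex_lock(&peer_lock);  /* relock by the owner */

    rc = pthread_mutex_unlock(&peer_lock);    /* release the first lock */
    assert(rc == 0);
    pthread_mutex_destroy(&peer_lock);

    return second;
}
```

Built with something like cc -pthread and called from a small main(),
relock_same_thread() should return EDEADLK. This only catches relocking by
the owning thread; a lock-order inversion between two threads would need
other tooling to find.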
Thanks for any information, especially since other users seem to be having
this problem. I'm not much of a developer on the backend; I mainly stay on
this list for information about what's going on with development and the
occasional question.
Sherwood McGowan
->-----Original Message-----
->From: asterisk-dev-bounces at lists.digium.com
->[mailto:asterisk-dev-bounces at lists.digium.com] On Behalf Of
->Brian Capouch
->Sent: Wednesday, August 24, 2005 3:13 AM
->To: Asterisk Developers Mailing List
->Subject: Re: [Asterisk-Dev] CVS-HEAD 08/13/2005 SIP DEADLOCKING
->
->Olle E. Johansson wrote:
->> Sherwood McGowan wrote:
->>
->>>I can supply debug output if needed, but mainly need to know if
->>>there's been updates since 8/13 that are fixing the Sip
->Deadlock problem.
->>>
->>
->> Which SIP deadlock problem? You have to tell us more.
->>
->
->I don't know for sure that it's a *SIP* deadlock problem, but
->there's definitely a deadlock problem in the last two CVS
->versions I built, both fetched as full sourcetree downloads
->in the last week. It dies a random, silent death, and all
->operations seem to stop cold once it occurs.
->
->My system is in production, so I had no choice but to revert.
-> I'm going to try to get a test machine up if the problem
->continues, and I'll see what I can do to tickle the system to
->try to find out what's making it stall.
->
->The only thing I can say so far is that a random amount of
->time after startup it just ceases operation, and any commands
->entered at the CLI then have no effect. "stop now" doesn't
->work, either, and the only way to get going again is to exit
->from the CLI (which *does* work) and then use "kill" to stop it.
->
->I've seen 2-3 reports on the users list that smell of the
->same syndrome.
->
->B.