[Asterisk-Dev] Possible deadlock issue

Tamas jalsot at gmail.com
Sun Dec 18 04:08:10 MST 2005


Hello,

on a Asterisk SVN-branch-1.2-r7406M production box we noticed twice
system death (asterisk processes were running, so no core dump
available) during 3 days.

The call center was used at time of malfunction only for outbound calls
(IAX2 softphones to PRI). Calls were recorded with monitor application.

In the 1st failure I was able to log onto console (asterisk -rv) but the
console stucked many times - this means, did not react for anything
(running from screen).
In the 2nd case I wasn't able to log into console.
In both cases the asterisk process ate 98-99% of CPU and did no any
usable work (we was not able to place calls). 'restart now' command
didn't work. kill itself didn't work - just kill -9 at the end.

Some interesting things I found:
Dec 17 13:58:26 DEBUG[21182] channel.c: Avoiding initial deadlock for
'Zap/32-1'
Dec 17 13:58:26 VERBOSE[7937] logger.c:     -- Zap/32-1 answered
IAX2/3225-35
Dec 17 13:58:26 WARNING[21182] app_queue.c: Failed to create update thread!
Dec 17 13:58:26 DEBUG[21182] channel.c: Avoiding initial deadlock for
'IAX2/3225-35'
Dec 17 13:58:26 WARNING[21182] app_queue.c: Failed to create update thread!
Dec 17 13:58:26 WARNING[21182] app_queue.c: Failed to create update thread!

Last lines in messages log were:
Dec 17 13:58:30 WARNING[21182] app_queue.c: Failed to create update thread!
Dec 17 13:58:30 DEBUG[7984] chan_iax2.c: We're hanging up IAX2/3202-47
now...
Dec 17 13:58:30 DEBUG[7984] chan_iax2.c: Really destroying IAX2/3202-47
now...
Dec 17 13:58:30 VERBOSE[7984] logger.c:     -- Hungup 'IAX2/3202-47'
Dec 17 13:58:30 WARNING[21182] app_queue.c: Failed to create update thread!
Dec 17 13:58:30 DEBUG[21188] chan_zap.c: Echo cancellation already on
Dec 17 13:58:30 DEBUG[21182] channel.c: Avoiding initial deadlock for
'Zap/35-1'
Dec 17 13:58:30 VERBOSE[7947] logger.c:     -- Zap/35-1 answered
IAX2/3224-43
Dec 17 13:58:30 WARNING[21182] app_queue.c: Failed to create update thread!
Dec 17 13:58:30 WARNING[21182] app_queue.c: Failed to create update thread!
Dec 17 13:58:30 DEBUG[21185] chan_iax2.c: chan_iax2: ast_sched_runq ran
772 scheduled tasks all at once
Dec 17 13:58:35 DEBUG[21188] chan_zap.c: Echo cancellation already on
Dec 17 13:58:35 DEBUG[21182] channel.c: Avoiding initial deadlock for
'Zap/46-1'
Dec 17 13:58:35 VERBOSE[7967] logger.c:     -- Zap/46-1 answered
IAX2/3235-44
Dec 17 13:58:35 WARNING[21182] app_queue.c: Failed to create update thread!
Dec 17 13:58:37 DEBUG[7458] channel.c: Didn't get a frame from channel:
Zap/43-1
Dec 17 13:58:37 DEBUG[7458] channel.c: Bridge stops bridging channels
IAX2/3267-31 and Zap/43-1

In logs I noticed many times:
DEBUG[21182] channel.c: Avoiding initial deadlock for 'Zap/36-1'

What can this mean?
I checked app_queue and saw that the given message came from a method
which intended to handle device state changes and it had to issue
another thread to make the work (in app_queue). I have three things in
my mind:
1. there is something wrong with device state handling - not related to
app_queue
2. there is something wrong in system not allowing to start new thread -
not enough resources? I doubt.
3. something wrong with app_queue

System specs:
- Asterisk 1.2 SVN branch with some small modifications, patches.
- Linux-2.6.13.2, Ubuntu 5.04
- SuperMicro Dual Xeon 2.8GHz (HT disabled), 2GB RAM, HW SATA RAID5
- 2xDigium TE110P

What can I do in such case? That's a production box, we have a very
similar box (just the motherboard differs a bit) with exactly the same
asterisk version, kernel, configuration and didn't notice the problem at
all. We are restarting asterisk after midnight every day.

Any idea what do I have to check in configuration? What to try if the
problem occurs again, how to help you, the developers to catch the
source of the issue (maybe a bug?)?

Thanks in advance,
	Tamas




More information about the asterisk-dev mailing list