[asterisk-dev] How to catch the source of a deadlock?

Yousf Ateya y.ateya at starkbits.com
Thu Feb 26 03:10:34 CST 2015


Asterisk loaded the default timing modules (timerfd and pthread), but only
res_timing_timerfd is used.
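
As an aside on the deadlock question in the quoted message below: a true
deadlock needs circular waiting, whereas the 'core show locks' output
described there has one thread holding iaxsl[fr->callno] and another thread
waiting for it. A minimal sketch of that check in C, assuming a simplified
model where each thread holds at most one lock. The struct and function names
here are hypothetical and are not part of Asterisk:

```c
#include <assert.h>
#include <stddef.h>

#define NO_LOCK (-1)

/* Hypothetical snapshot of one thread, as one might transcribe it by hand
 * from lock-debugging output. Simplification: at most one held lock. */
struct thread_state {
    int holds;    /* id of the lock this thread holds, or NO_LOCK */
    int waits_on; /* id of the lock this thread is blocked on, or NO_LOCK */
};

/* Return the index of the thread holding `lock`, or -1 if nobody holds it. */
static int holder_of(const struct thread_state *t, int n, int lock)
{
    for (int i = 0; i < n; i++) {
        if (t[i].holds == lock) {
            return i;
        }
    }
    return -1;
}

/* Return 1 if some thread, by following "waits on -> holder" edges,
 * eventually reaches itself again (circular wait); otherwise return 0. */
static int has_circular_wait(const struct thread_state *t, int n)
{
    assert(n == 0 || t != NULL);

    for (int start = 0; start < n; start++) {
        int cur = start;

        /* A chain of more than n hops cannot add new information. */
        for (int hops = 0; hops < n; hops++) {
            if (t[cur].waits_on == NO_LOCK) {
                break; /* this thread is running, not blocked */
            }
            int next = holder_of(t, n, t[cur].waits_on);
            if (next < 0) {
                break; /* the wanted lock is free: no edge to follow */
            }
            if (next == start) {
                return 1; /* walked back to where we started */
            }
            cur = next;
        }
    }
    return 0;
}
```

With a snapshot such as { {7, NO_LOCK}, {NO_LOCK, 7} } (one thread holds
lock 7 and waits on nothing, the other waits on lock 7), has_circular_wait()
returns 0: the holder is slow or stuck, but nothing is waiting in a cycle.
With { {1, 2}, {2, 1} } it returns 1, which is the classic two-lock deadlock.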

On Tue, Jan 27, 2015 at 7:45 PM, Matthew Jordan <mjordan at digium.com> wrote:

> On Tue, Jan 27, 2015 at 11:04 AM, Yousf Ateya <y.ateya at starkbits.com>
> wrote:
> >
> > Yes, and I supplied the debug logs in the issue ASTERISK-24478 . I was
> trying to find a solution to this bug.
> >
> > What I am doing now is running Asterisk under a debugger (gdb), but it prints
> tons of debug messages.
> > That is why I am looking for a way to find the cause of the deadlocks.
> >
>
> I don't think you actually have a deadlock here.
>
> First, the DETECT_DEADLOCKS option is going to spam you. I'd expect
> that when it finds something holding onto a lock for a long period of
> time, which is what your debug log shows. That being said, the
> DETECT_DEADLOCKS option - once it starts telling you something - isn't
> all that useful. Generally, I'd just run with DEBUG_THREADS when
> debugging these kinds of things.
>
> Looking at your 'core show locks' output, you don't actually have
> circular waiting. You have a thread that is holding onto an IAX call
> number lock (iaxsl[fr->callno]), and a thread that wants it. However,
> the thread holding onto the call number lock isn't waiting for another
> lock: it's just holding it in transmit_frame.
>
> Looking at the transmit_frame function:
>
> static int transmit_frame(void *data)
> {
>     struct iax_frame *fr = data;
>
>     ast_mutex_lock(&iaxsl[fr->callno]);
>
>     fr->sentyet = 1;
>
>     if (iaxs[fr->callno]) {
>         send_packet(fr);
>     }
>
>     if (fr->retries < 0) {
>         ast_mutex_unlock(&iaxsl[fr->callno]);
>         /* No retransmit requested */
>         iax_frame_free(fr);
>     } else {
>         /* We need reliable delivery.  Schedule a retransmission */
>         AST_LIST_INSERT_TAIL(&frame_queue[fr->callno], fr, list);
>         fr->retries++;
>         fr->retrans = iax2_sched_add(sched, fr->retrytime,
>                                      attempt_transmit, fr);
>         ast_mutex_unlock(&iaxsl[fr->callno]);
>     }
>
>     return 0;
> }
>
> We can see that all paths should be unlocking iaxsl[fr->callno],
> assuming we move through the function. My guess is that we're stuck on
> iax2_sched_add, but a gdb backtrace would show for sure where that
> thread is.
>
> However, I'll say this - when Corey found a similar problem in
> ASTERISK-24451, I wasn't able to reproduce the leak in the IAX usage
> of the scheduler. So your problem may not be easily solved unless you
> can figure out why the scheduler is misbehaving.
>
> As a side note: do you have a timing module loaded?
>
> Matt
>
> --
> Matthew Jordan
> Digium, Inc. | Engineering Manager
> 445 Jan Davis Drive NW - Huntsville, AL 35806 - USA
> Check us out at: http://digium.com & http://asterisk.org
>



-- 
Yousf Ateya,
StarkBits
www.starkbits.com

