[asterisk-bugs] [JIRA] (ASTERISK-27006) app_queue: Crash when hanging up with realtime queues

Etienne Allovon (JIRA) noreply at issues.asterisk.org
Fri Aug 18 07:39:09 CDT 2017


    [ https://issues.asterisk.org/jira/browse/ASTERISK-27006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=238134#comment-238134 ] 

Etienne Allovon edited comment on ASTERISK-27006 at 8/18/17 7:37 AM:
---------------------------------------------------------------------

We have a similar problem (at least similar backtrace) :

{code}
...
[New LWP 12021]
[New LWP 12031]
[New LWP 12256]
[New LWP 12238]
[New LWP 12258]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/asterisk -g -U asterisk'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fb86aa1d029 in handle_hangup (userdata=0x13854f0, sub=<optimized out>, msg=<optimized out>) at app_queue.c:7321
7321    app_queue.c: No such file or directory.
(gdb) bt
#0  0x00007fb86aa1d029 in handle_hangup (userdata=0x13854f0, sub=<optimized out>, msg=<optimized out>) at app_queue.c:7321
#1  0x00000000005bdb88 in router_dispatch (data=0xa4edc0, sub=0xa1efe0, message=0x7fb88002ec80) at stasis_message_router.c:201
#2  0x00000000005b226a in subscription_invoke (message=0x7fb88002ec80, sub=0xa1efe0) at stasis.c:434
#3  dispatch_exec_async (local=<optimized out>) at stasis.c:715
#4  0x00000000005ca2c8 in ast_taskprocessor_execute (tps=tps at entry=0xa1f130) at taskprocessor.c:963
#5  0x00000000005d144e in execute_tasks (data=0xa1f130) at threadpool.c:1322
#6  0x00000000005ca21d in ast_taskprocessor_execute (tps=0x9daa90) at taskprocessor.c:965
#7  0x00000000005d1cc0 in threadpool_execute (pool=0x9dbde0) at threadpool.c:351
#8  worker_active (worker=0x7fb8ac000c80) at threadpool.c:1105
#9  worker_start (arg=arg at entry=0x7fb8ac000c80) at threadpool.c:1024
#10 0x00000000005db26c in dummy_start (data=<optimized out>) at utils.c:1235
#11 0x00007fb8c3c55064 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#12 0x00007fb8c2f3d62d in clone () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt full
#0  0x00007fb86aa1d029 in handle_hangup (userdata=0x13854f0, sub=<optimized out>, msg=<optimized out>) at app_queue.c:7321
        queue_data = 0x13854f0
        channel_blob = <optimized out>
        caller_snapshot = 0x0
        member_snapshot = 0x7fb8b40115f8
        chan = 0x0
        reason = CALLER
        __PRETTY_FUNCTION__ = "handle_hangup"
#1  0x00000000005bdb88 in router_dispatch (data=0xa4edc0, sub=0xa1efe0, message=0x7fb88002ec80) at stasis_message_router.c:201
        router = 0xa4edc0
        route = {message_type = <optimized out>, callback = 0x7fb86aa1ce2b <handle_hangup>, data = 0x13854f0}
#2  0x00000000005b226a in subscription_invoke (message=0x7fb88002ec80, sub=0xa1efe0) at stasis.c:434
No locals.
#3  dispatch_exec_async (local=<optimized out>) at stasis.c:715
        sub = 0xa1efe0
        message = 0x7fb88002ec80
#4  0x00000000005ca2c8 in ast_taskprocessor_execute (tps=tps at entry=0xa1f130) at taskprocessor.c:963
        local = {local_data = 0xa1efe0, data = 0x7fb88002ec80}
        t = 0x7fb88001d7c0
        __PRETTY_FUNCTION__ = "ast_taskprocessor_execute"
#5  0x00000000005d144e in execute_tasks (data=0xa1f130) at threadpool.c:1322
        tps = 0xa1f130
#6  0x00000000005ca21d in ast_taskprocessor_execute (tps=0x9daa90) at taskprocessor.c:965
        local = {local_data = 0x7fb8ac000cb8, data = 0x90}
        t = 0x7fb880033fc0
        __PRETTY_FUNCTION__ = "ast_taskprocessor_execute"
#7  0x00000000005d1cc0 in threadpool_execute (pool=0x9dbde0) at threadpool.c:351
No locals.
#8  worker_active (worker=0x7fb8ac000c80) at threadpool.c:1105
No locals.
#9  worker_start (arg=arg at entry=0x7fb8ac000c80) at threadpool.c:1024
        worker = 0x7fb8ac000c80
        saved_state = <optimized out>
        __PRETTY_FUNCTION__ = "worker_start"
#10 0x00000000005db26c in dummy_start (data=<optimized out>) at utils.c:1235
        __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {140431140109520, -471374274666827947, 0, 140431139705760, 20, 140429859870464, 434008350720126805, -471374776567468203}, __mask_was_saved = 0}}, __pad = {0x7fb85fe9ceb0, 
            0x0, 0x7fb8c2fa1812 <__libc_thread_freeres+34>, 0x0}}
        __cancel_arg = 0x7fb85fe9d700
        __not_first_call = <optimized out>
        ret = <optimized out>
        a = {start_routine = 0x5d193d <worker_start>, data = 0x7fb8ac000c80, name = 0x7fb8ac38bcd0 "worker_start         started at [ 1079] threadpool.c worker_thread_start()"}
#11 0x00007fb8c3c55064 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#12 0x00007fb8c2f3d62d in clone () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
{code}

The crash occurs in {{handle_hangup}} in {{app_queue.c}} at :
{code}
        ast_queue_log(queue_data->queue->name, caller_snapshot->uniqueid, queue_data->member->membername,
                        reason == CALLER ? "COMPLETECALLER" : "COMPLETEAGENT", "%ld|%ld|%d",
                (long) (queue_data->starttime - queue_data->holdstart),
                (long) (time(NULL) - queue_data->starttime), queue_data->caller_pos);
{code}
because caller_snapshot is null.

Our problem looks more like ASTERISK-27166. But in our case the problem occurs after a channel redirect followed by a direct transfer.

Here's how to reproduce :

Given three users SIP : U1, U2, U3
Given one queue Q1 with one member U2 (added as a SIP member : SIP/U2)

U1 calls Q1
U2 rings
U2 answers
U2 *redirects* (seems important) U1 in Q1 (in asterisk CLI) {{channel redirect <SIP/U1> <Q1 context>,<Q1 exten>,1}}
U2 rings (he is called back by the queue Q1 again)
U2 answers
U2 direct transfer (via his phone) to U3
U3 rings
U3 answers
U3 hangup
-> then *asterisk crashes*


I tested it on vanilla asterisk :
- asterisk 13.13.1 vanilla : OK
- asterisk 13.14.0 vanilla : OK
- asterisk 13.14.1 vanilla : OK
- asterisk 13.15.0 vanilla : *crashes*
- asterisk 13.16.0 vanilla : *crashes*
- asterisk 13.17.0 vanilla : *crashes*


Below is the backtrace on a an asterisk 13.15.0 vanilla :
{code}
[New LWP 20876]
[New LWP 20870]
[New LWP 20886]
[New LWP 20889]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `asterisk -cvvvg'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f649520afec in handle_hangup (userdata=0x7f64cc002890, sub=0x0, sub at entry=0x7f64cc2264f0, msg=0x7f649522268e, msg at entry=0x7f64c80108a0) at app_queue.c:6221
6221            ast_queue_log(queue_data->queue->name, caller_snapshot->uniqueid, queue_data->member->membername,
(gdb) bt
#0  0x00007f649520afec in handle_hangup (userdata=0x7f64cc002890, sub=0x0, sub at entry=0x7f64cc2264f0, msg=0x7f649522268e, msg at entry=0x7f64c80108a0) at app_queue.c:6221
#1  0x00000000005c2ece in router_dispatch (data=0x7f64cc226440, sub=0x7f64cc2264f0, message=0x7f64c80108a0) at stasis_message_router.c:201
#2  0x00000000005b6b3b in subscription_invoke (message=0x7f64c80108a0, sub=0x7f64cc2264f0) at stasis.c:434
#3  dispatch_exec_async (local=<optimized out>) at stasis.c:715
#4  0x00000000005cf16e in ast_taskprocessor_execute (tps=tps at entry=0x7f64cc226670) at taskprocessor.c:963
#5  0x00000000005d6400 in execute_tasks (data=0x7f64cc226670) at threadpool.c:1322
#6  0x00000000005cf0ce in ast_taskprocessor_execute (tps=0x16b0750) at taskprocessor.c:965
#7  0x00000000005d6cb8 in threadpool_execute (pool=<optimized out>) at threadpool.c:351
#8  worker_active (worker=<optimized out>) at threadpool.c:1105
#9  worker_start (arg=0x7f64d0000c70) at threadpool.c:1024
#10 0x00000000005e010a in dummy_start (data=<optimized out>) at utils.c:1235
#11 0x00007f64e5fbf064 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#12 0x00007f64e52a762d in clone () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt full
#0  0x00007f649520afec in handle_hangup (userdata=0x7f64cc002890, sub=0x0, sub at entry=0x7f64cc2264f0, msg=0x7f649522268e, msg at entry=0x7f64c80108a0) at app_queue.c:6221
        channel_blob = 0x1
        caller_snapshot = 0x0
        member_snapshot = 0x7f64bc005f78
        chan = 0x0
        reason = CALLER
        __PRETTY_FUNCTION__ = "handle_hangup"
#1  0x00000000005c2ece in router_dispatch (data=0x7f64cc226440, sub=0x7f64cc2264f0, message=0x7f64c80108a0) at stasis_message_router.c:201
        router = 0x7f64cc226440
        route = {message_type = <optimized out>, callback = 0x7f649520adf0 <handle_hangup>, data = 0x7f64cc002890}
#2  0x00000000005b6b3b in subscription_invoke (message=0x7f64c80108a0, sub=0x7f64cc2264f0) at stasis.c:434
No locals.
#3  dispatch_exec_async (local=<optimized out>) at stasis.c:715
        sub = 0x7f64cc2264f0
        message = 0x7f64c80108a0
#4  0x00000000005cf16e in ast_taskprocessor_execute (tps=tps at entry=0x7f64cc226670) at taskprocessor.c:963
        local = {local_data = 0x7f64cc2264f0, data = 0x7f64c80108a0}
        t = 0x7f64c8001af0
        __PRETTY_FUNCTION__ = "ast_taskprocessor_execute"
#5  0x00000000005d6400 in execute_tasks (data=0x7f64cc226670) at threadpool.c:1322
        tps = 0x7f64cc226670
#6  0x00000000005cf0ce in ast_taskprocessor_execute (tps=0x16b0750) at taskprocessor.c:965
        local = {local_data = 0x7f64b4003780, data = 0x7f64d0000c78}
        t = 0x7f64c8051880
        __PRETTY_FUNCTION__ = "ast_taskprocessor_execute"
#7  0x00000000005d6cb8 in threadpool_execute (pool=<optimized out>) at threadpool.c:351
No locals.
#8  worker_active (worker=<optimized out>) at threadpool.c:1105
No locals.
#9  worker_start (arg=0x7f64d0000c70) at threadpool.c:1024
        __PRETTY_FUNCTION__ = "worker_start"
#10 0x00000000005e010a in dummy_start (data=<optimized out>) at utils.c:1235
        __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {140070963126432, -8086888111099764391, 0, 140070963126832, 17, 140069867263744, 8146034564801864025, -8086887509823217319}, __mask_was_saved = 0}}, __pad = {
            0x7f648eae8ef0, 0x0, 0x7f64e530b812 <__libc_thread_freeres+34>, 0x0}}
        __cancel_arg = 0x7f648eae9700
        __not_first_call = <optimized out>
        ret = <optimized out>
        a = {start_routine = 0x5d6950 <worker_start>, data = 0x7f64d0000c70, name = 0x7f64d00020a0 "worker_start         started at [ 1079] threadpool.c worker_thread_start()"}
#11 0x00007f64e5fbf064 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#12 0x00007f64e52a762d in clone () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
{code}


was (Author: etienne_pf):
We have a similar problem (at least similar backtrace) :

{code}
...
[New LWP 12021]
[New LWP 12031]
[New LWP 12256]
[New LWP 12238]
[New LWP 12258]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/asterisk -g -U asterisk'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fb86aa1d029 in handle_hangup (userdata=0x13854f0, sub=<optimized out>, msg=<optimized out>) at app_queue.c:7321
7321    app_queue.c: No such file or directory.
(gdb) bt
#0  0x00007fb86aa1d029 in handle_hangup (userdata=0x13854f0, sub=<optimized out>, msg=<optimized out>) at app_queue.c:7321
#1  0x00000000005bdb88 in router_dispatch (data=0xa4edc0, sub=0xa1efe0, message=0x7fb88002ec80) at stasis_message_router.c:201
#2  0x00000000005b226a in subscription_invoke (message=0x7fb88002ec80, sub=0xa1efe0) at stasis.c:434
#3  dispatch_exec_async (local=<optimized out>) at stasis.c:715
#4  0x00000000005ca2c8 in ast_taskprocessor_execute (tps=tps at entry=0xa1f130) at taskprocessor.c:963
#5  0x00000000005d144e in execute_tasks (data=0xa1f130) at threadpool.c:1322
#6  0x00000000005ca21d in ast_taskprocessor_execute (tps=0x9daa90) at taskprocessor.c:965
#7  0x00000000005d1cc0 in threadpool_execute (pool=0x9dbde0) at threadpool.c:351
#8  worker_active (worker=0x7fb8ac000c80) at threadpool.c:1105
#9  worker_start (arg=arg at entry=0x7fb8ac000c80) at threadpool.c:1024
#10 0x00000000005db26c in dummy_start (data=<optimized out>) at utils.c:1235
#11 0x00007fb8c3c55064 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#12 0x00007fb8c2f3d62d in clone () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt full
#0  0x00007fb86aa1d029 in handle_hangup (userdata=0x13854f0, sub=<optimized out>, msg=<optimized out>) at app_queue.c:7321
        queue_data = 0x13854f0
        channel_blob = <optimized out>
        caller_snapshot = 0x0
        member_snapshot = 0x7fb8b40115f8
        chan = 0x0
        reason = CALLER
        __PRETTY_FUNCTION__ = "handle_hangup"
#1  0x00000000005bdb88 in router_dispatch (data=0xa4edc0, sub=0xa1efe0, message=0x7fb88002ec80) at stasis_message_router.c:201
        router = 0xa4edc0
        route = {message_type = <optimized out>, callback = 0x7fb86aa1ce2b <handle_hangup>, data = 0x13854f0}
#2  0x00000000005b226a in subscription_invoke (message=0x7fb88002ec80, sub=0xa1efe0) at stasis.c:434
No locals.
#3  dispatch_exec_async (local=<optimized out>) at stasis.c:715
        sub = 0xa1efe0
        message = 0x7fb88002ec80
#4  0x00000000005ca2c8 in ast_taskprocessor_execute (tps=tps at entry=0xa1f130) at taskprocessor.c:963
        local = {local_data = 0xa1efe0, data = 0x7fb88002ec80}
        t = 0x7fb88001d7c0
        __PRETTY_FUNCTION__ = "ast_taskprocessor_execute"
#5  0x00000000005d144e in execute_tasks (data=0xa1f130) at threadpool.c:1322
        tps = 0xa1f130
#6  0x00000000005ca21d in ast_taskprocessor_execute (tps=0x9daa90) at taskprocessor.c:965
        local = {local_data = 0x7fb8ac000cb8, data = 0x90}
        t = 0x7fb880033fc0
        __PRETTY_FUNCTION__ = "ast_taskprocessor_execute"
#7  0x00000000005d1cc0 in threadpool_execute (pool=0x9dbde0) at threadpool.c:351
No locals.
#8  worker_active (worker=0x7fb8ac000c80) at threadpool.c:1105
No locals.
#9  worker_start (arg=arg at entry=0x7fb8ac000c80) at threadpool.c:1024
        worker = 0x7fb8ac000c80
        saved_state = <optimized out>
        __PRETTY_FUNCTION__ = "worker_start"
#10 0x00000000005db26c in dummy_start (data=<optimized out>) at utils.c:1235
        __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {140431140109520, -471374274666827947, 0, 140431139705760, 20, 140429859870464, 434008350720126805, -471374776567468203}, __mask_was_saved = 0}}, __pad = {0x7fb85fe9ceb0, 
            0x0, 0x7fb8c2fa1812 <__libc_thread_freeres+34>, 0x0}}
        __cancel_arg = 0x7fb85fe9d700
        __not_first_call = <optimized out>
        ret = <optimized out>
        a = {start_routine = 0x5d193d <worker_start>, data = 0x7fb8ac000c80, name = 0x7fb8ac38bcd0 "worker_start         started at [ 1079] threadpool.c worker_thread_start()"}
#11 0x00007fb8c3c55064 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#12 0x00007fb8c2f3d62d in clone () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
{code}

The crash occurs in {{handle_hangup}} in {{app_queue.c}} at :
{code}
        ast_queue_log(queue_data->queue->name, caller_snapshot->uniqueid, queue_data->member->membername,
                        reason == CALLER ? "COMPLETECALLER" : "COMPLETEAGENT", "%ld|%ld|%d",
                (long) (queue_data->starttime - queue_data->holdstart),
                (long) (time(NULL) - queue_data->starttime), queue_data->caller_pos);
{code}
because caller_snapshot is null.

Our problem looks more like ASTERISK-27166. But in our case the problem occurs after a channel redirect followed by a direct transfer.

Here's how to reproduce :

Given three users SIP : U1, U2, U3
Given one queue Q1 with one member U2 (added as a SIP member : SIP/U2)

U1 calls Q1
U2 rings
U2 answers
U2 *redirects* (seems important) U1 in Q1 (in asterisk CLI) {{channel redirect <SIP/U1> <Q1 context>,<Q1 exten>,1}}
U2 rings (he is called back by the queue Q1 again)
U2 answers
U2 direct transfer (via his phone) to U3
U3 rings
U3 answers
U3 hangup
-> then *asterisk crashes*


I tested it on vanilla asterisk :
- asterisk 13.13.1 vanilla : OK
- asterisk 13.14.0 vanilla : OK
- asterisk 13.14.1 vanilla : OK
- asterisk 13.15.0 vanilla : *crashes*
- asterisk 13.16.0 vanilla : *crashes*
- asterisk 13.17.0 vanilla : *crashes*

> app_queue: Crash when hanging up with realtime queues
> -----------------------------------------------------
>
>                 Key: ASTERISK-27006
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-27006
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Applications/app_queue
>    Affects Versions: 13.15.0, 14.6.0
>         Environment: Debian 8, Asterisk 13.15 (13.14 did not present issue)
>            Reporter: Niklas Larsson
>         Attachments: backtrace_core.uc02.nguc.weblink.se-2017-05-17T08-02-12+0200.txt, backtrace_core.uc02.nguc.weblink.se-2017-05-17T13-38-19+0200.txt, backtrace_core.uc02.nguc.weblink.se-2017-05-23T13-31-52+0200.txt, backtrace_core.uc02.nguc.weblink.se-2017-05-26T09-10-53+0200.txt, backtrace_core.uc02.nguc.weblink.se-2017-05-30T16-04-54+0200.txt, C-00000345.txt, log.txt, queue_4.txt
>
>
> Segfault in at hangup:
> [2017-05-17 08:02:12] VERBOSE[12389][C-000003ea] bridge_channel.c: Channel PJSIP/ngcore-00000cd3 left 'simple_bridge' basic-bridge <098609d0-4ae7-47e3-980b-06a3b24f3014>
> Happens around ones a day.
> Running asterisk 13 branch, commit ddbc68b (Thu May 11 10:49:04 2017)



--
This message was sent by Atlassian JIRA
(v6.2#6252)



More information about the asterisk-bugs mailing list