[asterisk-bugs] [JIRA] (ASTERISK-29774) Deadlock where hints lock blocks contexts lock

Mark Murawski (JIRA) noreply at issues.asterisk.org
Tue Dec 7 11:26:34 CST 2021


    [ https://issues.asterisk.org/jira/browse/ASTERISK-29774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=257300#comment-257300 ] 

Mark Murawski commented on ASTERISK-29774:
------------------------------------------

My comment got deleted...

For anyone interested: Note that we use res_timing_timerfd and get this same deadlock.

> Deadlock where hints lock blocks contexts lock
> ----------------------------------------------
>
>                 Key: ASTERISK-29774
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-29774
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: PBX/General, PBX/pbx_config
>    Affects Versions: 16.20.0, 16.22.0
>         Environment: Centos 6, 2.6.32-754.el6.x86_64 #1 SMP
>            Reporter: Steve Sether
>         Attachments: core.usi-vf01-mpls.ravon.net-2021-11-15T11-07-02-0600-brief.txt, core.usi-vf01-mpls.ravon.net-2021-11-15T11-07-02-0600-full.txt, core.usi-vf01-mpls.ravon.net-2021-11-15T11-07-02-0600-info.txt, core.usi-vf01-mpls.ravon.net-2021-11-15T11-07-02-0600-locks.txt, core.usi-vf01-mpls.ravon.net-2021-11-15T11-07-02-0600-thread1.txt, core.usi-vf01-mpls.ravon.net-2021-12-01T09-33-26-0600-brief.txt, core.usi-vf01-mpls.ravon.net-2021-12-01T09-33-26-0600-full.txt, core.usi-vf01-mpls.ravon.net-2021-12-01T09-33-26-0600-info.txt, core.usi-vf01-mpls.ravon.net-2021-12-01T09-33-26-0600-thread1.txt
>
>
> We've had a couple instances of a deadlocks recently.  This happened while we were trying to move phones from one Asterisk server to another.  It's too much to describe here what goes in in this process, but from an  Asterisk perspective it's a lot of reloading the dialplan regenerating pjsip.conf, reloading pjsip, and phones re-registering to the new Asterisk server through a proxy.
> The sip freeze has happened twice now when we move upwards of 80 phones at a time.
> The second time it happened we were able to get a core dump, and run it through the scripts built in to Asterisk.  The output is...  voluminous.  I tried running the core file through a scripting tool that for gdb that  claims to find deadlocks (and worked before), but it crashes when it's expecting (I think?) a certain kind of locking that it's not seeing.
> Mainly, I need some kind of guidance on how to find out what's locking up Asterisk.  I can't seem to find anything that references the locks being held.  i.e. I don't really know what threads are locking the others, and  other guides in using gdb don't seem to be much help.  I'm not a C programmer, so I don't have much experience in semaphores, low level locks, etc.
> I've tried to edit this file down to more unique entries, and removed anything that's doesn't look like it's locked so it's at least possibly semi-readable. There's quite a few threads that are largely duplicates
> > Any help is appreciated.  Thanks.
> {noformat}
> > Thread 113 (Thread 0x7faf4deac700 (LWP 17772)):
> > #0  0x00007fb0d292d334 in __lll_lock_wait () from /lib64/libpthread.so.0
> > #1  0x00007fb0d29285f3 in _L_lock_892 () from /lib64/libpthread.so.0
> > #2  0x00007fb0d29284d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
> > #3  0x00007fb093276c1b in pbx_load_module () at pbx_config.c:2091
> > #4  0x00007fb093276e1e in handle_cli_dialplan_reload (e=<value optimized
> > out>, cmd=<value optimized out>, a=<value optimized out>) at
> > pbx_config.c:1597
> > #5  0x00000000004ba668 in ast_cli_command_full (uid=-1, gid=-1, fd=112,
> > s=0x7fafe800d839 "dialplan reload") at cli.c:2995
> > #6  0x000000000060213b in action_command (s=0x7faf4deabcd0,
> > m=0x7faf4deab3e0) at manager.c:5318
> > #7  0x0000000000606af0 in process_message (s=0x7faf4deabcd0,
> > m=0x7faf4deab3e0) at manager.c:6689
> > #8  0x0000000000609dcb in do_message (data=0x7fb034002930) at
> > manager.c:6902
> > #9  session_do (data=0x7fb034002930) at manager.c:7007
> > #10 0x00000000005a24b7 in dummy_start (data=<value optimized out>) at
> > utils.c:1428
> >
> {noformat}
> This thread is attempting to grab the 'reload_lock' lock, but is waiting on
> ...
> I'm going to call the 'unknown thread id'. Note, this 'unknown thread id'
> is missing some of the stack trace, but will still try to make an educated
> guess.
> {noformat}
> > --
> > #7  0x00000000005137b0 in execute_state_callback (cb=0x5fa500
> > <manager_state_cb>, context=0x7faf703f3f88 "blf-danielBarnes126266",
> > exten=0x7faf703f4118 "danielBarnes126266-user1", data=0x0, reason=<value
> > optimized out>, hint=0x7faf
> > 703f4600, device_state_info=0x0) at pbx.c:3284
> > #8  0x00000000005169d6 in ast_add_hint (e=0x7faf703f4080) at pbx.c:4057
> > #9  0x0000000000521c53 in ast_add_extension2_lockopt (con=0x7faf703f3e90,
> > replace=0, extension=<value optimized out>, priority=-1, label=<value
> > optimized out>, callerid=<value optimized out>, application=<value
> > optimized out>, data
> > =0x7faf703f4060, datad=0x45bdd0 <ast_free_ptr>, registrar=0x7fb09327b9b8
> > "pbx_config", registrar_file=0x7faf70091221
> > "exec.1636994706829225.140397385758464", registrar_line=1218,
> > lock_context=1) at pbx.c:7572
> > #10 0x00000000005224da in ast_add_extension2 (con=<value optimized out>,
> > replace=<value optimized out>, extension=<value optimized out>,
> > priority=<value optimized out>, label=<value optimized out>,
> > callerid=<value optimized out>, a
> > pplication=0x7faf703f3fee
> > "PJSIP/xxxxxxx&Custom:unavailable&Custom:call_state_xxxxxxx",
> > data=0x7faf703f4060, datad=0x45bdd0 <ast_free_ptr>,
> > registrar=0x7fb09327b9b8 "pbx_config", registrar_f
> > ile=0x7faf70091221 "exec.1636994706829225.140397385758464",
> > registrar_line=1218) at pbx.c:7304
> > #11 0x00007fb09327641b in pbx_load_config (config_file=0x7fb09327b9e0
> > "extensions.conf") at pbx_config.c:1890
> > #12 0x00007fb093276c2e in pbx_load_module () at pbx_config.c:2101
> > #13 0x00007fb093276e1e in handle_cli_dialplan_reload (e=<value optimized
> > out>, cmd=<value optimized out>, a=<value optimized out>) at
> > pbx_config.c:1597
> > #14 0x00000000004ba668 in ast_cli_command_full (uid=-1, gid=-1, fd=106,
> > s=0x7faf70000f39 "dialplan reload") at cli.c:2995
> > #15 0x000000000060213b in action_command (s=0x7fb0d04e2cd0,
> > m=0x7fb0d04e23e0) at manager.c:5318
> > #16 0x0000000000606af0 in process_message (s=0x7fb0d04e2cd0,
> > m=0x7fb0d04e23e0) at manager.c:6689
> > #17 0x0000000000609dcb in do_message (data=0x7fb034005820) at
> > manager.c:6902
> > #18 session_do (data=0x7fb034005820) at manager.c:7007
> > #19 0x00000000005a24b7 in dummy_start (data=<value optimized out>) at
> > utils.c:1428
> > --
> >
> {noformat}
> This thread is holding the 'reload_lock', which is blocking Thread 113
> above. It then grabs the hints container lock, in ast_add_hint. While
> holding the hints lock, ast_add_hint then calls the manager_state_cb
> function. Looking there you can see the manager_state_cb function attempts
> to get ast_rdlock_contexts.
> So, this 'unknown thread id' holds the hints container lock, and is blocked
> getting the ast_rdlock_contexts, i.e. 'conlock'.
> Now let's jump to Thread 61 below.
> {noformat}
> We have many of these threads running dialplan reload.  I suspect something
> > else is locking them?
> >
> > Thread 110 (Thread 0x7fb0d0467700 (LWP 20543)):
> > #0  0x00007fb0d292d334 in __lll_lock_wait () from /lib64/libpthread.so.0
> > #1  0x00007fb0d29285f3 in _L_lock_892 () from /lib64/libpthread.so.0
> > #2  0x00007fb0d29284d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
> > #3  0x00007fb093276c1b in pbx_load_module () at pbx_config.c:2091
> > #4  0x00007fb093276e1e in handle_cli_dialplan_reload (e=<value optimized
> > out>, cmd=<value optimized out>, a=<value optimized out>) at
> > pbx_config.c:1597
> > #5  0x00000000004ba668 in ast_cli_command_full (uid=-1, gid=-1, fd=118,
> > s=0x7fafbc002009 "dialplan reload") at cli.c:2995
> > #6  0x000000000060213b in action_command (s=0x7fb0d0466cd0,
> > m=0x7fb0d04663e0) at manager.c:5318
> > #7  0x0000000000606af0 in process_message (s=0x7fb0d0466cd0,
> > m=0x7fb0d04663e0) at manager.c:6689
> > #8  0x0000000000609dcb in do_message (data=0x7fb03400b860) at
> > manager.c:6902
> > #9  session_do (data=0x7fb03400b860) at manager.c:7007
> > #10 0x00000000005a24b7 in dummy_start (data=<value optimized out>) at
> > utils.c:1428
> >
> >
> > Thread 120 (Thread 0x7faf4d860700 (LWP 11287)):
> > #0  0x00007fb0d292d334 in __lll_lock_wait () from /lib64/libpthread.so.0
> > #1  0x00007fb0d29285f3 in _L_lock_892 () from /lib64/libpthread.so.0
> > #2  0x00007fb0d29284d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
> > #3  0x000000000051cbb5 in ast_rdlock_contexts (c=0x7fb0bc238720, con=0x0,
> > context=0x7fb0bc2390e0 "call_to_user_core", exten=0x652011 "h", priority=1,
> > label=0x0, callerid=0x7faff4e2ff60 "6121111111", action=E_MATCH, found=0x0,
> > combined_find_spawn=0) at pbx.c:8509
> > #4  pbx_extension_helper (c=0x7fb0bc238720, con=0x0,
> > context=0x7fb0bc2390e0 "call_to_user_core", exten=0x652011 "h", priority=1,
> > label=0x0, callerid=0x7faff4e2ff60 "6126166579", action=E_MATCH, found=0x0,
> > combined_find_spawn=0) at pbx.c:2886
> > #5  0x0000000000527228 in ast_exists_extension (c=0x7fb0bc238720,
> > args=0x0) at pbx.c:4181
> > #6  __ast_pbx_run (c=0x7fb0bc238720, args=0x0) at pbx.c:4595
> > #7  0x000000000052752b in pbx_thread (data=<value optimized out>) at
> > pbx.c:4704
> > #8  0x00000000005a24b7 in dummy_start (data=<value optimized out>) at
> > utils.c:1428
> > #9  0x00007fb0d2926aa1 in start_thread () from /lib64/libpthread.so.0
> > #10 0x00007fb0d1d02c4d in clone () from /lib64/libc.so.6
> >
> >
> >
> > There's also many of these threads locked on this line in
> > taskprocessor.c:202
> >
> > Thread 119 (Thread 0x7fb0443b8700 (LWP 11099)):
> > #0  0x00007fb0d292ca00 in sem_wait () from /lib64/libpthread.so.0
> > #1  0x000000000058eab0 in ast_sem_wait (data=0x4ce9c60) at
> > /usr/src/debug/asterisk-16.20.0/include/asterisk/sem.h:61
> > #2  default_tps_processing_function (data=0x4ce9c60) at taskprocessor.c:202
> > #3  0x00000000005a24b7 in dummy_start (data=<value optimized out>) at
> > utils.c:1428
> > #4  0x00007fb0d2926aa1 in start_thread () from /lib64/libpthread.so.0
> > #5  0x00007fb0d1d02c4d in clone () from /lib64/libc.so.6
> >
> > Thread 118 (Thread 0x7fb0507ca700 (LWP 27278)):
> > #0  0x00007fb0d292d334 in __lll_lock_wait () from /lib64/libpthread.so.0
> > #1  0x00007fb0d29285f3 in _L_lock_892 () from /lib64/libpthread.so.0
> > #2  0x00007fb0d29284d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
> > #3  0x000000000051cbb5 in ast_rdlock_contexts (c=0x0, con=0x0,
> > context=0x7fb02a3f405e "from-sip-external", exten=0x7fb0016480e0
> > "16121111111", priority=1, label=0x0, callerid=0x0, action=E_MATCH,
> > found=0x0, combined_find_spawn=0) at pbx.c:8509
> > #4  pbx_extension_helper (c=0x0, con=0x0, context=0x7fb02a3f405e
> > "from-sip-external", exten=0x7fb0016480e0 "16121111111", priority=1,
> > label=0x0, callerid=0x0, action=E_MATCH, found=0x0, combined_find_spawn=0)
> > at pbx.c:2886
> > #5  0x000000000051d504 in ast_exists_extension (c=<value optimized out>,
> > context=<value optimized out>, exten=<value optimized out>, priority=<value
> > optimized out>, callerid=<value optimized out>) at pbx.c:4181
> > #6  0x00007fb05ea8766e in get_destination (session=0x7fb0016480e0,
> > rdata=<value optimized out>) at res_pjsip_session.c:3703
> > #7  0x00007fb05ea9a1ec in new_invite (rdata=0x7faffc086e78) at
> > res_pjsip_session.c:3939
> > #8  handle_new_invite_request (rdata=0x7faffc086e78) at
> > res_pjsip_session.c:4114
> > #9  session_on_rx_request (rdata=0x7faffc086e78) at
> > res_pjsip_session.c:4189
> > #10 0x00007fb0d471184e in pjsip_endpt_process_rx_data (endpt=<value
> > optimized out>, rdata=0x7faffc086e78, p=0x7fb0905fc500,
> > p_handled=0x7fb0507c9c7c) at ../src/pjsip/sip_endpoint.c:930
> > #11 0x00007fb0903d1dcd in distribute (data=0x7faffc086e78) at
> > res_pjsip/pjsip_distributor.c:955
> > #12 0x000000000058e046 in ast_taskprocessor_execute (tps=0x4388e10) at
> > taskprocessor.c:1237
> > #13 0x00000000005963e0 in execute_tasks (data=0x4388e10) at
> > threadpool.c:1354
> > #14 0x000000000058e046 in ast_taskprocessor_execute (tps=0x1274b70) at
> > taskprocessor.c:1237
> > #15 0x00000000005980a1 in threadpool_execute (arg=0x7fb01c001af0) at
> > threadpool.c:367
> > #16 worker_active (arg=0x7fb01c001af0) at threadpool.c:1137
> > #17 worker_start (arg=0x7fb01c001af0) at threadpool.c:1056
> > #18 0x00000000005a24b7 in dummy_start (data=<value optimized out>) at
> > utils.c:1428
> > #19 0x00007fb0d2926aa1 in start_thread () from /lib64/libpthread.so.0
> > #20 0x00007fb0d1d02c4d in clone () from /lib64/libc.so.6
> >
> >
> > Thread 107 (Thread 0x7faf4dd38700 (LWP 22760)):
> > #0  0x00007fb0d292d334 in __lll_lock_wait () from /lib64/libpthread.so.0
> > #1  0x00007fb0d29285f3 in _L_lock_892 () from /lib64/libpthread.so.0
> > #2  0x00007fb0d29284d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
> > #3  0x000000000051cbb5 in ast_rdlock_contexts (c=0x7fb008118cb0, con=0x0,
> > context=0x7fb008119670 "call_to_user_core", exten=0x652011 "h", priority=1,
> > label=0x0, callerid=0x7fafd866b440 "612xxxxxxx", action=E_MATCH, found=0x0,
> > combined_find_spawn=0) at pbx.c:8509
> > #4  pbx_extension_helper (c=0x7fb008118cb0, con=0x0,
> > context=0x7fb008119670 "call_to_user_core", exten=0x652011 "h", priority=1,
> > label=0x0, callerid=0x7fafd866b440 "612xxxxxxx", action=E_MATCH, found=0x0,
> > combined_find_spawn=0) at pbx.c:2886
> > #5  0x0000000000527228 in ast_exists_extension (c=0x7fb008118cb0,
> > args=0x0) at pbx.c:4181
> > #6  __ast_pbx_run (c=0x7fb008118cb0, args=0x0) at pbx.c:4595
> > #7  0x000000000052752b in pbx_thread (data=<value optimized out>) at
> > pbx.c:4704
> > #8  0x00000000005a24b7 in dummy_start (data=<value optimized out>) at
> > utils.c:1428
> > #9  0x00007fb0d2926aa1 in start_thread () from /lib64/libpthread.so.0
> > #10 0x00007fb0d1d02c4d in clone () from /lib64/libc.so.6
> >
> > Thread 105 (Thread 0x7faf4e09c700 (LWP 663)):
> > #0  0x00007fb0d292d334 in __lll_lock_wait () from /lib64/libpthread.so.0
> > #1  0x00007fb0d29285f3 in _L_lock_892 () from /lib64/libpthread.so.0
> > #2  0x00007fb0d29284d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
> > #3  0x000000000051cbb5 in ast_rdlock_contexts (c=0x7fb0bfeadc50, con=0x0,
> > context=0x7fb0bfeae610 "proxy-dial", exten=0x652011 "h", priority=1,
> > label=0x0, callerid=0x7faff82d65d0 "612xxxxxxx", action=E_MATCH, found=0x0,
> > combined_find_spawn=0) at pbx.c:8509
> > #4  pbx_extension_helper (c=0x7fb0bfeadc50, con=0x0,
> > context=0x7fb0bfeae610 "proxy-dial", exten=0x652011 "h", priority=1,
> > label=0x0, callerid=0x7faff82d65d0 "612xxxxxxx", action=E_MATCH, found=0x0,
> > combined_find_spawn=0) at pbx.c:2886
> > #5  0x0000000000527228 in ast_exists_extension (c=0x7fb0bfeadc50,
> > args=0x0) at pbx.c:4181
> > #6  __ast_pbx_run (c=0x7fb0bfeadc50, args=0x0) at pbx.c:4595
> > #7  0x000000000052752b in pbx_thread (data=<value optimized out>) at
> > pbx.c:4704
> > #8  0x00000000005a24b7 in dummy_start (data=<value optimized out>) at
> > utils.c:1428
> > #9  0x00007fb0d2926aa1 in start_thread () from /lib64/libpthread.so.0
> > #10 0x00007fb0d1d02c4d in clone () from /lib64/libc.so.6
> >
> > Thread 103 (Thread 0x7faf4e020700 (LWP 25155)):
> > #0  0x00007fb0d292d334 in __lll_lock_wait () from /lib64/libpthread.so.0
> > #1  0x00007fb0d29285f3 in _L_lock_892 () from /lib64/libpthread.so.0
> > #2  0x00007fb0d29284d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
> > #3  0x000000000051cbb5 in ast_rdlock_contexts (c=0x0, con=0x0,
> > context=0x7faf0743c47e "from-internal-phone", exten=0x7fb01800c8c0
> > "8744405", priority=1, label=0x0, callerid=0x0, action=E_MATCH, found=0x0,
> > combined_find_spawn=0) at pbx.c:8509
> > #4  pbx_extension_helper (c=0x0, con=0x0, context=0x7faf0743c47e
> > "from-internal-phone", exten=0x7fb01800c8c0 "8744405", priority=1,
> > label=0x0, callerid=0x0, action=E_MATCH, found=0x0, combined_find_spawn=0)
> > at pbx.c:2886
> > #5  0x000000000051d504 in ast_exists_extension (c=<value optimized out>,
> > context=<value optimized out>, exten=<value optimized out>, priority=<value
> > optimized out>, callerid=<value optimized out>) at pbx.c:4181
> > #6  0x00007fb05ea8766e in get_destination (session=0x7fb01800c8c0,
> > rdata=<value optimized out>) at res_pjsip_session.c:3703
> > #7  0x00007fb05ea9a1ec in new_invite (rdata=0x7faffc135f28) at
> > res_pjsip_session.c:3939
> > #8  handle_new_invite_request (rdata=0x7faffc135f28) at
> > res_pjsip_session.c:4114
> > #9  session_on_rx_request (rdata=0x7faffc135f28) at
> > res_pjsip_session.c:4189
> > #10 0x00007fb0d471184e in pjsip_endpt_process_rx_data (endpt=<value
> > optimized out>, rdata=0x7faffc135f28, p=0x7fb0905fc500,
> > p_handled=0x7faf4e01fc7c) at ../src/pjsip/sip_endpoint.c:930
> > #11 0x00007fb0903d1dcd in distribute (data=0x7faffc135f28) at
> > res_pjsip/pjsip_distributor.c:955
> > #12 0x000000000058e046 in ast_taskprocessor_execute (tps=0x37b1530) at
> > taskprocessor.c:1237
> > #13 0x00000000005963e0 in execute_tasks (data=0x37b1530) at
> > threadpool.c:1354
> > #14 0x000000000058e046 in ast_taskprocessor_execute (tps=0x1274b70) at
> > taskprocessor.c:1237
> > #15 0x00000000005980a1 in threadpool_execute (arg=0x7fb01c050b60) at
> > threadpool.c:367
> > #16 worker_active (arg=0x7fb01c050b60) at threadpool.c:1137
> > #17 worker_start (arg=0x7fb01c050b60) at threadpool.c:1056
> > #18 0x00000000005a24b7 in dummy_start (data=<value optimized out>) at
> > utils.c:1428
> > #19 0x00007fb0d2926aa1 in start_thread () from /lib64/libpthread.so.0
> > #20 0x00007fb0d1d02c4d in clone () from /lib64/libc.so.6
> >
> > Thread 101 (Thread 0x7fb05074e700 (LWP 26565)):
> > #0  0x00007fb0d292d334 in __lll_lock_wait () from /lib64/libpthread.so.0
> > #1  0x00007fb0d29285f3 in _L_lock_892 () from /lib64/libpthread.so.0
> > #2  0x00007fb0d29284d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
> > #3  0x000000000051cbb5 in ast_rdlock_contexts (c=0x7fb00813ece0, con=0x0,
> > context=0x7fb00813f6a0 "call_to_user_core", exten=0x652011 "h", priority=1,
> > label=0x0, callerid=0x7fb0ac16dd10 "952xxxxxxx", action=E_MATCH, found=0x0,
> > combined_find_spawn=0) at pbx.c:8509
> > #4  pbx_extension_helper (c=0x7fb00813ece0, con=0x0,
> > context=0x7fb00813f6a0 "call_to_user_core", exten=0x652011 "h", priority=1,
> > label=0x0, callerid=0x7fb0ac16dd10 "952xxxxxxx", action=E_MATCH, found=0x0,
> > combined_find_spawn=0) at pbx.c:2886
> > #5  0x0000000000527228 in ast_exists_extension (c=0x7fb00813ece0,
> > args=0x0) at pbx.c:4181
> > #6  __ast_pbx_run (c=0x7fb00813ece0, args=0x0) at pbx.c:4595
> > #7  0x000000000052752b in pbx_thread (data=<value optimized out>) at
> > pbx.c:4704
> > #8  0x00000000005a24b7 in dummy_start (data=<value optimized out>) at
> > utils.c:1428
> > #9  0x00007fb0d2926aa1 in start_thread () from /lib64/libpthread.so.0
> > #10 0x00007fb0d1d02c4d in clone () from /lib64/libc.so.6
> >
> >
> > Thread 95 (Thread 0x7faf4edb0700 (LWP 24067)):
> > #0  0x00007fb0d292d334 in __lll_lock_wait () from /lib64/libpthread.so.0
> > #1  0x00007fb0d29285f3 in _L_lock_892 () from /lib64/libpthread.so.0
> > #2  0x00007fb0d29284d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
> > #3  0x00000000005242bd in ast_rdlock_contexts (c=<value optimized out>,
> > context=<value optimized out>, exten=<value optimized out>,
> > device_state_info=0x0) at pbx.c:8509
> > #4  ast_hint_extension (c=<value optimized out>, context=<value optimized
> > out>, exten=<value optimized out>, device_state_info=0x0) at pbx.c:3000
> > #5  internal_extension_state_extended (c=<value optimized out>,
> > context=<value optimized out>, exten=<value optimized out>,
> > device_state_info=0x0) at pbx.c:3146
> > #6  0x000000000060486b in action_extensionstate (s=0x7faf4edafcd0,
> > m=0x7faf4edaf3e0) at manager.c:5988
> > #7  0x0000000000606af0 in process_message (s=0x7faf4edafcd0,
> > m=0x7faf4edaf3e0) at manager.c:6689
> > #8  0x0000000000609dcb in do_message (data=0x7fb034011680) at
> > manager.c:6902
> > #9  session_do (data=0x7fb034011680) at manager.c:7007
> > #10 0x00000000005a24b7 in dummy_start (data=<value optimized out>) at
> > utils.c:1428
> > #11 0x00007fb0d2926aa1 in start_thread () from /lib64/libpthread.so.0
> > #12 0x00007fb0d1d02c4d in clone () from /lib64/libc.so.6
> >
> > Thread 90 (Thread 0x7fb0d0657700 (LWP 27240)):
> > #0  0x00007fb0d292d334 in __lll_lock_wait () from /lib64/libpthread.so.0
> > #1  0x00007fb0d29285f3 in _L_lock_892 () from /lib64/libpthread.so.0
> > #2  0x00007fb0d29284d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
> > #3  0x000000000051cbb5 in ast_rdlock_contexts (c=0x0, con=0x0,
> > context=0x7fb02a3f153e "from-sip-external", exten=0x7fb00814e570
> > "1612xxxxxxx", priority=1, label=0x0, callerid=0x0, action=E_MATCH,
> > found=0x0, combined_find_spawn=0) at pbx.c:8509
> > #4  pbx_extension_helper (c=0x0, con=0x0, context=0x7fb02a3f153e
> > "from-sip-external", exten=0x7fb00814e570 "1612xxxxxxx", priority=1,
> > label=0x0, callerid=0x0, action=E_MATCH, found=0x0, combined_find_spawn=0)
> > at pbx.c:2886
> > #5  0x000000000051d504 in ast_exists_extension (c=<value optimized out>,
> > context=<value optimized out>, exten=<value optimized out>, priority=<value
> > optimized out>, callerid=<value optimized out>) at pbx.c:4181
> > #6  0x00007fb05ea8766e in get_destination (session=0x7fb00814e570,
> > rdata=<value optimized out>) at res_pjsip_session.c:3703
> > #7  0x00007fb05ea9a1ec in new_invite (rdata=0x7faffc023458) at
> > res_pjsip_session.c:3939
> > #8  handle_new_invite_request (rdata=0x7faffc023458) at
> > res_pjsip_session.c:4114
> > #9  session_on_rx_request (rdata=0x7faffc023458) at
> > res_pjsip_session.c:4189
> > #10 0x00007fb0d471184e in pjsip_endpt_process_rx_data (endpt=<value
> > optimized out>, rdata=0x7faffc023458, p=0x7fb0905fc500,
> > p_handled=0x7fb0d0656c7c) at ../src/pjsip/sip_endpoint.c:930
> > #11 0x00007fb0903d1dcd in distribute (data=0x7faffc023458) at
> > res_pjsip/pjsip_distributor.c:955
> > #12 0x000000000058e046 in ast_taskprocessor_execute (tps=0x5203c50) at
> > taskprocessor.c:1237
> > #13 0x00000000005963e0 in execute_tasks (data=0x5203c50) at
> > threadpool.c:1354
> > #14 0x000000000058e046 in ast_taskprocessor_execute (tps=0x1274b70) at
> > taskprocessor.c:1237
> > #15 0x00000000005980a1 in threadpool_execute (arg=0x7fb01c0073c0) at
> > threadpool.c:367
> > #16 worker_active (arg=0x7fb01c0073c0) at threadpool.c:1137
> > #17 worker_start (arg=0x7fb01c0073c0) at threadpool.c:1056
> > #18 0x00000000005a24b7 in dummy_start (data=<value optimized out>) at
> > utils.c:1428
> > #19 0x00007fb0d2926aa1 in start_thread () from /lib64/libpthread.so.0
> > #20 0x00007fb0d1d02c4d in clone () from /lib64/libc.so.6
> >
> >
> > Thread 61 (Thread 0x7faf4dcbc700 (LWP 13778)):
> > #0  0x00007fb0d292d334 in __lll_lock_wait () from /lib64/libpthread.so.0
> > #1  0x00007fb0d29285f3 in _L_lock_892 () from /lib64/libpthread.so.0
> > #2  0x00007fb0d29284d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
> > #3  0x000000000045cb16 in __ao2_lock (user_data=<value optimized out>,
> > lock_how=<value optimized out>, file=0x647703 "pbx.c", func=0x64b031
> > "ast_add_hint", line=4026, var=<value optimized out>) at astobj2.c:241
> > #4  0x00000000005167b9 in ast_add_hint (e=0x7faf6401dff0) at pbx.c:4026
> > #5  0x0000000000521c53 in ast_add_extension2_lockopt (con=0x7faf6401dee0,
> > replace=1, extension=<value optimized out>, priority=-1, label=<value
> > optimized out>, callerid=<value optimized out>, application=<value
> > optimized out>, data=0x7faf640017b0, datad=0x45bdd0 <ast_free_ptr>,
> > registrar=0x7fb09327b9b8 "pbx_config", registrar_file=0x0,
> > registrar_line=0, lock_context=1) at pbx.c:7572
> > #6  0x00000000005235e5 in ast_add_extension2 (context=<value optimized
> > out>, replace=1, extension=0x7faf4dcbab20 "zzzzzzz-user1", priority=-1,
> > label=0x0, callerid=0x0, application=0x7faf4dcbab36
> > "PJSIP/zzzzzzz&Custom:unavailable", data=0x7faf640017b0, datad=0x45bdd0
> > <ast_free_ptr>, registrar=0x7fb09327b9b8 "pbx_config") at pbx.c:7304
> > #7  ast_add_extension (context=<value optimized out>, replace=1,
> > extension=0x7faf4dcbab20 "zzzzzzz-user1", priority=-1, label=0x0,
> > callerid=0x0, application=0x7faf4dcbab36
> > "PJSIP/zzzzzzz&Custom:unavailable", data=0x7faf640017b0, datad=0x45bdd0
> > <ast_free_ptr>, registrar=0x7fb09327b9b8 "pbx_config") at pbx.c:6979
> > #8  0x00007fb093277204 in handle_cli_dialplan_add_extension (e=<value
> > optimized out>, cmd=<value optimized out>, a=0x7faf4dcbae90) at
> > pbx_config.c:1156
> > #9  0x00000000004ba668 in ast_cli_command_full (uid=-1, gid=-1, fd=107,
> > s=0x7faf6401dca9 "dialplan add extension
> > zzzzzzz-user1,hint,PJSIP/zzzzzzz&Custom:unavailable into blf-zzzzzzz
> > replace") at cli.c:2995
> > #10 0x000000000060213b in action_command (s=0x7faf4dcbbcd0,
> > m=0x7faf4dcbb3e0) at manager.c:5318
> > #11 0x0000000000606af0 in process_message (s=0x7faf4dcbbcd0,
> > m=0x7faf4dcbb3e0) at manager.c:6689
> > #12 0x0000000000609dcb in do_message (data=0x7fb034007790) at
> > manager.c:6902
> > #13 session_do (data=0x7fb034007790) at manager.c:7007
> > #14 0x00000000005a24b7 in dummy_start (data=<value optimized out>) at
> > utils.c:1428
> > #15 0x00007fb0d2926aa1 in start_thread () from /lib64/libpthread.so.0
> > #16 0x00007fb0d1d02c4d in clone () from /lib64/libc.so.6
> >
> {noformat}
> > Here, ast_add_extension calls find_context_locked, which locks 'conlock'
> (in ast_rdlock_contexts). ast_add_extension then calls ast_add_extension2,
> which calls ast_add_extension2_lockopt, which calls ast_add_hint.
> ast_add_hint then tries to lock the hints container.
> So thread 61 holds a read lock on the contexts list, i.e. 'conlock', which
> is blocking most of the other threads (any requesting ast_rdlock_context).
> It also is requesting a lock on the hints container, so it too is being
> blocked by the 'unknown thread id'.
> More below at the end of the reply ...
> {noformat}
> > Thread 12 (Thread 0x7fb093fff700 (LWP 7836)):
> > #0  0x00007fb0d292d334 in __lll_lock_wait () from /lib64/libpthread.so.0
> > #1  0x00007fb0d29285f3 in _L_lock_892 () from /lib64/libpthread.so.0
> > #2  0x00007fb0d29284d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
> > #3  0x000000000051dd53 in ast_rdlock_contexts (hint=0x7fb093ffe7b0 "",
> > hintsize=512, name=0x0, namesize=0, c=0x0, context=0x7fb093ffeb30 "blf",
> > exten=0x7fb093ffeae0 "yyyyyyy-user1") at pbx.c:8509
> > #4  ast_hint_extension (hint=0x7fb093ffe7b0 "", hintsize=512, name=0x0,
> > namesize=0, c=0x0, context=0x7fb093ffeb30 "blf", exten=0x7fb093ffeae0
> > "yyyyyyy-user1") at pbx.c:3000
> > #5  ast_get_hint (hint=0x7fb093ffe7b0 "", hintsize=512, name=0x0,
> > namesize=0, c=0x0, context=0x7fb093ffeb30 "blf", exten=0x7fb093ffeae0
> > "yyyyyyy-user1") at pbx.c:4143
> > #6  0x00000000005fa566 in manager_state_cb (context=0x7fb093ffeb30 "blf",
> > exten=0x7fb093ffeae0 "yyyyyyy-user1", info=0x7fb093ffea10, data=<value
> > optimized out>) at manager.c:7299
> > #7  0x00000000005137b0 in execute_state_callback (cb=0x5fa500
> > <manager_state_cb>, context=0x7fb093ffeb30 "blf", exten=0x7fb093ffeae0
> > "yyyyyyy-user1", data=0x0, reason=<value optimized out>,
> > hint=0x7faf301e6670, device_state_info=0x0) at pbx.c:3284
> > #8  0x0000000000515834 in device_state_notify_callbacks
> > (hint=0x7faf301e6670, hint_app=<value optimized out>) at pbx.c:3427
> > #9  0x0000000000523801 in device_state_cb (unused=<value optimized out>,
> > sub=<value optimized out>, msg=<value optimized out>) at pbx.c:3659
> > #10 0x0000000000573250 in dispatch_exec_async (local=<value optimized
> > out>) at stasis.c:1266
> > #11 0x000000000058e0ee in ast_taskprocessor_execute (tps=0x1776c10) at
> > taskprocessor.c:1235
> > #12 0x000000000058eac8 in default_tps_processing_function (data=0x1757ce0)
> > at taskprocessor.c:209
> > #13 0x00000000005a24b7 in dummy_start (data=<value optimized out>) at
> > utils.c:1428
> > #14 0x00007fb0d2926aa1 in start_thread () from /lib64/libpthread.so.0
> > #15 0x00007fb0d1d02c4d in clone () from /lib64/libc.so.6
> >
> {noformat}
> This appears to be a "classic" deadlock. The 'unknown thread id' is holding
> the hints lock and is being blocked by the contexts lock ('conlock').
> Thread 61 is holding the contexts lock ('conlock') and is being blocked by
> the hints lock.
> You did not explicitly state what version of Asterisk 16, but it appears
> you are running Asterisk 16.20. While not the latest release I have tracked
> this in the latest code, so believe it is still an existing problem. Please
> file an issue for this on the issue tracker: https://issues.asterisk.org/.
> Please include your description here, and attach any files you have from
> running the ast_coredumper script.



--
This message was sent by Atlassian JIRA
(v6.2#6252)



More information about the asterisk-bugs mailing list