[asterisk-bugs] [JIRA] (ASTERISK-26019) Deadlock after 'core reload'

Kevin Harwell (JIRA) noreply at issues.asterisk.org
Thu May 12 13:30:56 CDT 2016


    [ https://issues.asterisk.org/jira/browse/ASTERISK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=230589#comment-230589 ] 

Kevin Harwell commented on ASTERISK-26019:
------------------------------------------

Looking at the backtrace this appears to be a deadlock down in the pbx core.

Taking a look at Thread 13 we can see that it is waiting on the "hints" lock:
{noformat}
Thread 13 (Thread 0x7f0c7c526700 (LWP 8149)):
#0  0x00007f0d34646334 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f0d346415f3 in _L_lock_892 () from /lib64/libpthread.so.0
#2  0x00007f0d346414d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000538379 in __ast_pthread_mutex_lock (filename=0x6524cb "pbx.c", lineno=3879, func=0x655f20 "ast_add_hint", mutex_name=0x653135 "hints", t=0x1d9add0) at lock.c:315
#4  0x000000000045c5d8 in __ao2_lock (user_data=0x1d9ae28, lock_how=AO2_LOCK_REQ_MUTEX, file=0x6524cb "pbx.c", func=0x655f20 "ast_add_hint", line=3879, var=0x653135 "hints") at astobj2.c:167
#5  0x00000000005742c4 in ast_add_hint (e=0x7f0d148da190) at pbx.c:3879
#6  0x000000000057e87e in ast_add_extension2_lockopt (con=0x7f0d141e2780, replace=0, extension=0x7f0d15ac8a07 "*992*10*2005", priority=-1, label=0x0, callerid=0x0, application=0x7f0c7c523b10 "Queue:700", data=0x7f0d1551dac0, datad=0x431360 <free at plt>, registrar=0x7f0d31c55470 "pbx_config", lock_context=1) at pbx.c:7268
#7  0x000000000057df66 in ast_add_extension2 (con=0x7f0d141e2780, replace=0, extension=0x7f0d15ac8a07 "*992*10*2005", priority=-1, label=0x0, callerid=0x0, application=0x7f0d150cecd5 "${DB(restapps/hints/queue/${EXTEN:8})}", data=0x7f0d1551dac0, datad=0x431360 <free at plt>, registrar=0x7f0d31c55470 "pbx_config") at pbx.c:7045
#8  0x000000000057d29e in ast_add_extension (context=0x7f0d141e2850 "restapps-device-hints", replace=0, extension=0x7f0d15ac8a07 "*992*10*2005", priority=-1, label=0x0, callerid=0x0, application=0x7f0d150cecd5 "${DB(restapps/hints/queue/${EXTEN:8})}", data=0x7f0d1551dac0, datad=0x431360 <free at plt>, registrar=0x7f0d31c55470 "pbx_config") at pbx.c:6742
#9  0x000000000057202b in internal_extension_state_extended (c=0x0, context=0x7f0d166fd5a9 "restapps-device-hints", exten=0x7f0d15ac8a07 "*992*10*2005", device_state_info=0x0) at pbx.c:3039
#10 0x00000000005720a1 in ast_extension_state (c=0x0, context=0x7f0d166fd5a9 "restapps-device-hints", exten=0x7f0d15ac8a07 "*992*10*2005") at pbx.c:3054
#11 0x000000000054e0ae in action_extensionstate (s=0x7f0c7c525a60, m=0x7f0c7c525550) at manager.c:5509
#12 0x000000000054ffdd in process_message (s=0x7f0c7c525a60, m=0x7f0c7c525550) at manager.c:6174
#13 0x0000000000550913 in do_message (s=0x7f0c7c525a60) at manager.c:6387
#14 0x0000000000550e0f in session_do (data=0x7f0cd4003058) at manager.c:6508
#15 0x00000000005e8cfd in handle_tcptls_connection (data=0x7f0cd4003058) at tcptls.c:695
#16 0x00000000005fb855 in dummy_start (data=0x7f0cd4001950) at utils.c:1235
#17 0x00007f0d3463faa1 in start_thread () from /lib64/libpthread.so.0
#18 0x00007f0d339c793d in clone () from /lib64/libc.so.6
{noformat}
Further at frame #6 we can see that prior to attempting to get the hints lock this thread has already grabbed the context lock:
{noformat}
#6  0x000000000057e87e in ast_add_extension2_lockopt (con=0x7f0d141e2780, replace=0, extension=0x7f0d15ac8a07 "*992*10*2005", priority=-1, label=0x0, callerid=0x0, application=0x7f0c7c523b10 "Queue:700", data=0x7f0d1551dac0, datad=0x431360 <free at plt>, registrar=0x7f0d31c55470 "pbx_config", lock_context=1) at pbx.c:7268
{noformat}
Now check out Thread 11. It is attempting to acquire the contexts lock:
{noformat}
Thread 11 (Thread 0x7f0cbf181700 (LWP 8199)):
#0  0x00007f0d34646334 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f0d346415f3 in _L_lock_892 () from /lib64/libpthread.so.0
#2  0x00007f0d346414d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000538379 in __ast_pthread_mutex_lock (filename=0x6524cb "pbx.c", lineno=8204, func=0x6559d0 "ast_rdlock_contexts", mutex_name=0x655828 "&conlock", t=0x8c6d00) at lock.c:315
#4  0x0000000000581454 in ast_rdlock_contexts () at pbx.c:8204
#5  0x0000000000571af9 in ast_hint_extension (c=0x0, context=0x7f0d14131960 "ext-meetme", exten=0x7f0d159cd488 "82001") at pbx.c:2887
#6  0x000000000057493b in ast_get_hint (hint=0x7f0cbf17c4b0 "\260\067\024\027\r\177", hintsize=512, name=0x0, namesize=0, c=0x0, context=0x7f0d14131960 "ext-meetme", exten=0x7f0d159cd488 "82001") at pbx.c:3997
#7  0x0000000000551bc5 in manager_state_cb (context=0x7f0d14131960 "ext-meetme", exten=0x7f0d159cd488 "82001", info=0x7f0cbf17c700, data=0x0) at manager.c:6747
#8  0x00000000005725e7 in execute_state_callback (cb=0x551b62 <manager_state_cb>, context=0x7f0d14131960 "ext-meetme", exten=0x7f0d159cd488 "82001", data=0x0, reason=AST_HINT_UPDATE_DEVICE, hint=0x7f0d16833e28, device_state_info=0x0) at pbx.c:3165
#9  0x00000000005745ba in ast_add_hint (e=0x7f0d159cd410) at pbx.c:3910
#10 0x000000000057dee7 in add_priority (con=0x7f0d14131890, tmp=0x7f0d159cd410, el=0x0, e=0x7f0d15b30e10, replace=0) at pbx.c:7009
#11 0x000000000057e598 in ast_add_extension2_lockopt (con=0x7f0d14131890, replace=0, extension=0x7f0cbf17df20 "82001", priority=-1, label=0x0, callerid=0x0, application=0x7f0d1565e6fb "confbridge:82001", data=0x7f0d15605ba0, datad=0x431360 <free at plt>, registrar=0x7f0d31c55470 "pbx_config", lock_context=1) at pbx.c:7184
#12 0x000000000057df66 in ast_add_extension2 (con=0x7f0d14131890, replace=0, extension=0x7f0cbf17df20 "82001", priority=-1, label=0x0, callerid=0x0, application=0x7f0d1565e6fb "confbridge:82001", data=0x7f0d15605ba0, datad=0x431360 <free at plt>, registrar=0x7f0d31c55470 "pbx_config") at pbx.c:7045
#13 0x00007f0d31c54288 in pbx_load_config (config_file=0x7f0d31c55460 "extensions.conf") at pbx_config.c:1841
#14 0x00007f0d31c551e4 in pbx_load_module () at pbx_config.c:2045
#15 0x00007f0d31c53254 in handle_cli_dialplan_reload (e=0x7f0d31e58c18, cmd=-4, a=0x7f0cbf1802f0) at pbx_config.c:1565
#16 0x00000000004d8835 in ast_cli_command_full (uid=499, gid=498, fd=39, s=0x7f0cbf1805e0 "dialplan reload") at cli.c:2757
#17 0x00000000004d8999 in ast_cli_command_multiple_full (uid=499, gid=498, fd=39, size=16, s=0x7f0cbf180a4f "dialplan reload") at cli.c:2783
#18 0x0000000000453134 in netconsole (vconsole=0x8e5ea0) at asterisk.c:1532
#19 0x00000000005fb855 in dummy_start (data=0x7f0d20003dc0) at utils.c:1235
#20 0x00007f0d3463faa1 in start_thread () from /lib64/libpthread.so.0
#21 0x00007f0d339c793d in clone () from /lib64/libc.so.6
{noformat}
However, at frame #9 it grabbed the "hints" lock (it gets the "hints" lock in ast_add_hint).

So Thread 11 holds the "hints" lock and is waiting on the "context" lock, while Thread 13 holds the "context" lock and is waiting on the "hints" lock.

> Deadlock after 'core reload'
> ----------------------------
>
>                 Key: ASTERISK-26019
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-26019
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>    Affects Versions: 13.8.0, 13.8.1, 13.8.2, 13.9.0
>            Reporter: Andrew Nagy
>         Attachments: backtrace-threads-lgaetzdev2-2016-05-12.txt
>
>
> This is a sporadic issue. Which makes it all the more fun.
> Occasionally while running "core reload" Asterisk will just up and stop all transactions. Extensions will unregister and calls will be dropped.
> When trying to run "core reload" again sometimes Asterisk will say "previous reload not finished" but if you try to run "core reload" again after that it just wont say anything.
> The CLI is still active at this point and there's been much debate about how to track down the bug.
> It's only an issue in Asterisk 13. I have attached GDB logs to this ticket for further analysis
> {code}
> lgaetzdev2*CLI> core reload
>     -- The previous reload command didn't finish yet
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)



More information about the asterisk-bugs mailing list