[asterisk-bugs] [JIRA] (ASTERISK-28470) Mutex deadlock in audio_audiohook_write_list

Andre Heber (JIRA) noreply at issues.asterisk.org
Thu Jul 4 05:16:47 CDT 2019


    [ https://issues.asterisk.org/jira/browse/ASTERISK-28470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=247531#comment-247531 ] 

Andre Heber commented on ASTERISK-28470:
----------------------------------------

Our problem:
Asterisk sometimes stopped accepting new calls (INVITEs). Through logging, we saw that the do_monitor thread of chan_sip was blocked.
The endless for-loop in do_monitor should complete an iteration about once per second, so we made the loop record a timestamp at the end of each iteration. The timestamp should therefore never be much older than a second.
We also created a parallel thread that checks the timestamp every second and sends SIGABRT to the do_monitor thread if it has been hanging for 3.9 seconds or longer.
Asterisk then crashes and we get a core dump.
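
A minimal sketch of such a watchdog, assuming POSIX threads and C11 atomics. All names (heartbeat, is_stale, watchdog, monitored_thread) are hypothetical and not taken from chan_sip; only the 1 second poll interval and the 3.9 second threshold come from the description above:

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>

#define STALE_LIMIT_MS 3900  /* abort threshold: 3.9 seconds */

/* Hypothetical globals; the caller sets monitored_thread before starting the watchdog. */
static _Atomic int64_t last_beat_ms;
static pthread_t monitored_thread;

/* Current monotonic time in milliseconds. */
int64_t now_ms(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	(void)rc;
	return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Called at the end of every iteration of the monitored loop. */
void heartbeat(void)
{
	atomic_store(&last_beat_ms, now_ms());
}

/* Pure staleness check, kept separate so it can be tested without threads. */
int is_stale(int64_t last_ms, int64_t current_ms, int64_t limit_ms)
{
	return current_ms - last_ms >= limit_ms;
}

/* Watchdog thread body: poll once per second, signal the monitored thread on a stall. */
void *watchdog(void *arg)
{
	(void)arg;
	for (;;) {
		sleep(1);
		if (is_stale(atomic_load(&last_beat_ms), now_ms(), STALE_LIMIT_MS)) {
			/* With the default SIGABRT disposition the process terminates and dumps core. */
			pthread_kill(monitored_thread, SIGABRT);
		}
	}
	return NULL;
}
```

Sending the signal to the hung thread rather than calling abort() from the watchdog means the core dump shows the hung thread as the one that received the signal, which makes it easy to find in gdb.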

Here is my analysis with gdb:
[stack_trace_1]
[mutex_1]

The owner field of the mutex shows which thread holds the lock (thread LWP 39260).

[info_threads]
[stack_trace_2]
[mutex_2]

The __kind of the audiohook->lock mutex is 0, which means it is non-recursive, so the thread deadlocks on a lock it already holds.
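
The difference between the two mutex kinds can be sketched with plain pthreads. In glibc, __kind 0 corresponds to PTHREAD_MUTEX_NORMAL and 1 to PTHREAD_MUTEX_RECURSIVE. The helper relock_result below is hypothetical and not an Asterisk function; it uses pthread_mutex_trylock for the second acquisition so that the non-recursive case reports EBUSY instead of hanging the way the thread in the core dump did:

```c
#define _XOPEN_SOURCE 700

#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Lock a fresh mutex of the given type, then try to take it again from the
 * same thread. Returns the result of the second attempt.
 */
int relock_result(int type)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t m;
	int rc;

	rc = pthread_mutexattr_init(&attr);
	assert(rc == 0);
	rc = pthread_mutexattr_settype(&attr, type);
	assert(rc == 0);
	rc = pthread_mutex_init(&m, &attr);
	assert(rc == 0);
	pthread_mutexattr_destroy(&attr);

	pthread_mutex_lock(&m);

	/* Second acquisition by the owner: succeeds only for a recursive mutex. */
	rc = pthread_mutex_trylock(&m);
	if (rc == 0) {
		pthread_mutex_unlock(&m); /* undo the nested acquisition */
	}

	pthread_mutex_unlock(&m);
	pthread_mutex_destroy(&m);
	return rc;
}
```

Calling relock_result(PTHREAD_MUTEX_NORMAL) yields EBUSY, while relock_result(PTHREAD_MUTEX_RECURSIVE) yields 0. With a blocking pthread_mutex_lock in place of the trylock, the normal mutex would hang the calling thread.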

However, we use an old Asterisk (version 13.1.2), and an update is planned. I saw the same code in audio_audiohook_write_list in the current version, so I decided to create an issue.
But as you stated, this should be a recursive mutex, created in ast_audiohook_init through ast_mutex_init.

So, maybe you know whether this is fixed in the current Asterisk version 16.3.0, or have an idea what is going on.

Otherwise, I think we can close this issue.


> Mutex deadlock in audio_audiohook_write_list
> --------------------------------------------
>
>                 Key: ASTERISK-28470
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-28470
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: General
>    Affects Versions: 16.3.0
>         Environment: CentOS 6.10
>            Reporter: Andre Heber
>            Assignee: Unassigned
>         Attachments: info_threads.png, mutex_1.png, mutex_2.png, stack_trace_1.png, stack_trace_2.png
>
>
> The function "audio_audiohook_write_list" in main/audiohook.c contains the following code:
> {code:java}
> 		ast_audiohook_lock(audiohook);
> 		if (audiohook->status != AST_AUDIOHOOK_STATUS_RUNNING) {
> 			AST_LIST_REMOVE_CURRENT(list);
> 			removed = 1;
> 			ast_audiohook_update_status(audiohook, AST_AUDIOHOOK_STATUS_DONE);
> 			ast_audiohook_unlock(audiohook);
> {code}
> But "ast_audiohook_update_status" also locks "audiohook", so when "audiohook->status != AST_AUDIOHOOK_STATUS_RUNNING" the thread tries to take a lock it already holds and freezes.
> This pattern occurs three times in "audio_audiohook_write_list" and once in "dtmf_audiohook_write_list".



--
This message was sent by Atlassian JIRA
(v6.2#6252)


