[asterisk-bugs] [JIRA] (ASTERISK-29286) chan_pjsip: Deadlock between sending response and transaction layer

Kevin Harwell (JIRA) noreply at issues.asterisk.org
Fri Feb 12 11:35:15 CST 2021


    [ https://issues.asterisk.org/jira/browse/ASTERISK-29286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=253807#comment-253807 ] 

Kevin Harwell commented on ASTERISK-29286:
------------------------------------------

Definitely a dead lock:
{noformat}
Thread 71 (Thread 0x7f26c8a9e700 (LWP 3053)):
#0  0x00007f278292c1fd in __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f27829250f4 in __GI___pthread_mutex_lock (mutex=mutex at entry=0x55eb9e3942f8) at ../nptl/pthread_mutex_lock.c:115
#2  0x00007f278491e6d4 in pj_mutex_lock (mutex=0x55eb9e3942f8) at ../src/pj/os_core_unix.c:1281
#3  0x00007f2784926301 in grp_lock_acquire (p=0x55eb9e3941e8) at ../src/pj/lock.c:290
#4  0x00007f2784926301 in pj_grp_lock_acquire (grp_lock=0x55eb9e3941e8) at ../src/pj/lock.c:478
#5  0x00007f27848acdf4 in pjsip_tsx_send_msg (tsx=tsx at entry=0x7f2778a7f038, tdata=tdata at entry=0x7f27784d9028) at ../src/pjsip/sip_transaction.c:1788
#6  0x00007f27848af4d2 in pjsip_dlg_send_response (dlg=0x7f26d8733e58, tsx=0x7f2778a7f038, tdata=tdata at entry=0x7f27784d9028) at ../src/pjsip/sip_dialog.c:1594
#7  0x00007f2784870bfc in pjsip_inv_send_msg (inv=0x7f2740160cc8, tdata=tdata at entry=0x7f27784d9028) at ../src/pjsip-ua/sip_inv.c:3284
#8  0x00007f2709cac93c in ast_sip_session_send_response (session=session at entry=0x7f2740165f30, tdata=0x7f27784d9028) at res_pjsip_session.c:2566
#9  0x00007f27045732c4 in indicate (data=0x7f27781afa70) at chan_pjsip.c:1327
#10 0x000055eb99ecdee7 in ast_taskprocessor_execute (tps=tps at entry=0x55eb9d06b680) at taskprocessor.c:1237
#11 0x000055eb99ed44e0 in execute_tasks (data=0x55eb9d06b680) at threadpool.c:1354
#12 0x000055eb99ecdee7 in ast_taskprocessor_execute (tps=0x55eb9b2d0e40) at taskprocessor.c:1237
#13 0x000055eb99ed4fd4 in threadpool_execute (pool=0x55eb9b2d0c70) at threadpool.c:367
#14 0x000055eb99ed4fd4 in worker_active (worker=0x7f26d8737ac0) at threadpool.c:1137
#15 0x000055eb99ed4fd4 in worker_start (arg=arg at entry=0x7f26d8737ac0) at threadpool.c:1056
#16 0x000055eb99edd2ef in dummy_start (data=<optimized out>) at utils.c:1299
#17 0x00007f27829226db in start_thread (arg=0x7f26c8a9e700) at pthread_create.c:463
#18 0x00007f2781e5c71f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
{noformat}
{{pjsip_dlg_send_response}} acquires a lock on the dialog and then tries to acquire the transaction's group lock in {{pjsip_tsx_send_msg}}.
{noformat}
Thread 27 (Thread 0x7f2700ea8700 (LWP 18912)):
#0  0x00007f278292c1fd in __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f27829250f4 in __GI___pthread_mutex_lock (mutex=mutex at entry=0x55eb9ebea3b8) at ../nptl/pthread_mutex_lock.c:115
#2  0x00007f278491e6d4 in pj_mutex_lock (mutex=0x55eb9ebea3b8) at ../src/pj/os_core_unix.c:1281
#3  0x00007f2784926301 in grp_lock_acquire (p=0x55eb9ebea2a8) at ../src/pj/lock.c:290
#4  0x00007f2784926301 in pj_grp_lock_acquire (grp_lock=0x55eb9ebea2a8) at ../src/pj/lock.c:478
#5  0x00007f27848ae45f in pjsip_dlg_inc_lock (dlg=0x7f26d8733e58) at ../src/pjsip/sip_dialog.c:925
#6  0x00007f27848afb23 in pjsip_dlg_on_tsx_state (dlg=0x7f26d8733e58, tsx=0x7f2778a7f038, e=0x7f2700ea4170) at ../src/pjsip/sip_dialog.c:2121
#7  0x00007f27848a7953 in tsx_set_state (flag=<optimized out>, event_src=<optimized out>, event_src_type=<optimized out>, state=<optimized out>, tsx=<optimized out>) at ../src/pjsip/sip_transaction.c:1272
#8  0x00007f27848a7953 in tsx_on_state_completed_uas (tsx=<optimized out>, event=<optimized out>) at ../src/pjsip/sip_transaction.c:3226
#9  0x00007f27848aa161 in send_msg_callback (send_state=0x7f277866ace0, sent=-171060, cont=0x7f2700ea42a4) at ../src/pjsip/sip_transaction.c:1975
#10 0x00007f278489091e in send_response_resolver_cb (status=status at entry=0, token=token at entry=0x7f277866ace0, addr=addr at entry=0x7f2700ea42f0) at ../src/pjsip/sip_util.c:1741
#11 0x00007f271bf99616 in sip_resolve (resolver=<optimized out>, pool=<optimized out>, target=0x7f2778a7f160, token=0x7f277866ace0, cb=0x7f2784890860 <send_response_resolver_cb>) at res_pjsip/pjsip_resolver.c:529
#12 0x00007f27848946db in pjsip_resolve (resolver=0x55eb9ba6b188, pool=<optimized out>, target=target at entry=0x7f2778a7f160, token=token at entry=0x7f277866ace0, cb=cb at entry=0x7f2784890860 <send_response_resolver_cb>) at ../src/pjsip/sip_resolve.c:207
#13 0x00007f278488fc91 in pjsip_endpt_resolve (endpt=endpt at entry=0x55eb9b4f2e18, pool=<optimized out>, target=target at entry=0x7f2778a7f160, token=token at entry=0x7f277866ace0, cb=cb at entry=0x7f2784890860 <send_response_resolver_cb>) at ../src/pjsip/sip_endpoint.c:1201
#14 0x00007f2784892e4b in pjsip_endpt_send_response (endpt=0x55eb9b4f2e18, res_addr=res_addr at entry=0x7f2778a7f138, tdata=tdata at entry=0x7f277845ac18, token=token at entry=0x7f2778a7f038, cb=cb at entry=0x7f27848a9dd8 <send_msg_callback>) at ../src/pjsip/sip_util.c:1816
#15 0x00007f27848a8564 in tsx_send_msg (tsx=tsx at entry=0x7f2778a7f038, tdata=tdata at entry=0x7f277845ac18) at ../src/pjsip/sip_transaction.c:2240
#16 0x00007f27848a86dd in tsx_send_msg (tdata=0x7f277845ac18, tsx=0x7f2778a7f038) at ../src/pjsip/sip_transaction.c:2118
#17 0x00007f27848a86dd in tsx_retransmit (tsx=0x7f2778a7f038, resched=1) at ../src/pjsip/sip_transaction.c:2375
#18 0x00007f27848ab345 in tsx_on_state_proceeding_uas (tsx=0x7f2778a7f038, event=<optimized out>) at ../src/pjsip/sip_transaction.c:2897
#19 0x00007f27848a7c81 in tsx_timer_callback (theap=<optimized out>, entry=0x7f2778a7f1b8) at ../src/pjsip/sip_transaction.c:1194
#20 0x00007f2784936690 in pj_timer_heap_poll (ht=0x55eb9b4f3100, next_delay=next_delay at entry=0x7f2700ea7e20) at ../src/pj/timer.c:913
#21 0x00007f278488f534 in pjsip_endpt_handle_events2 (endpt=0x55eb9b4f2e18, max_timeout=max_timeout at entry=0x7f2700ea7e70, p_count=p_count at entry=0x0) at ../src/pjsip/sip_endpoint.c:716
#22 0x00007f278488f62f in pjsip_endpt_handle_events (endpt=<optimized out>, max_timeout=max_timeout at entry=0x7f2700ea7e70) at ../src/pjsip/sip_endpoint.c:777
#23 0x00007f271bf847f4 in monitor_thread_exec (endpt=<optimized out>) at res_pjsip.c:4803
#24 0x00007f278491e9b0 in thread_main (param=0x55eb9b5bcea8) at ../src/pj/os_core_unix.c:541
#25 0x00007f27829226db in start_thread (arg=0x7f2700ea8700) at pthread_create.c:463
#26 0x00007f2781e5c71f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
{noformat}
Here {{pjsip_dlg_on_tsx_state}} attempts to lock the dialog, but must wait for it to be released by another thread (in this case Thread 71). However, before this {{send_msg_callback}} obtained and now holds the transaction's group lock. As well it appears the transaction's group lock was even acquired higher up prior to this in {{tsx_timer_callback}}

> chan_pjsip: Deadlock between sending response and transaction layer
> -------------------------------------------------------------------
>
>                 Key: ASTERISK-29286
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-29286
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Channels/chan_pjsip
>    Affects Versions: 16.16.0
>         Environment: ubuntu 18.04
>            Reporter: Mikhail
>            Assignee: Unassigned
>            Severity: Major
>         Attachments: debug_log_123456, t.18881-brief.txt, t.18881-full.txt, t.18881-info.txt, t.18881-locks.txt, t.18881-thread1.txt
>
>
> In some cases asterisk stops answering options/register, but answers invite, server is not under high load



--
This message was sent by Atlassian JIRA
(v6.2#6252)



More information about the asterisk-bugs mailing list