[asterisk-bugs] [JIRA] (ASTERISK-25183) PJSIP: Crash on NULL channel in chan_pjsip_incoming_response despite previous checks for NULL channel

Richard Mudgett (JIRA) noreply at issues.asterisk.org
Thu Jul 2 17:22:32 CDT 2015


     [ https://issues.asterisk.org/jira/browse/ASTERISK-25183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on ASTERISK-25183 started by Richard Mudgett.

> PJSIP: Crash on NULL channel in chan_pjsip_incoming_response despite previous checks for NULL channel
> -----------------------------------------------------------------------------------------------------
>
>                 Key: ASTERISK-25183
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-25183
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Channels/chan_pjsip
>            Reporter: Matt Jordan
>            Assignee: Richard Mudgett
>         Attachments: backtrace_2003.txt, full.txt, messages.txt
>
>
> Note that this was caught by the {{channels/pjsip/basic_calls/outgoing/off-nominal/bob_incompatible_codecs}} test in the Test Suite.
> The test crashed because {{ast_channel_name}} was called on a NULL channel:
> {code}
> #0  0x000000000054494c in ast_channel_name (chan=0x0) at channel_internal_api.c:476
> 476	DEFINE_STRINGFIELD_GETTER_FOR(name);
> #0  0x000000000054494c in ast_channel_name (chan=0x0) at channel_internal_api.c:476
> No locals.
> #1  0x00007f42987ebce1 in chan_pjsip_incoming_response (session=0x7f42d000d3b8, rdata=0x7f4308022d98) at chan_pjsip.c:2224
>         status = {code = 200, reason = {ptr = 0x7f4308024100 "OK", slen = 2}}
>         cause_code = 0x7f42da36d670
>         data_size = 102
>         __PRETTY_FUNCTION__ = "chan_pjsip_incoming_response"
> #2  0x00007f42de1a9078 in handle_incoming_response (session=0x7f42d000d3b8, rdata=0x7f4308022d98, type=PJSIP_EVENT_TSX_STATE, response_priority=AST_SIP_SESSION_AFTER_MEDIA) at res_pjsip_session.c:2187
>         supplement = 0x7f42d000f7c0
>         status = {code = 200, reason = {ptr = 0x7f4308024100 "OK", slen = 2}}
>         __PRETTY_FUNCTION__ = "handle_incoming_response"
> #3  0x00007f42de1a923f in handle_incoming (session=0x7f42d000d3b8, rdata=0x7f4308022d98, type=PJSIP_EVENT_TSX_STATE, response_priority=AST_SIP_SESSION_AFTER_MEDIA) at res_pjsip_session.c:2201
>         __PRETTY_FUNCTION__ = "handle_incoming"
> {code}
> In Asterisk 13, this corresponds to this line of code:
> {code}
> 	/* Build and send the tech-specific cause information */
> 	/* size of the string making up the cause code is "SIP " number + " " + reason length */
> 	data_size += 4 + 4 + pj_strlen(&status.reason);
> 	cause_code = ast_alloca(data_size);
> 	memset(cause_code, 0, data_size);
> 	ast_copy_string(cause_code->chan_name, ast_channel_name(session->channel), AST_CHANNEL_NAME); // THIS LINE HERE
> {code}
> However, earlier in the same function we explicitly check that the channel is non-NULL before proceeding:
> {code}
> 	if (!session->channel) {
> 		return;
> 	}
> {code}
> At first glance that doesn't make much sense. Even if we had a reference counting issue, {{session->channel}} should have pointed to garbage rather than NULL.
> However, the backtrace shows that another thread is hanging up a channel at this moment in time:
> {code}
> Thread 70 (Thread 0x7f42da3ea700 (LWP 7686)):
> #0  0x00000000005fee68 in __ast_pthread_mutex_lock (filename=0x7fa55b "astmm.c", lineno=360, func=0x7fb2a7 "region_free", mutex_name=0x7fa5cb "&reglock", t=0xadfb40) at lock.c:313
> #1  0x000000000047bd8e in region_free (freed=0xb17040, reg=0x7f42f800a580) at astmm.c:360
> #2  0x000000000047c4e3 in __ast_free_region (ptr=0x7f42f800a610, file=0x7fb5ab "astobj2.c", lineno=461, func=0x7fb840 "internal_ao2_ref") at astmm.c:479
> #3  0x000000000047c81e in __ast_free (ptr=0x7f42f800a610, file=0x7fb5ab "astobj2.c", lineno=461, func=0x7fb840 "internal_ao2_ref") at astmm.c:532
> #4  0x000000000048142d in internal_ao2_ref (user_data=0x7f42f800a668, delta=-1, file=0x7fb5ab "astobj2.c", line=516, func=0x7fb823 "__ao2_ref") at astobj2.c:461
> #5  0x0000000000481969 in __ao2_ref (user_data=0x7f42f800a668, delta=-1) at astobj2.c:516
> #6  0x0000000000481a4a in __ao2_cleanup (obj=0x7f42f800a668) at astobj2.c:529
> #7  0x00007f42987e9ced in hangup (data=0x7f4314006578) at chan_pjsip.c:1744
> #8  0x000000000072a453 in ast_taskprocessor_execute (tps=0x7f42d000e698) at taskprocessor.c:768
> #9  0x000000000073dba0 in execute_tasks (data=0x7f42d000e698) at threadpool.c:1157
> #10 0x000000000072a453 in ast_taskprocessor_execute (tps=0x1484fa8) at taskprocessor.c:768
> #11 0x000000000073b0c5 in threadpool_execute (pool=0x1653518) at threadpool.c:351
> #12 0x000000000073d677 in worker_active (worker=0x7f42cc001cc8) at threadpool.c:1075
> #13 0x000000000073d2c2 in worker_start (arg=0x7f42cc001cc8) at threadpool.c:995
> #14 0x0000000000750540 in dummy_start (data=0x7f42cc001f10) at utils.c:1237
> #15 0x00000034ac6079d1 in start_thread () from /lib64/libpthread.so.0
> #16 0x00000034ac2e89dd in clone () from /lib64/libc.so.6
> {code}
> This means we can probably pass the first check on line {{2195}} and then have the {{hangup}} callback, running in another thread, clear the {{session->channel}} pointer before it is dereferenced on line {{2224}}. Egads.
> Logs and backtrace attached.
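
The check-then-use race described above can be illustrated with a minimal sketch. This is not the fix applied to Asterisk and it does not use Asterisk's API: `demo_channel`, `demo_session`, and every `demo_*` function are hypothetical stand-ins, and the lock-plus-reference scheme is only one general way to close such a window. The idea is to read the pointer once under a lock, hold a reference on that snapshot, and only ever dereference the local copy.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the real types; not Asterisk's definitions. */
struct demo_channel {
    int refcount;               /* protected by the owning session's lock */
    char name[80];
};

struct demo_session {
    pthread_mutex_t lock;
    struct demo_channel *channel;   /* a hangup task may clear this */
};

/* Return a referenced snapshot of session->channel, or NULL if it is gone. */
struct demo_channel *demo_session_get_channel(struct demo_session *session)
{
    struct demo_channel *chan;

    pthread_mutex_lock(&session->lock);
    chan = session->channel;
    if (chan) {
        chan->refcount++;
    }
    pthread_mutex_unlock(&session->lock);
    return chan;
}

/* Drop one reference. A real implementation would destroy the object at 0. */
void demo_channel_unref(struct demo_session *session, struct demo_channel *chan)
{
    pthread_mutex_lock(&session->lock);
    chan->refcount--;
    pthread_mutex_unlock(&session->lock);
}

/* Hangup side: detach the channel from the session, then drop the session's reference. */
void demo_hangup(struct demo_session *session)
{
    struct demo_channel *chan;

    pthread_mutex_lock(&session->lock);
    chan = session->channel;
    session->channel = NULL;
    pthread_mutex_unlock(&session->lock);

    if (chan) {
        demo_channel_unref(session, chan);
    }
}

/*
 * Response side: copy the channel name into out.
 * Returns 0 on success, -1 if the channel has already been detached.
 * session->channel is read exactly once, so a concurrent demo_hangup()
 * cannot turn the pointer into NULL between the check and the use.
 */
int demo_incoming_response(struct demo_session *session, char *out, size_t out_size)
{
    struct demo_channel *chan;

    assert(out != NULL && out_size > 0);

    chan = demo_session_get_channel(session);
    if (!chan) {
        return -1;
    }
    snprintf(out, out_size, "%s", chan->name);
    demo_channel_unref(session, chan);
    return 0;
}
```

In this sketch a caller initializes `lock` with `pthread_mutex_init`, and `demo_incoming_response` either copies the name from its own referenced snapshot or returns -1; it never re-reads `session->channel` after the check.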



--
This message was sent by Atlassian JIRA
(v6.2#6252)


