[asterisk-bugs] [JIRA] (ASTERISK-25941) chan_pjsip: Crash on an immediate SIP final response

Javier Riveros (JIRA) noreply at issues.asterisk.org
Thu May 19 12:30:56 CDT 2016


    [ https://issues.asterisk.org/jira/browse/ASTERISK-25941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=230703#comment-230703 ] 

Javier Riveros  commented on ASTERISK-25941:
--------------------------------------------

@Mark thank you now i was unable to make it crash i make like ~3 hours test, so seems like the patch work :) 

> chan_pjsip: Crash on an immediate SIP final response
> ----------------------------------------------------
>
>                 Key: ASTERISK-25941
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-25941
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Resources/res_ari, Resources/res_pjsip
>    Affects Versions: 13.8.0, 13.8.1, 13.8.2, 13.9.0
>            Reporter: Javier Riveros 
>         Attachments: backtrace.txt, dialog_find_refinement.diff, full_after_patch_1.zip, full_debug, gdb_after_patch_1.txt, gdb.txt, kern.log, session_end_response.diff
>
>
> *Description*
> I start experience constant asterisk crashes on our pro asterisk boxes seems like when the box is a little busy (+10 calls) asterisk is more susceptible.
> So the issue seems to be for outbound/inbound calls, i see this more frequent in outbound calls. Doing a outbound call into the stasis app and get a fast busy signal causes race condition opening the door to  crash occur.
> 1) outbound call  endpoint -> asterisk -> trunk.
> 2) trunk (486 Busy Here) -> asterisk   (crash).
> You will see a segfault like this :
> {noformat}
> 2016-04-19 18:41:30.783 | info | kernel[-] |[436624.469629] asterisk[3737]: segfault at c0 ip 00000000004ce35b sp 00007f00ee7446d0 error 4 in asterisk[400000+2bc000]
> {noformat}
> And a coredumps like this:
> {noformat}
> #0  0x00000000004ce35b in ast_channel_dialed_causes_add (chan=0x0, cause_code=0x7f96ba940860, datalen=109) at channel_internal_api.c:1448
>         ao2_cause_code = 0x0
> #1  0x00000000004b7423 in ast_channel_hangupcause_hash_set (chan=0x0, cause_code=0x7f96ba940860, datalen=109) at channel.c:4356
>         causevar = '\000' <repeats 16 times>, "@\t\224\272\226\177\000\000X\264\f\354\226\177\000\000p\a\224\272\226\177\000\000\236\341J\000\000\000\000\000\240\a\224\272\226\177\000\000h\316\f\354\226\177\000\000@\b\224\272\226\177\000\000\212\347J\000\000\000\000\000m\000\000\000\000\000\000\000`\b\224\272\226\177\000\000\260\a\224\272!\000\000\000h\316\f\354\226\177\000\000\004\000\000\000\000\000\000\000!", '\000' <repeats 23 times>, "m", '\000' <repeats 39 times>, "`\b\224\272\226\177", '\000' <repeats 59 times>...
>         __PRETTY_FUNCTION__ = "ast_channel_hangupcause_hash_set"
> #2  0x00007f96c35293cf in chan_pjsip_incoming_response (session=0x7f96ec0cb458, rdata=<optimized out>) at chan_pjsip.c:2319
>         status = {code = 486, reason = {ptr = <optimized out>, slen = 9}}
>         cause_code = 0x7f96ba940860
>         data_size = 109
>         __PRETTY_FUNCTION__ = "chan_pjsip_incoming_response"
> #3  0x00007f96e33cf846 in handle_incoming_response (type=<optimized out>, response_priority=AST_SIP_SESSION_AFTER_MEDIA, rdata=0x7f97049443d8,
>     session=0x7f96ec0cb458) at res_pjsip_session.c:2323
>         supplement = 0x7f96ec0ccab0
>         status = {code = <optimized out>, reason = {ptr = 0x7f9704945740 "Busy HereV\224\004\227\177", slen = 9}}
> #4  handle_incoming (session=session at entry=0x7f96ec0cb458, rdata=0x7f97049443d8, response_priority=response_priority at entry=AST_SIP_SESSION_AFTER_MEDIA,
>     type=<optimized out>) at res_pjsip_session.c:2337
> No locals.
> #5  0x00007f96e33d15e8 in session_inv_on_tsx_state_changed (inv=0x7f96ec0cb218, tsx=0x7f9718010868, e=0x7f96ba940ad0) at res_pjsip_session.c:2512
>         cb = 0x0
>         session = 0x7f96ec0cb458
>         tdata = 0x7f96ba940ad0
>         __PRETTY_FUNCTION__ = "session_inv_on_tsx_state_changed"
> #6  0x00007f972448b2d4 in mod_inv_on_tsx_state (tsx=0x7f9718010868, e=0x7f96ba940ad0) at ../src/pjsip-ua/sip_inv.c:699
>         dlg = 0x7f9718048e08
>         inv = 0x7f96ec0cb218
> #7  0x00007f97244d0fb0 in pjsip_dlg_on_tsx_state (dlg=0x7f9718048e08, tsx=0x7f9718010868, e=0x7f96ba940ad0) at ../src/pjsip/sip_dialog.c:2013
>         i = 2
> #8  0x00007f97244d17c0 in mod_ua_on_tsx_state (tsx=0x7f9718010868, e=0x7f96ba940ad0) at ../src/pjsip/sip_ua_layer.c:178
>         dlg = 0x7f9718048e08
> #9  0x00007f97244ca70c in tsx_set_state (tsx=0x7f9718010868, state=PJSIP_TSX_STATE_COMPLETED, event_src_type=PJSIP_EVENT_RX_MSG,
>     event_src=0x7f97049443d8) at ../src/pjsip/sip_transaction.c:1213
>         e = {prev = 0x7f9704004ef0, next = 0x7f97040050c8, type = PJSIP_EVENT_TSX_STATE, body = {timer = {entry = 0x7f97049443d8}, tsx_state = {src = {
>                 rdata = 0x7f97049443d8, tdata = 0x7f97049443d8, timer = 0x7f97049443d8, status = 76825560, data = 0x7f97049443d8},
>               tsx = 0x7f9718010868, prev_state = 3, type = PJSIP_EVENT_RX_MSG}, tx_msg = {tdata = 0x7f97049443d8}, tx_error = {tdata = 0x7f97049443d8,
>               tsx = 0x7f9718010868}, rx_msg = {rdata = 0x7f97049443d8}, user = {user1 = 0x7f97049443d8, user2 = 0x7f9718010868, user3 = 0x300000003,
>               user4 = 0x7f97049443d8}}}
>         prev_state = PJSIP_TSX_STATE_PROCEEDING
> ....
> {noformat}
> *To reproduce*
> I was able to reproduce this easy i just do the following. with two asterisk boxes one acting as a carrier/trunk "Asterisk A" and another with my basic stasis app "Asterisk B".
> 1) on "Asterisk A" i just create a simple dialplan that response 486 Busy or another 4XX code, this simulating our carrier given a fast 486 busy signal.
> {code}
> exten => _[+][1][7]X.,1,Noop("test")
>  same => n,Busy()
>  same => n,Hangup()
> exten => _[+][1][8]X.,1,Noop("test")
>  same => n,Dial(SIP/1002)  ; this endpoint isn't register
>  same => n,Hangup()
> {code} 
> 2) on "Asterisk B" in other to add a little of load i just stress the box a little bit until load average were like 2 "normal busy load average" {{sudo stress -d 2 --hdd-bytes 512M}}
> 3) on "Asterisk B" once the the box is busy i just make couple of outbound calls and asterisk crash, "this simulate an outbound call to a carrier that response a fast busy signal"
> There is some external ways i'm workaround this but please asterisk don't let you crash in fast SIP ERROR codes :).
> Attached are the coredumps and debug logs anything else is needed let me know.
>  



--
This message was sent by Atlassian JIRA
(v6.2#6252)



More information about the asterisk-bugs mailing list