[asterisk-dev] Pjsip segfault

Tomec Martin tomec at ipex.cz
Wed Jun 28 05:49:56 CDT 2017


-----Original Message-----
From: asterisk-dev-bounces at lists.digium.com [mailto:asterisk-dev-bounces at lists.digium.com] On Behalf Of Joshua Colp
Sent: Friday, June 16, 2017 4:44 PM
To: asterisk-dev at lists.digium.com
Subject: Re: [asterisk-dev] Pjsip segfault

On Fri, Jun 16, 2017, at 11:37 AM, Tomec Martin wrote:
> On Fri, Jun 16, 2017, at 11:10 AM, Tomec Martin wrote:
> > Hi,
> > I am looking at issue
> > https://issues.asterisk.org/jira/browse/ASTERISK-27037 and so far I have
> > found that:
> > In the Asterisk function ast_sip_send_stateful_response, we receive the
> > message via pjsip_tsx_recv_msg, then prepare the answer and send it via
> > pjsip_tsx_send_msg.
> > Before we send the answer, the pjsip_transaction structure can be deleted
> > by some pjsip callbacks, so pjsip_tsx_send_msg is called with an invalid
> > pjsip_transaction.
> > 
> > Is anybody aware of the correct way to prevent the pjsip_transaction from
> > being deleted? Or is there a way to determine that the transaction was
> > deleted (by some callback)?
> 
> Have you determined what callbacks will occur that can cause it to
> happen? Are you referring to timer entries for example?
> 
> I can't tell exactly. We suspect that it could be caused by a badly
> implemented WebRTC client.
> From the log, there seems to be a "transport error" while receiving the
> message, and the terminated transaction is then destroyed by a timer:
> [Jun 13 12:45:18] DEBUG[8093] pjproject:             tsx0x7fe4bc0631a8
> ..Transaction created for Request msg REGISTER/cseq=2
> (rdata0x7fe494118e18)
> [Jun 13 12:45:18] DEBUG[8093] pjproject:             tsx0x7fe4bc0631a8
> .Incoming Request msg REGISTER/cseq=2 (rdata0x7fe494118e18) in state Null
> [Jun 13 12:45:18] DEBUG[8093] pjproject:             tsx0x7fe4bc0631a8
> ..State changed from Null to Trying, event=RX_MSG
> [Jun 13 12:45:18] DEBUG[8092] pjproject:             tsx0x7fe4bc0631a8
> State changed from Trying to Terminated, event=TRANSPORT_ERROR
> [Jun 13 12:45:18] DEBUG[8092] pjproject:             tsx0x7fe4bc0631a8
> Timeout timer event
> [Jun 13 12:45:18] DEBUG[8092] pjproject:             tsx0x7fe4bc0631a8
> .State changed from Terminated to Destroyed, event=TIMER
> [Jun 13 12:45:18] DEBUG[8092] pjproject:             tsx0x7fe4bc0631a8
> Transaction destroyed!
> [Jun 13 12:45:19] DEBUG[8093] pjproject:             tsx0x7fe4bc0631a8
> .Sending Response msg 200/REGISTER/cseq=2 (tdta0x7fe4bc1e5aa8) in state
> Destroyed
> We can replicate this only in the production environment, but it is
> possible to add some logging if needed...

I think in order to determine the proper path forward we need to
understand the specific scenario, the threads involved, and the specific
interaction that caused it to happen. For example, I wouldn't expect the
above to happen; that is, I don't know what would cause it to transition
to a TRANSPORT_ERROR in that case if a message hasn't been sent yet (and
it doesn't look like one has yet).

We suspect that it is caused by the JsSIP client: this client can sometimes
send many REGISTER requests without waiting for a response. The second
REGISTER can cause a TRANSPORT_ERROR before the reply is sent. Moreover, the
effect can be amplified by our "slow" database, so the reply is sent later,
after the memory has been freed.
Is somebody able to confirm this? Maybe with some test framework that sends
multiple REGISTER requests (with CSeq incremented)?
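
A rough sketch of such a test in Python, in case it helps. It is untested
against Asterisk and makes several assumptions that are not from our setup:
it uses plain UDP instead of the WebSocket transport that JsSIP uses, the
user, domain and addresses are placeholders, and it sends no credentials, so
a server that requires authentication will most likely just answer 401. The
helper names build_register and send_burst are made up for this example.

```python
import socket
import uuid


def build_register(cseq, user="alice", domain="example.com",
                   local_ip="192.0.2.10", local_port=5060,
                   call_id="burst-test-call-id"):
    """Return one SIP REGISTER request as bytes (all values are placeholders)."""
    # A fresh branch per request makes each REGISTER a separate transaction.
    branch = "z9hG4bK" + uuid.uuid4().hex
    lines = [
        f"REGISTER sip:{domain} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_ip}:{local_port};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:{user}@{domain}>;tag=burst",
        f"To: <sip:{user}@{domain}>",
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} REGISTER",
        f"Contact: <sip:{user}@{local_ip}:{local_port}>",
        "Expires: 60",
        "Content-Length: 0",
        "", "",  # produces the empty line that ends the header section
    ]
    return "\r\n".join(lines).encode("ascii")


def send_burst(host, port=5060, count=5, first_cseq=1):
    """Send `count` REGISTERs back to back without reading any response."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for cseq in range(first_cseq, first_cseq + count):
            sock.sendto(build_register(cseq), (host, port))
```

Calling send_burst("192.0.2.1", count=5) against a test server (the address
is a placeholder) sends five REGISTERs with CSeq 1 to 5 on the same Call-ID.
Whether this is enough to trigger the TRANSPORT_ERROR is only a guess; the
real case may depend on the WebSocket transport and on the slow database.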

Kind Regards 
Martin Tomec

P.S.: The JsSIP client bug is described at https://github.com/versatica/JsSIP/issues/414