[asterisk-dev] How to fix a crashing spandsp

Tue Nov 12 06:57:59 CST 2013

Thanks! I have pcaps that reproduce this crash reliably and can be
contributed to the test suite, although they require rtpplay as an
additional dependency (I have been on this issue for a few weeks already).
If the consensus is to always init, I think the only thing left to do is
make sure no memory leaks happen as a result.
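
To make the always-init idea and the leak concern concrete, here is a
minimal sketch. It is not the res_fax or SpanDSP code: every name below
(fax_session, t38_stack, fax_session_rx_udptl, and so on) is hypothetical
and only stands in for the real structures. The assumption is that the T.38
state is allocated with the session no matter what, so a late UDPTL packet
always finds valid state, and that one destroy path releases it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the SpanDSP T.38 state. */
struct t38_stack {
    size_t packets_seen;
};

/* Hypothetical stand-in for a fax session. */
struct fax_session {
    bool t38_enabled;       /* cleared when the negotiation timeout fires */
    struct t38_stack *t38;  /* always allocated, whatever t38_enabled says */
};

/* Counts live T.38 stacks so a leak shows up as a non-zero value. */
static int live_t38_stacks;

int fax_live_t38_stacks(void)
{
    return live_t38_stacks;
}

/* Create a session and unconditionally initialize its T.38 state. */
struct fax_session *fax_session_new(void)
{
    struct fax_session *s = calloc(1, sizeof(*s));
    if (s == NULL) {
        return NULL;
    }
    s->t38 = calloc(1, sizeof(*s->t38));
    if (s->t38 == NULL) {
        free(s);
        return NULL;
    }
    live_t38_stacks++;
    s->t38_enabled = true;
    return s;
}

/* The timeout only clears the capability flag; the stack stays valid. */
void fax_session_t38_timeout(struct fax_session *s)
{
    assert(s != NULL);
    s->t38_enabled = false;
}

/* A late UDPTL packet finds initialized state instead of a NULL pointer.
 * Returns 0 if the packet was consumed, -1 if it was rejected. */
int fax_session_rx_udptl(struct fax_session *s,
                         const unsigned char *buf, size_t len)
{
    assert(s != NULL && s->t38 != NULL);
    if (buf == NULL || len == 0) {
        return -1;
    }
    s->t38->packets_seen++;
    return 0;
}

/* Single teardown path: the stack is released exactly once. */
void fax_session_destroy(struct fax_session *s)
{
    if (s == NULL) {
        return;
    }
    free(s->t38);
    live_t38_stacks--;
    free(s);
}
```

The counter is only there to make a leak visible in a test: after every
session has been destroyed it should be back at zero, whether or not the
timeout fired in between.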

In the meantime I'll add the SIPp script to this JIRA issue.

Thanks again for the pointer!
Torrey

On 12 November 2013 13:44, Matthew Jordan <mjordan at digium.com> wrote:

> On Tue, Nov 12, 2013 at 3:20 AM, Torrey Searle <tsearle at gmail.com> wrote:
>
>> Hello all,
>>
>> I have a scenario which can cause Asterisk to crash in SpanDSP.
>>
>> Basically res_fax.c has a 5 second timeout that, if triggered, disables
>> AST_FAX_TECH_T38. This causes the T.38 stack not to be initialized, and
>> Asterisk will crash if any UDPTL packet arrives after that point.
>>
>> I have already identified two similar call flows that can trigger this.
>>
>> 1. We are slow to receive the ACK message from the calling party. The
>> T.38 switchover attempt of ReceiveFax will instead just set the
>> NEEDREINVITE flag. In this case, a T.38 re-invite can be sent (and
>> succeed!) after res_fax has timed out and disabled T.38 => crash
>>
>>
>> 2. We have sent the T.38 re-invite but are slow to get the 200 OK from
>> the carrier. We time out and disable T.38, and the 200 OK arrives after
>> we give up => crash
>>
>>
>> Problem is, I'm not sure what's the best way of resolving this. I have
>> been able to resolve case 1 by not setting the NEEDREINVITE flag and
>> falling back directly to G.711 if the ACK hasn't arrived yet, but that
>> still leaves me crashing in case 2.
>>
>> Another idea I had was to remove the timeout and trust the channel
>> driver to report failure on re-invite transaction timeout.
>>
>> Alternatively, I thought it might be good to always initialize the T.38
>> stack of SpanDSP, just in case we find out later that we actually need
>> to use it.
>>
>> I would like to get your feedback on what's the best solution for this
>> issue.
>>
>>
>>
> Hey Torrey -
>
> This sounds a lot like ASTERISK-21242 (
> https://issues.asterisk.org/jira/browse/ASTERISK-21242), which I was
> looking at this week as well. Ashley's patch is to just always initialize
> the T.38 stack to avoid those kinds of problems. It appears to have resolved
> the crashes he saw; it may be worthwhile trying that patch to see if it
> resolves the two scenarios you're seeing as well.
>
> The part I still wanted to look into was what impact the initialization
> has if we do choose to always initialize it. The other option would be to
> try to handle the various off-nominal cases in the T.38 state machine
> handling; I have a feeling, however, that doing so would be rather
> difficult.
>
> Matt
>
> --
> Matthew Jordan
> Digium, Inc. | Engineering Manager
> 445 Jan Davis Drive NW - Huntsville, AL 35806 - USA
> Check us out at: http://digium.com & http://asterisk.org