[asterisk-dev] res_fax_spandsp segfaults during fax detection

Michal Rybárik michal at rybarik.sk
Tue Jan 28 16:03:46 CST 2014


Hi Pavel,

> Hi Matthew,
>    thanks for another part of the mosaic, it seems to be more and more
> complete and starts forming a picture :-).
>    Would it be possible, that the peer causing the segfault uses 10ms
> packetization time instead of 20ms, which is required for correct operation
> by libspandsp ? What does Asterisk do in cases when the peer allows 10ms
> packetization only, while we are using 20ms ? Will it do a conversion,
> catching two packets and presenting them as one bigger, or will it just
> forward the short packets to the application ? If the second answer is
> correct (which I believe is true), I think we have a candidate case for the
> crash - it's caused by someone who sends 10ms packets instead of 20ms ones.

My setup (which does see the segfaults) is VERY simple; 90% of calls flow
between two machines, this way:
(E1 / DAHDI) ------- Asterisk11 ------- (SIP+RTP) -------- Asterisk11 ------- (E1 / DAHDI)
The configuration is very simple too - only alaw is allowed in SIP (DAHDI is
also alaw in our country, so there shouldn't be any transcoding, except for
the initial t38gateway period, when transcoding to sln is needed, AFAIK).
Every frame in my setup should be 20ms long; I shouldn't have any 10ms
frames here. If I look at the frame which caused the last segfault, I see
len=20, which is the frame length in milliseconds. This corresponds to the
samples=160 set in this frame, because at an 8000Hz sampling frequency we
should have exactly 160 samples in 20ms - so I think the bad frame was also
20ms long.

[Jan 27 14:00:32] NOTICE[30694][C-000006cb] res_fax_spandsp.c: frame={
frametype=2, datalen=160, samples=160, mallocd=1, mallocd_hdr_len=562,
offset=64, src=RTP, flags=1, ts=9140, len=20, seqno=1489,
data.ptr=0xb4ef4f30  }
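
As a rough illustration of the arithmetic above, the check that the logged
values agree with each other can be sketched in C as follows. The function
names are hypothetical and are not part of Asterisk or libspandsp; the only
inputs are the numbers from the log (datalen=160, samples=160, len=20), the
8000Hz rate mentioned above, and the fact that G.711 alaw uses one byte per
sample.

```c
#include <assert.h>

/* Samples contained in a frame of duration_ms at rate_hz. */
int samples_for_duration(int rate_hz, int duration_ms)
{
    assert(rate_hz > 0 && duration_ms >= 0);
    return rate_hz * duration_ms / 1000;
}

/* Duration in milliseconds covered by `samples` at rate_hz. */
int duration_ms_for_samples(int rate_hz, int samples)
{
    assert(rate_hz > 0 && samples >= 0);
    return samples * 1000 / rate_hz;
}

/*
 * Hypothetical consistency check for an 8000Hz alaw frame:
 * alaw carries one byte per sample, so datalen must equal samples,
 * and the sample count must match the advertised length in ms.
 */
int alaw_frame_is_consistent(int datalen, int samples, int len_ms)
{
    return datalen == samples
        && duration_ms_for_samples(8000, samples) == len_ms;
}
```

With the logged values, alaw_frame_is_consistent(160, 160, 20) holds, while
a 10ms frame at the same rate would carry 80 samples, not 160.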


>    So, yes, the pcap of such a failure would be really great! As I wrote,
> I can't generate one, because in my environment, the crash is really VERY
> rare and the pcap files would probably fill the whole disk and not catch
> anything :-). Maybe Michal will have more luck with this ?

Ouch :) Capturing all RTP traffic won't be easy if I don't want to
affect machine operation... The best way would be to mirror the traffic at
the switch to another port, and then capture it on a dedicated machine...
It is possible, but my devices are on the opposite side of the country, and
this is not trivial to manage ;) I'll keep this for later, if we
don't succeed with less brute-force ways :)

BTW, I did a quick & ugly patch yesterday and I'm testing it now.. I
know that it was (probably) the first RTP frame in the call which
caused the last segfault. And only this one frame has datalen=160 and
mallocd=1. So I am skipping frames which have these values set - I'm not
passing them into libspandsp. I see from the logs that I was right: only
the first frame in a call is filtered out of V21 detection, so this doesn't
break normal operation. In a few days I'll see whether this helped or not.
For sure, this is not a proper fix, but I hope it'll help while debugging....
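
The idea of the workaround can be sketched roughly like this. This is not
the actual patch: struct demo_frame is a hypothetical stand-in that carries
only the two fields from the log above (it is not Asterisk's struct
ast_frame), and both function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in holding only the two logged fields. */
struct demo_frame {
    int datalen;
    int mallocd;
};

/* True for frames matching the values seen on the crashing frame. */
int is_suspect_frame(const struct demo_frame *f)
{
    assert(f != NULL);
    return f->datalen == 160 && f->mallocd == 1;
}

/*
 * Walk a sequence of frames and count how many would still be handed
 * to tone detection; suspect frames are skipped instead of passed on.
 */
size_t count_frames_passed(const struct demo_frame *frames, size_t n)
{
    size_t passed = 0;
    for (size_t i = 0; i < n; i++) {
        if (is_suspect_frame(&frames[i])) {
            continue; /* skipped: not given to the detector */
        }
        passed++;
    }
    return passed;
}
```

If only the first frame of a call matches, exactly one frame is dropped
from detection per call, which is what the logs show so far.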

--
Michal Rybarik



