[asterisk-dev] RTP streams suddenly stop

Tony Mountifield tony at softins.clara.co.uk
Thu Feb 4 07:39:25 CST 2010


I'm posting this to asterisk-dev because I am certain that I will need
to dive into the code to identify and fix this problem. However, I'm not
yet sure where to look, so would be grateful for some ideas!

The system in question talks SIP to an ITSP and is installed on their
LAN in a colocated rack. A couple of SIP phones also dial into it over
the Internet.

Every so often (once every week or two), the users complain of being cut
off, and when they dial back in they cannot hear any audio. This
persists for several minutes before magically fixing itself.

I have been running a continuous SIP trace using tcpdump -w for offline
analysis, but this didn't show anything useful. I decided it was not
practical to take tcpdump traces of the RTP streams, due to the volume
of data involved.

So I wrote a monitor for the RTP streams using libpcap, which would keep
track of when the RTP streams started and stopped, and also look for
anomalies in the timestamps and sequence numbers.
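
For readers who want to reproduce this kind of check, here is a minimal
sketch of the per-packet logic such a monitor might apply. It is not the
actual monitor: it assumes the UDP payload has already been extracted
from a capture (for example by libpcap), and the names StreamState,
parse_rtp_header and check_packet are hypothetical. Only the fixed
12-byte RTP header layout is relied upon.

```python
import struct
from dataclasses import dataclass


@dataclass
class StreamState:
    """Last values seen for one RTP stream (keyed by SSRC)."""
    last_seq: int
    last_ts: int
    packets: int = 1


def parse_rtp_header(payload: bytes):
    """Return (seq, timestamp, ssrc), or None if this is not RTP version 2."""
    if len(payload) < 12:
        return None
    # Fixed header: V/P/X/CC byte, M/PT byte, 16-bit seq, 32-bit ts, 32-bit SSRC
    b0, _b1, seq, ts, ssrc = struct.unpack("!BBHII", payload[:12])
    if b0 >> 6 != 2:
        return None
    return seq, ts, ssrc


def check_packet(streams: dict, payload: bytes):
    """Update per-SSRC state and return a list of anomaly descriptions."""
    hdr = parse_rtp_header(payload)
    if hdr is None:
        return ["not RTP"]
    seq, ts, ssrc = hdr
    state = streams.get(ssrc)
    if state is None:
        # First packet of a new stream: nothing to compare against yet
        streams[ssrc] = StreamState(seq, ts)
        return []
    anomalies = []
    expected = (state.last_seq + 1) & 0xFFFF  # sequence numbers wrap at 16 bits
    if seq != expected:
        anomalies.append(f"seq jump: expected {expected}, got {seq}")
    # Modulo-2^32 difference; a value in the upper half means time went backwards
    if ((ts - state.last_ts) & 0xFFFFFFFF) >= 0x80000000:
        anomalies.append("timestamp went backwards")
    state.last_seq, state.last_ts = seq, ts
    state.packets += 1
    return anomalies
```

Keying on SSRC alone is a simplification; a real monitor would probably
also key on the source and destination address and port.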

The problem occurred again this morning, and what the monitor showed me
was this:

a) At about 300ms after 10:31:35, all the active RTP streams from
Asterisk to the ITSP and the two SIP phones stopped simultaneously. The
streams into Asterisk continued, and the ITSP started sending RTCP
enquiries.

b) For the next five minutes, as people tried calling, every stream out
of Asterisk lasted exactly one packet before stopping.

c) After about five minutes, the problem magically fixed itself and
audio streams started to flow normally.
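
The patterns in (a) and (b) can be picked out mechanically from
per-stream bookkeeping. The following is only a sketch under stated
assumptions: StreamRecord and the three helper functions are
hypothetical, and the idle timeout and window values are arbitrary
illustrative defaults, not figures from the monitor described above.

```python
from dataclasses import dataclass


@dataclass
class StreamRecord:
    """Summary of one outbound RTP stream as a monitor might record it."""
    name: str
    first_seen: float  # seconds, time of first packet
    last_seen: float   # seconds, time of most recent packet
    packets: int


def stalled_streams(records, now, idle_timeout=1.0):
    """Streams that have sent nothing for longer than idle_timeout seconds."""
    return [r for r in records if now - r.last_seen > idle_timeout]


def simultaneous_stop(records, now, idle_timeout=1.0, window=0.1):
    """True if two or more streams stalled within `window` seconds of each other."""
    stalled = stalled_streams(records, now, idle_timeout)
    if len(stalled) < 2:
        return False
    times = [r.last_seen for r in stalled]
    return max(times) - min(times) <= window


def single_packet_streams(records):
    """Streams that stopped after exactly one packet, as in observation (b)."""
    return [r for r in records if r.packets == 1]
```

A monitor could call simultaneous_stop() periodically and raise an alert
as soon as it returns True, rather than waiting for users to report
silence.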

There is nothing in /var/log/asterisk/full at the times in question; it
has no entries between 10:31:18 and 10:31:44. I can't see anything
relevant in the rest of the system either.

So, why would all RTP streams stop at once?

The underlying OS is CentOS 4.7.

Zaptel is 1.2.27 with ztdummy compiled with USE_RTC.

The version of Asterisk is 1.2.32 with some custom modifications. The
most relevant modification might be that I have added the internal
timing feature from <https://issues.asterisk.org/view.php?id=5374>.
However, I have included this in all systems for the last four years
without any trouble till now.

Some advice on this would be REALLY welcome, as I must fix it urgently.

Cheers
Tony
-- 
Tony Mountifield
Work: tony at softins.co.uk - http://www.softins.co.uk
Play: tony at mountifield.org - http://tony.mountifield.org


