[asterisk-dev] RTP streams suddenly stop

Thu Feb 4 15:44:26 CST 2010

In article <9AE9624A4ABB634DB088271B5C82D04502D1CE5E2D at scl-exch2k7.phoenix.com>,
Dan Austin <Dan_Austin at Phoenix.com> wrote:
> Tony wrote:
> 
> > There is nothing in /var/log/asterisk/full at the times in question; it
> > has no entries between 10:31:18 and 10:31:44. I can't see anything
> > relevant in the rest of the system either.
> 
> > So, why would all RTP streams stop at once?
> 
> > The underlying OS is CentOS 4.7.
> 
> > Zaptel is 1.2.27 with ztdummy compiled with USE_RTC.
> 
> > The version of Asterisk is 1.2.32 with some custom modifications. The
> > most relevant modification might be that I have added the internal
> > timing feature from https://issues.asterisk.org/view.php?id=5374
> > However, I have included this in all systems for the last four years
> > without any trouble till now.
> 
> > Some advice on this would be REALLY welcome, as I must fix it urgently.
> 
> Does this box have NTP running, and is it perhaps drifting out of sync then
> being corrected by a large (relative) value?  I recall seeing a note either
> on the issue tracker or in the SVN logs about a recent fix to make
> internal_timing less sensitive to time changes.

It does have NTP running, but it keeps the time in sync without making
stepwise changes.
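
For context, here is a minimal sketch of why a timing loop can be sensitive to clock steps at all. It is not taken from Asterisk or from the internal timing patch; `now_ms()` and `frames_due()` are hypothetical helpers, and the 20 ms frame length is an assumption. If the elapsed time is derived from a clock that can be stepped backwards, no frames are due until the clock catches up again, whereas CLOCK_MONOTONIC is not affected by such steps.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: current time of the given clock in milliseconds.
 * CLOCK_REALTIME can be stepped by an administrator or by NTP;
 * CLOCK_MONOTONIC never jumps backwards. */
int64_t now_ms(clockid_t clk)
{
    struct timespec ts;

    clock_gettime(clk, &ts);
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Hypothetical helper: number of whole frames due since start_ms.
 * A negative elapsed time (clock stepped backwards) yields 0 frames,
 * which is how a stream could appear to stall. */
int64_t frames_due(int64_t start_ms, int64_t now, int frame_ms)
{
    int64_t elapsed = now - start_ms;

    assert(frame_ms > 0);
    return elapsed < 0 ? 0 : elapsed / frame_ms;
}
```

With a 20 ms frame, `frames_due(1000, 1100, 20)` yields 5 frames, while a clock that has gone back to 900 yields none until it passes 1000 again.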

Actually, I think the problem might have been something I did to channel.c
a long time ago. Back in January 2008 I posted a quite detailed technical
question about CHECK_BLOCKING, and then a follow-up to it:

http://lists.digium.com/pipermail/asterisk-dev/2008-January/031529.html
http://lists.digium.com/pipermail/asterisk-dev/2008-January/031537.html

Unfortunately, I never had ANY response to either of those messages
(I don't often post, but when I do it's usually something deep and tricky
and I seldom get a response, which is disappointing).

Based on my understanding at the time, I had commented out some of the
calls in ast_write() to CHECK_BLOCKING() and the matching clear of the
flag AST_FLAG_BLOCKING. Since another box which is much busier than the
affected one did not have these changes, and does not exhibit the problem,
I think it's quite possible these changes were wrong (if so, there must
be a better way to fix my original 2008 problem). My guess is
that chan->blocker was not set at some point when it was needed, and so
a thread that needed waking with SIGURG was not being woken.
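
The suspected failure mode can be sketched in isolation as follows. This is not Asterisk code: `struct demo_chan`, `blocked_io()`, `soft_wake()` and `demo_wake()` are hypothetical names, and `poll()` with a 500 ms timeout merely stands in for a blocking I/O call. The sketch assumes the waker only signals a thread whose blocking flag is set, so if the blocker is never recorded, no SIGURG is sent and the blocked thread is left to run out its full timeout.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>

/* Hypothetical stand-in for a channel; NOT Asterisk's struct ast_channel. */
struct demo_chan {
    pthread_mutex_t lock;
    int blocking;       /* plays the role of AST_FLAG_BLOCKING */
    pthread_t blocker;  /* plays the role of chan->blocker */
    int record_blocker; /* 0 simulates the bookkeeping being commented out */
    int interrupted;    /* 1 if the blocking call was cut short by a signal */
};

/* SIGURG is ignored by default, so a handler must be installed
 * for the signal to interrupt poll(). The handler itself does nothing. */
static void on_sigurg(int sig) { (void)sig; }

/* Thread body: optionally record ourselves as the blocker, then block. */
static void *blocked_io(void *arg)
{
    struct demo_chan *c = arg;
    int r, err;

    if (c->record_blocker) {
        pthread_mutex_lock(&c->lock);
        c->blocker = pthread_self();
        c->blocking = 1;
        pthread_mutex_unlock(&c->lock);
    }
    r = poll(NULL, 0, 500); /* stands in for a blocking I/O call */
    err = errno;            /* save errno before any other call */
    pthread_mutex_lock(&c->lock);
    c->interrupted = (r < 0 && err == EINTR);
    c->blocking = 0;
    pthread_mutex_unlock(&c->lock);
    return NULL;
}

/* Signal the recorded blocker, if any. Returns 1 if a signal was sent. */
static int soft_wake(struct demo_chan *c)
{
    int sent = 0;

    pthread_mutex_lock(&c->lock);
    if (c->blocking) {
        pthread_kill(c->blocker, SIGURG);
        sent = 1;
    }
    pthread_mutex_unlock(&c->lock);
    return sent;
}

/* Run one experiment. Returns 1 if the blocked thread was woken early. */
int demo_wake(int record_blocker)
{
    struct demo_chan c;
    struct sigaction sa;
    pthread_t t;
    int rc;

    memset(&c, 0, sizeof(c));
    pthread_mutex_init(&c.lock, NULL);
    c.record_blocker = record_blocker;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigurg;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGURG, &sa, NULL);

    rc = pthread_create(&t, NULL, blocked_io, &c);
    assert(rc == 0);
    poll(NULL, 0, 100); /* give the worker time to enter its poll() */
    soft_wake(&c);
    pthread_join(t, NULL);
    pthread_mutex_destroy(&c.lock);
    return c.interrupted;
}
```

Calling `demo_wake(1)` should report that the worker was interrupted, while `demo_wake(0)` leaves it blocked for the whole timeout because `soft_wake()` finds no blocker to signal. The 100 ms head start is a simplification to keep the sketch short; it is not a substitute for proper synchronisation.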

I have restored the commented-out calls and will now have to wait to see
whether the RTP problem recurs or not.

I see the code in question is still the same in SVN trunk.

I would be very interested to hear from anyone able to comment on this,
and especially in any comments, however belated, on my two original
messages linked above.

Thanks,
Tony

-- 
Tony Mountifield
Work: tony at softins.co.uk - http://www.softins.co.uk
Play: tony at mountifield.org - http://tony.mountifield.org