[asterisk-users] AMI timeouts

Alexander Frolkin avf at eldamar.org.uk
Mon Jul 15 07:17:05 CDT 2013


Hi,

> >  1. Java process sends a request (e.g., add member to queue)
> Do you see the TCP ACK coming back from Asterisk?

Yes, I do.

> During the quiet period while you're waiting for the response, do you
> receive events over that AMI connection?

Yes.

> Are there other actions that you're attempting to execute?

In the particular case I'm looking at right now, yes (there's a
QueuePause action followed closely by an Originate action).
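
Since events and responses to other actions share the same AMI connection, each response has to be matched to its request through the ActionID header, which Asterisk echoes back. Below is a minimal Python sketch of that matching, handy for checking a capture by hand. The helper names, the interface SIP/1001, the ActionID values and the sample stream are all made up for illustration; this is not how the Java app does it.

```python
def build_action(name, action_id, **headers):
    """Serialise one AMI action: 'Key: value' lines, CRLF line ends, blank line last."""
    lines = [f"Action: {name}", f"ActionID: {action_id}"]
    lines += [f"{key}: {value}" for key, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_message(block):
    """Parse one AMI message (without its terminating blank line) into a dict."""
    fields = {}
    for line in block.split("\r\n"):
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def find_response(stream_text, action_id):
    """Return the first response carrying the given ActionID, skipping events."""
    for block in stream_text.split("\r\n\r\n"):
        message = parse_message(block)
        if "Response" in message and message.get("ActionID") == action_id:
            return message
    return None


# Hypothetical request and a made-up slice of what might come back on the socket.
request = build_action("QueuePause", "pause-42", Interface="SIP/1001", Paused="true")
sample_stream = (
    "Event: QueueMemberPaused\r\nQueue: support\r\n\r\n"
    "Response: Success\r\nActionID: originate-7\r\n\r\n"
    "Response: Success\r\nActionID: pause-42\r\nMessage: ok\r\n\r\n"
)
print(find_response(sample_stream, "pause-42"))
```

With a unique ActionID per request, an interleaved event or the reply to the neighbouring Originate cannot be mistaken for the QueuePause response.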

> Is there any consistency as to which commands are getting delayed?

Here's a breakdown from the last two weeks:

Timeouts Total      %   Action
~~~~~~~~ ~~~~~   ~~~~   ~~~~~~
     178  7478   2.38   Command
       5  1870   0.26   DBDel
     804 13549   5.93   QueueAdd
    2894 55621   5.20   QueuePause
     660 13856   4.76   QueueRemove

So it appears that most of the delays are from the queue module, which
is understandable, because that's doing most of the work in our set-up.
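
For anyone who wants to check the figures, here is a short Python sketch that recomputes the per-action timeout rate from the counts in the table and also works out what share of all timeouts the three queue actions make up. The counts are copied from the table; the variable and function names are arbitrary, and the last digit may differ from the table depending on rounding.

```python
# (timeouts, total requests) per AMI action, copied from the table.
counts = {
    "Command": (178, 7478),
    "DBDel": (5, 1870),
    "QueueAdd": (804, 13549),
    "QueuePause": (2894, 55621),
    "QueueRemove": (660, 13856),
}


def timeout_rate(timeouts, total):
    """Timeouts as a percentage of the total number of requests."""
    return 100.0 * timeouts / total


rates = {name: timeout_rate(t, n) for name, (t, n) in counts.items()}

# Share of all timeouts that belong to the Queue* actions.
queue_timeouts = sum(t for name, (t, _) in counts.items() if name.startswith("Queue"))
all_timeouts = sum(t for t, _ in counts.values())
queue_share = 100.0 * queue_timeouts / all_timeouts

for name, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(f"{name:12s} {rate:5.2f}%")
print(f"Queue actions: {queue_share:.1f}% of all timeouts")
```

By this count the three queue actions account for roughly 96% of the timeouts, and each of them times out about twice as often as Command does.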

> There are any number of reasons why the response would be delayed, but
> the >25 seconds delay you're seeing is excessive for any of the
> reasons I can think of.

It turns out the timeout in the Java app is set to only 3 seconds, not
5 as I said in my previous email.

What would be a reasonable delay time?  In the case I'm looking at right
now, the longest I can see is 7.2s.

Looking in the Java app logs, I can see it occasionally (166 times over
the last two weeks) timing out after five retries, which means it failed
to get a response to any of the retries within three seconds.
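
To make the retry behaviour concrete, here is a rough Python sketch of the kind of loop described above: send the action, wait up to three seconds for a reply, and try again up to five times. It is an assumption about the shape of the logic, not the Java app's actual code; `send_with_retries`, `send` and `responses` are hypothetical names, and it treats "five retries" as five attempts in total sent back to back.

```python
import queue


def send_with_retries(send, responses, timeout=3.0, attempts=5):
    """Send an action and wait for a reply, re-sending on each timeout.

    send      -- callable that transmits the action (hypothetical)
    responses -- queue.Queue that a reader thread fills with replies
    Returns (reply, attempts_used), or (None, attempts) if every attempt timed out.
    """
    for attempt in range(1, attempts + 1):
        send()
        try:
            return responses.get(timeout=timeout), attempt
        except queue.Empty:
            continue  # no reply within the timeout, so try again
    return None, attempts


# Usage with a stand-in sender that only gets an answer on the third attempt.
replies = queue.Queue()
sent = []


def fake_send():
    sent.append("QueuePause")
    if len(sent) == 3:
        replies.put("Response: Success")


print(send_with_retries(fake_send, replies, timeout=0.05))
```

Under these assumptions a complete failure means no reply for 5 x 3 = 15 seconds, while a reply that takes 7.2s would only show up during the third attempt, after the same action had already been sent three times.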

> Packet loss could cause delays in getting responses, but
> usually not for the lengths of times you're talking about.

There's nothing in the packet capture to indicate packet loss.

Perhaps I should mention another issue that we've seen (and worked
around) previously.  Our Asterisk uses ODBC to talk to an Oracle
database for realtime peers, for func_odbc and, back then, for CEL.
The issue was that whenever a job running against the database slowed
it down, Asterisk dropped calls with the message
"no reply to our critical packet".  As soon as we changed the database
job to run at night (when the call centre is closed), this problem went
away.  It feels like Asterisk was stuck waiting for the database and
missed the critical packets when they were, in fact, there.


Thanks for your help!


Alex



