[asterisk-dev] AMI Disconnect/Sudden "Asterisk Call Manager/1.3" received

Tue May 6 12:13:02 CDT 2014

On 05/06/2014 10:56 AM, Olle E. Johansson wrote:
> On 06 May 2014, at 16:11, Daniel McFarlane <daniel at szeto.ca> wrote:
>
>> Hi All,
>>
>> I've been working with Asterisk for 2.5 years now but I am new to the mailing list.  Hoping to get some answers here..
>>
>> I've written a piece of software to control Asterisk via AMI.  I'm able to login, making outbound calls, receive/record calls, handle digits, put people into conferences, etc.  Everything seems to work well, at a low to medium call volume.
>>
>> I've been stress testing the system and found what seems to be a memory issue within Asterisk.  However, after over a day of searching I can't even find any bug reports or anything to bring me closer to a solution..but still hoping to find a solution or that someone can help me find one..
>>
>> I connect to the AMI successfully.  In my test I am making an outbound call through a sip "device" I configured within sip.conf. This sip "device" basically connects back into the same Asterisk I am using to generate the calls.
>>
>> If I generate outbound calls (i.e.: Via Action: Originate) using a low call volume everything seems to be fine.  e.g.: I can generate 10 calls and they all go through and complete successfully (with a total of 20 calls bring processed by the same Asterisk, due to both outbound and inbound calls).
>>
>> The problem arises If I bombard (i.e.: Write all the Originate requests within about 7 seconds) the AMI to generate 97 outbound calls (for a total of 194 channels).  The first Originate commands seem to get processed,  but then all of a sudden (after Asterisk seems to have started processing a good amount of my Originate requests already) Asterisk seems to reset the AMI interface! Without even having a dropped connection I receive a new "Asterisk Call Manager/1.3" string and then previous commands that I issued start coming back with Response: Error and Message: Permission denied.
>>
>> I've enabled debugging and here's the security log output when I connect:
>>
>> [Apr  1 12:43:25] SECURITY[9877] res_security_log.c: SecurityEvent="SuccessfulAuth",EventTV="1396370605-551714",Severity="Informational",Service="AMI",EventVersion="1",AccountID="username",SessionID="0x7f0c540d96b8",LocalAddress="IPV4/TCP/0.0.0.0/5038",RemoteAddress="IPV4/TCP/192.168.1.62/52139",UsingPassword="0",SessionTV="1396370605-551710"
>>
>> (This is the time when I am issuing commands, it is accepting and processing them)
>>
>> Then all of a sudden (when I receive a new "Asterisk Call Manager/1.3" string and commands are rejected with "Permission denied") here is what the log shows:
>>
>> [Apr  1 12:44:36] SECURITY[9877] res_security_log.c: SecurityEvent="RequestNotAllowed",EventTV="1396370676-876795",Severity="Error",Service="AMI",EventVersion="1",AccountID="",SessionID="0x7f0c54072968",LocalAddress="IPV4/TCP/0.0.0.0/5038",RemoteAddress="IPV4/TCP/192.168.1.62/52144",RequestType="Action: Originate",SessionTV="0-0"
>>
>> (Note: How the AccountID is now empty)
>>
>> At this same time the CLI shows:
>>
>>   == Manager '*****' logged off from 192.168.1.**
>>
>> ..yet it wasn't my application that disconnected and/or issued an Action: logoff to Asterisk (my application doesn't even detect a connection drop, so it seems like Asterisk just logs the manager user out!).
>>
>> Why did Asterisk all of a sudden reset itself and/or lose reference to the user I am authenticating with?
>>
>> Note: My maxcalls setting within asterisk.conf was set to default. I even tried upping it to 200 and that didn't help.  My maxload setting was untouched/commented out and therefore set to default (no threshold).  I still tried upping it to 6.0, 8.0 and 10.0 to no avail.  I even uncommented minmemfree and set it to 256 (although calling free -m from a terminal showed their to always be plenty of memory on my system).
>>
>> Does anyone have an idea how to fix/correct this?
> One thing you have to be aware of is that Action:s are atomic. You have to wait for the response of one action before you send the next one. Otherwise they will all end up in the TCP buffer. That's propably what you see - a lot of queued up messages being red after the tcp connection is reset.
>
> That doesn't explain why something resets and send you a new connect message and possibly requires a new login. That seems like a bug that needs to be reported. Can you easily repeat this?
>
> Just my 5 öre. Anyone else with good comments on this?
> /O

Thank you for the quick responses.

First off, as Tony pointed out I did forget to include my Asterisk 
version.  I did realize it right after sending the first message, but 
figured I'd send it in my first response.  Sorry about that. I'm running 
Asterisk 11.2.1.

Olle: When associating with 1 audio channel I do wait for the response 
before issuing the next command.  In this case, however, my software 
internally treats each as a separate call (i.e.: As if I had issued 
Originate requests to separate DAHDI trunks, instead of a IAX/SIP `pipe` 
that handles all the SIP/IAX calls) so I do actually send multiple 
back-to-back Action: Originate commands without waiting for a response 
(but still wait for the OriginateResponse before issuing more commands 
for that same call, as necessary given that I need the specific callno 
from Asterisk before I can issue more commands).  I assumed that 
Asterisk would create a separate IAX/SIP entity/call for each Originate 
response (which it seems to do well under low volume, e.g.: 10 calls) & 
that this would therefore be ok.

Also, after for instance sending 6 Originate requests Asterisk does come 
back with:

e.g.:

Response: Success
ActionID: 819367539
Message: Originate successfully queued

..for each command I issued.  Is this not ok?  I can see how waiting to 
issue a response prior to to issuing another Action: Originate command 
would prevent a lot of TCP commands from being queued in the TCP buffer, 
but still seems like Asterisk should be handling it ok. Well, unless I 
overload it's queue or something..

You are right about the commands all ending up in the TCP buffer.  I 
also concluded that the "Permission denied" messages were the result of 
Asterisk trying to process from the TCP buffer, after it performed it's 
reset.

Yes, I can fairly easily reproduce this.  Here are some details to 
understand on how to reproduce:

1) I have noticed that if I say issue 100 calls in blocks of 10 calls 
every 2 seconds that Asterisk seems to be able to handle it..just not 97 
calls at one time.
2) Strangely, here's one I can't explain.  If I've been connected via 
the AMI port for some time and then start my tests it seems to be ok.  
If I connect, perform a test in blocks (as in #1 above) (kind of like 
warming it up) and then bombard with 97 calls at a time, it seems to 
hold up.
3) The easiest way for me to reproduce it is by disconnecting my 
software and reconnecting just prior to bombarding with 97 calls. Now 
I'm pulling at strings to explain this, but is it possible that Asterisk 
needs some set up time after login or treats this as a kind of spam 
attack?  Perhaps that could explain why it resets the connection?  I'd 
have to do more testing to see if I can get it to reset after being 
connected for a while..

Tony: I am certain it isn't my software performing or detecting a 
disconnect as I have logs to show such errors, however, I will perform a 
tcpdump as you suggested and let you know my results/answer the rest of 
your questions in a further email, once I have completed that test..

Thank You