[Asterisk-Dev] Integration of AGI and Management API

Thu Aug 12 08:18:27 MST 2004

We have the luxury (or curse, depending on your point of view) of having over
100 phones connected to Sipura SIP adapters and another few dozen other SIP
phones, so when we do load tests we actually use live SIP calls from SIP
devices. Another way to simulate traffic (if you have a quad T1 card) would
be to loop two T1 ports into the other two T1 ports and use the manager to
dial out on one Zap channel so that the call comes in on another Zap channel
as an incoming call. This would at least give you 96 Zap channels of load,
which would help you do some basic load testing. You may be able to do the
same thing with SIP if you use a machine as a port forwarder and have it
send SIP calls back to the Asterisk server, but I'm not exactly sure how you
would set that up.
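
For the looped-T1 approach, the manager command that places each test call
might be built roughly as follows. This is only a sketch, not a tested load
generator: Originate and its Channel/Context/Exten/Priority headers are
ordinary manager fields, but the helper name build_originate and the context
and extension passed to it are made up for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Format a Manager "Originate" action that dials out on Zap/<zap_channel>.
 * With the T1 ports looped, the call should arrive back on another Zap
 * channel as an incoming call. "context" and "exten" say where the outbound
 * leg lands in the dialplan; both are placeholders chosen by the caller.
 * Returns the number of bytes written, or -1 if buf is too small.
 */
int build_originate(char *buf, size_t len, int zap_channel,
                    const char *context, const char *exten)
{
    int n = snprintf(buf, len,
                     "Action: Originate\r\n"
                     "Channel: Zap/%d\r\n"
                     "Context: %s\r\n"
                     "Exten: %s\r\n"
                     "Priority: 1\r\n"
                     "\r\n",
                     zap_channel, context, exten);
    if (n < 0 || (size_t)n >= len)
        return -1;
    return n;
}
```

Calling this once per outbound Zap channel and writing each result to an open
manager connection would, in principle, bring up one looped call per pair of
channels.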

As for where the actual bottleneck that locks the manager up is, I really
don't know. The last time I tried to figure it out I had full debugging
turned on with 50 SIP channels running, and I ended up with hundreds of lines
of debug output per second. I was never able to figure out exactly what
caused it, except that it seemed to have something to do with an Asterisk
routine being run on all SIP channels at the same time, intermittently. At
that point I made my Queue system a lot more fault tolerant and I haven't
had any problems since.

And I don't think I mentioned it yet, but our Asterisk servers run on P4
2.6GHz with 2GB RAM. We are going to try using a Prescott P4 3.2GHz (1MB
cache) next week; I'll post if there are any significant differences between
the two setups.

Well, I'm in the Tampa Bay area of Florida and we have a hurricane coming,
so I won't be able to stay on this thread for the next few days. I'm off to
the store to get supplies and I have to get ready to board up the windows.

Let me know how the testing goes,

MATT---

-----Original Message-----
From: David Pollak [mailto:dpp-asterisk at projectsinmotion.com]
Sent: Wednesday, August 11, 2004 6:11 PM
To: asterisk-dev at lists.digium.com
Subject: Re: [Asterisk-Dev] Integration of AGI and Management API

Matt,

Thanks for your very clear, well reasoned comments.  

mattf wrote: 

OK, I think I understand what you're shooting for a little better now:
you're looking to have a single app server act as a dynamic remote dialplan
for 25 Asterisk servers. I like the idea, but I'd really like to see what
your results are running it on a loaded server (with 100 SIP channels
running).

Can you point me to a load generator?  Being able to simulate the 100 call
load would be a GoodThing (tm)

The reason I asked if you had load tested is because the manager will spit
out a large amount of linear output even at just 30 channels of concurrent
use. At 100 channels and with passing and parsing more data because of AGI,
will your app server be able to keep up with all of the data coming in while
also sending commands out and keeping track of all of it? And will the
Asterisk server be able to keep up with no significant manager output
pauses?

My app server can handle the traffic.  The socket read thread just reads the
message from the socket, puts the message in a queue that's handled by a
worker thread.  The socket read threads can be prioritized over the worker
threads.  Keeping the socket read buffer clear is not a problem.
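
The reader/worker split described above can be sketched as a small bounded
queue. The app server in question is a J2EE system, so this C version only
illustrates the pattern; the names msg_queue, queue_push and queue_pop and
both size limits are assumptions made for the example.

```c
#include <pthread.h>
#include <string.h>

#define QUEUE_SLOTS 256   /* capacity; arbitrary for this sketch */
#define MSG_MAX     512   /* longest message kept; arbitrary */

struct msg_queue {
    char            slots[QUEUE_SLOTS][MSG_MAX];
    int             head;      /* next slot to pop */
    int             count;     /* slots currently filled */
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
};

void queue_init(struct msg_queue *q)
{
    q->head = 0;
    q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
}

/* Called by the socket read thread. It never waits on a full queue: it
 * returns 0 instead, so the reader can go straight back to the socket. */
int queue_push(struct msg_queue *q, const char *msg)
{
    int ok = 0;
    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_SLOTS) {
        int tail = (q->head + q->count) % QUEUE_SLOTS;
        strncpy(q->slots[tail], msg, MSG_MAX - 1);
        q->slots[tail][MSG_MAX - 1] = '\0';
        q->count++;
        ok = 1;
        pthread_cond_signal(&q->nonempty);
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Called by a worker thread. Sleeps until a message is available, then
 * copies it into out, which must hold at least MSG_MAX bytes. */
void queue_pop(struct msg_queue *q, char *out)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->nonempty, &q->lock);
    strcpy(out, q->slots[q->head]);
    q->head = (q->head + 1) % QUEUE_SLOTS;
    q->count--;
    pthread_mutex_unlock(&q->lock);
}
```

The point of the design is that the reader's only work per message is one
copy under a short lock, so parsing and bookkeeping never delay the next
socket read.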

Here are some numbers to look at:
Using the example of a calling card AGI app: you have the customer dial in,
enter their phone number as a passcode, then enter the phone number they
want to call. Let's say that for load testing's sake we have 100 people dial
in to use the calling card program at once. In the first 3 seconds you will
receive 3000 lines of manager output and have to send out at least 1500
lines of manager commands and receive back another 600 lines to play a sound
and prepare to collect DTMF input. Then for each of the 20 DTMF signals sent
for each of those channels you will have to receive 5 lines of output and
send 5 lines of input for each of the 100 channels (another 5000 lines of
output and 5000 lines of input). Then you will initiate the new call by
Redirecting the SIP channel to an outbound channel, creating 10 more output
lines and 6 more input lines each. And when the calls terminate you will have
another 26 lines of output. So for the first 20 seconds at least you are
looking at 15,000 lines of input/output for the manager interface at a bare
minimum. This doesn't seem like much data when you compare it to a database
or a web server, but moving 300,000 bytes through the manager interface in 20
seconds is a rather large amount because it has not been optimized to handle
it. In this scenario you may even run into the buffer limiter built into the
manager to prevent kernel panics, which would result in non-delivered data.
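
The estimate above can be tallied mechanically. The sketch below takes the
per-step figures as quoted and assumes the 26 hangup lines are per channel;
the struct and function names are invented for the example. Multiplied out
as stated (20 digits at 5 lines each way for 100 channels), the DTMF step
alone comes to 10,000 lines per direction, so the 15,000-line figure reads as
a floor rather than a total.

```c
/* Manager lines exchanged for one calling-card call, per channel.
 * "out" is output from Asterisk, "in" is commands sent to it. */
struct call_profile {
    int setup_out;        /* 3000 lines / 100 callers */
    int setup_in;         /* 1500 lines / 100 callers */
    int setup_reply_out;  /* 600 lines / 100 callers */
    int digits;           /* DTMF digits entered per call */
    int digit_out;        /* lines received per digit */
    int digit_in;         /* lines sent per digit */
    int redirect_out;     /* lines received for the Redirect */
    int redirect_in;      /* lines sent for the Redirect */
    int hangup_out;       /* lines received at hangup (assumed per channel) */
};

/* Lines in both directions for a single call. */
long lines_per_call(const struct call_profile *p, int include_hangup)
{
    long n = p->setup_out + p->setup_in + p->setup_reply_out;
    n += (long)p->digits * (p->digit_out + p->digit_in);
    n += p->redirect_out + p->redirect_in;
    if (include_hangup)
        n += p->hangup_out;
    return n;
}

/* Lines in both directions for all simultaneous callers. */
long total_lines(const struct call_profile *p, int callers, int include_hangup)
{
    return lines_per_call(p, include_hangup) * callers;
}
```

With the profile {30, 15, 6, 20, 5, 5, 10, 6, 26} and 100 callers this gives
26,700 lines before hangup and 29,300 including it.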

Yep. I looked at the code.  If the write blocks for more than 1 ms, the
message is thrown away.  I don't think the Asterisk write/App Server read
socket will block (see the comments above.)
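
The drop-on-block behaviour described here can be illustrated with a
bounded-wait write. This is not the manager.c code, just a minimal sketch of
the idea using poll(); the function name send_or_drop and its return
convention are assumptions.

```c
#include <poll.h>
#include <unistd.h>
#include <sys/types.h>

/*
 * Try to write one message, waiting at most timeout_ms for the descriptor
 * to become writable. Returns 1 if the whole message was written, 0 if it
 * was dropped (not writable in time, or only partly written), -1 on error.
 * Caveat: on a blocking descriptor a large write can still stall after
 * poll() reports it writable; the sketch ignores that case.
 */
int send_or_drop(int fd, const char *msg, size_t len, int timeout_ms)
{
    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;

    int ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return -1;
    if (ready == 0 || !(pfd.revents & POLLOUT))
        return 0;               /* reader is not keeping up: drop */

    ssize_t n = write(fd, msg, len);
    if (n < 0)
        return -1;
    return (size_t)n == len ? 1 : 0;
}
```

A client that always drains its socket promptly should never see the drop
path, which is the argument being made above.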

I did a test sending MAGI commands from the App Server to Asterisk.  I was
able to sustain 500 messages per second (500 AGI no-op commands) sent to
Asterisk with the response sent back to my app server.  There was no
significant CPU utilization, indicating that the parsing of the text buffers
was not taking much time.  I sustained the 500 messages per second over 60
seconds with no lost traffic.  That's about 10K bytes per second through the
Manager.  That seems like a reasonable amount of traffic.  Granted, the
traffic was not on a loaded machine, but given the lack of CPU utilization,
I think it should pass the tests you've outlined above.

Another thing to consider is the SIP channel system load spikes that occur
on Asterisk. At intermittent times (anywhere from 5 minutes to 2 hours
apart), Asterisk will perform some actions on running SIP channels all at
once. If you are running more than 24 channels at the time this happens you
will see the system load spike and the manager output will pause for
an amount of time proportional to the number of active channels you have
going (anywhere from 1-15 seconds on my systems). When the connection
unpauses, all of the commands sent to the manager while it was paused will
be executed at once and you will receive a flood of manager output data
really fast. Now this does not affect the dialplan operation, but it does
mess with the manager. I have been able to reproduce this and have proven
that this does happen to SIP channels even in a controlled lab environment.

Can you send me a way to reproduce this situation?  Looking at the Manager
code, I can't seem to find a place in the manager that would block based on
what's happening on SIP channels.  Have you reproduced this under the
debugger and looked at what's blocking where?

Your suggestion is a great idea for how to have Asterisk function more
easily in a larger environment, and if it works on loaded systems it is
something that I would love to use on my systems. But I think you need to do
some load testing and see what the Asterisk Manager can handle before you
put more effort into finalizing an application that may not work on the
scale you are planning on.

Once again, thanks for your well reasoned comments.  Once I get a load
tester, I'll try running the puppy with 100 SIP channels (maybe all running
meetme) and then send a ton of MAGI traffic and see if I can get the 500
commands per second.

Thanks,

David

MATT---

-----Original Message-----
From: David Pollak [mailto:dpp-asterisk at projectsinmotion.com]
Sent: Wednesday, August 11, 2004 2:08 PM
To: asterisk-dev at lists.digium.com
Subject: Re: [Asterisk-Dev] Integration of AGI and Management API

Matt,

I think we agree on the goals:

*	To minimize the number of threads and sockets on the machine running
Asterisk to maximize the performance of the machine running Asterisk 

*	To minimize the changes to Asterisk, but maximize the flexibility of
Asterisk 

*	To minimize the number of open Manager API connections 

My proposal (and the code that I have already developed) allows the posting
of AGI commands from the Manager API to a given channel and sending the
response code for the commands from res_agi.c to the Manager.  The code
simply pipes AGI commands from a source other than a pipe that's dedicated
to a single AGI session.

Put another way, AGI is a lot like early CGI on web servers.  Over time,
systems like mod_perl, etc. were developed to reduce load on systems.  What
the merger between Manager API and AGI does is allow a single manager
connection to manage the execution of applications on an arbitrary number of
channels.

mattf wrote:

When you say 100 concurrent channels do you mean 50 SIP phones connected to
50 Zap channels or 100 SIP phones in conversations with 100 other end
points?

The initial system that we're building primarily performs IVR.  All of the
channels will be terminated SIP calls.

Have you done any load testing? What kind of system configuration are you
planning on?

Each of the Asterisk servers will be P4 3.x GHz systems running Intel's 875
chipset.

How exactly would you go about "creating" a calling card AGI through the
manager?

Sending (or piping) the AGI commands through the Manager API so that they
are posted to a Channel.  The exec_agi() routine in res_agi.c dequeues the
AGI command from the channel rather than reading it from a file descriptor.
Once the command is read or dequeued, it's executed and the results are sent
back to the fd or sent as a manager_event.
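
The enqueue/dequeue hand-off can be pictured as a simple per-channel FIFO.
The sketch below is not the res_agi.c change itself: the type and function
names and both length limits are hypothetical, and locking is left out on
the assumption that the channel lock would protect the queue.

```c
#include <stdlib.h>
#include <string.h>

#define CMD_LEN 256   /* assumed limit; the real MAX_AGI_CMD_LEN is not shown */
#define ID_LEN  64    /* assumed limit for the caller-supplied id */

struct magi_cmd {
    char cmd[CMD_LEN + 1];
    char uniqueid[ID_LEN + 1];
    struct magi_cmd *next;
};

struct cmd_queue {
    struct magi_cmd *head;   /* oldest command, next to run */
    struct magi_cmd *tail;   /* newest command */
};

/* Manager side: append a command. Returns 1 on success, 0 if out of memory. */
int cmd_enqueue(struct cmd_queue *q, const char *cmd, const char *id)
{
    struct magi_cmd *c = calloc(1, sizeof *c);
    if (!c)
        return 0;
    /* calloc zeroed the buffers and each is one byte longer than the copy
     * limit, so both strings stay NUL-terminated even when truncated. */
    strncpy(c->cmd, cmd, CMD_LEN);
    strncpy(c->uniqueid, id, ID_LEN);
    if (q->tail)
        q->tail->next = c;
    else
        q->head = c;
    q->tail = c;
    return 1;
}

/* AGI side: take the oldest command, or NULL if none. Caller frees it. */
struct magi_cmd *cmd_dequeue(struct cmd_queue *q)
{
    struct magi_cmd *c = q->head;
    if (!c)
        return NULL;
    q->head = c->next;
    if (!q->head)
        q->tail = NULL;
    c->next = NULL;
    return c;
}
```

Carrying the caller's id alongside each command is what lets the result,
when it comes back as a manager event, be matched to the request that
caused it.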

What's wrong with just using AGI scripts?

Please see my original posting.  With AGI scripts, there's a separate
process created for each AGI that's executed.  Hooking the AGI process to my
central J2EE server means 2 open sockets for each running AGI.  Monitoring
100 channels on 25 machines means 2,500 open sockets to my J2EE server.  On
the other hand, sending AGI commands via the Manager API means only 25 open
sockets for my J2EE server and only 1 open connection from the Asterisk
instance to my server.
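
The socket counts compare as follows. This is just the arithmetic from the
paragraph above written out, counting one app-server socket per running AGI
in the first model; the function names are made up.

```c
/* Per-AGI model: the app server holds one socket for every running AGI. */
long sockets_per_agi_model(long servers, long channels_per_server)
{
    return servers * channels_per_server;
}

/* Manager model: one connection per Asterisk server, however many
 * channels that server is running. */
long sockets_manager_model(long servers)
{
    return servers;
}
```

For 25 servers at 100 channels each that is 2,500 sockets against 25; at the
500-server target it would be 50,000 against 500.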

A busy Asterisk server with 100 concurrent manager connections will not
work, especially if you want to actually use those manager connections for
anything.

I think you're missing the point.  There will only be 1 open manager
instance per Asterisk instance.  That single manager instance will control
all the open channels.

Attempting to run interactive programs like AGIs through the manager API
would complicate the manager API by adding many new inputs, outputs and
parsing rules to it and would slow it down even more, not to mention the
fact that you are depending on that client socket connection to keep flowing
perfectly in order to run your phone system and I can tell you that the
manager API is not fault tolerant enough to handle that kind of volume,
complexity and data-delivery-assurance with any kind of reliability.

Nope.  Here's the total change to manager.c (okay, there's one other change
where the action_magi function is registered, but that's an additional line
in an array of functions):
static int action_magi(struct mansession *s, struct message *m)
{
  char *command = astman_get_header(m, "Command");
  char *channel = astman_get_header(m, "Channel");
  char *id = astman_get_header(m, "Uniqueid");

  if (!command || ast_strlen_zero(command)) {
    astman_send_error(s, m, "Command not specified");
    return 0;
  }

  if (!channel || ast_strlen_zero(channel)) {
    astman_send_error(s, m, "Channel not specified");
    return 0;
  }

  if (!id || ast_strlen_zero(id)) {
    astman_send_error(s, m, "UniqueID not specified");
    return 0;
  }

  struct ast_channel *theChan = ast_get_channel_by_name_locked(channel);

  if (!theChan) {
    astman_send_error(s, m, "Channel not found");
    return 0;
  }

  // allocate the structure here... it will be
  // free'ed if enqueuing fails or if 
  struct ast_magi *magi = ast_magi_new();

  if (!magi) {
    astman_send_error(s, m, "MAGI not allocated -- out of memory");
    ast_mutex_unlock(&theChan->lock);
    return 0;
  }

  // being smart, cmd is 1 char longer than MAX_AGI_CMD_LEN
  // so we strncpy, and then set the last char to 0
  strncpy(magi->cmd, command, MAX_AGI_CMD_LEN);
  magi->cmd[MAX_AGI_CMD_LEN] = 0;

  // see above re field lengths
  strncpy(magi->uniqueid, id, MAX_MAGI_UNIQUE_ID);
  magi->uniqueid[MAX_MAGI_UNIQUE_ID] = 0;

  // if the add fails, this routine cleans up
  // the magi
  int added = ast_add_magi_to_channel(theChan, magi, 0);

  ast_mutex_unlock(&theChan->lock);

  if (!added) {
    astman_send_error(s, m, "MAGI not added to channel");
    return 0;
  }

  ast_cli(s->fd, "Response: Success\r\n"
      "Uniqueid: %s\r\n\r\n",
      id);

  return 0;
}
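
On the client side, a request that supplies the three headers this handler
reads (Command, Channel, Uniqueid) might be built as below. This is a
sketch: the action name "MAGI" is an assumption, since the registration line
is not shown, and build_magi_action is a made-up helper.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Format a manager action carrying one AGI command for one channel.
 * Returns the number of bytes written, or -1 if buf is too small.
 */
int build_magi_action(char *buf, size_t len, const char *channel,
                      const char *uniqueid, const char *agi_command)
{
    int n = snprintf(buf, len,
                     "Action: MAGI\r\n"      /* assumed action name */
                     "Channel: %s\r\n"
                     "Uniqueid: %s\r\n"
                     "Command: %s\r\n"
                     "\r\n",
                     channel, uniqueid, agi_command);
    if (n < 0 || (size_t)n >= len)
        return -1;
    return n;
}
```

The Uniqueid value is chosen by the client; the handler echoes it in its
"Response: Success" reply so the client can pair the two.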

All of the parsing for the commands is done in res_agi, and the commands are
treated exactly like the commands that come in from the pipe/file descriptor
that the current AGI monitors.

If you can find anything in this code that will reduce the stability of the
Manager or create a bottleneck, please let me know.

Thanks,

David

MATT---

-----Original Message-----
From: David Pollak [mailto:dpp-asterisk at projectsinmotion.com]
Sent: Tuesday, August 10, 2004 10:58 PM
To: asterisk-dev at lists.digium.com
Subject: Re: [Asterisk-Dev] Integration of AGI and Management API

My application will initially have 25 Asterisk servers managed by 1
application server.  That number must be scalable to at least 500 Asterisk
servers.

Given the initial 25 servers, with 100 open channels each, that's 2,500
channels to manage.

So, I've been looking over the Asterisk source and I had an idea... what
if there was a merger of the Manager API and AGI.  Here's specifically
what I was thinking:

It looks like you've already done that above.  You already
started using the AGI and the AMI, so that's where the
merger is.  I could be wrong on this, but the rest of your
message concerns how to put these two things back together.
So you are making a circle, when a straight line would do.

I have not found a straight line from the Manager API that allows the
execution of an application on a given channel with all the normal responses
(e.g., waiting for DTMF, etc.)  If you can show me how to do that, I'd
really appreciate it.

Yep... I can see originating a call to an application, but I cannot see how
to respond to an existing channel.  For example, how would one create the
Calling Card sample AGI application by originating a call?

To my mind, all AGI scripts represent dynamic dial plans.  I haven't done
the full analysis, but it seems that dial plans themselves are Turing
complete.  However, it's really, really hard to build complex applications
using dial plans.  Thus, AGI scripts allow for a better language to write
dynamic dial plans.  My changes are simply to allow the AGI commands to be
sent to a channel via the Manager API rather than via a pipe to a separate
process.  In res_agi.c, the next command is recalled from the Channel rather
than by reading from the pipe.  That's the fundamental difference.

I don't think that works.  I think the AGI was added to Asterisk because the
ability to control a channel via the Manager API is limited.  My changes
have simply added a new way in which an AGI script can send commands to a
Channel that's expecting AGI commands.

Thanks,

David
