[asterisk-dev] [Code Review] 3568: Reduce creation of channel snapshots to improve performance

Matt Jordan reviewboard at asterisk.org
Fri Jun 13 13:10:25 CDT 2014


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/3568/
-----------------------------------------------------------

(Updated June 13, 2014, 1:10 p.m.)


Status
------

This change has been marked as submitted.


Review request for Asterisk Developers.


Changes
-------

Committed in revision 5119


Bugs: ASTERISK-23811
    https://issues.asterisk.org/jira/browse/ASTERISK-23811


Repository: Asterisk


Description
-------

During some performance testing of Asterisk with AGI, ARI, and lots of Local channels, we noticed that there's quite a hit in performance during channel creation and releasing to the dialplan (ARI continue). After investigating the performance spike that occurs during channel creation, we discovered that we create a lot of channel snapshots that are technically unnecessary. This includes creating snapshots during:

* AGI execution
* Returning objects for ARI commands
* During some Local channel operations
* During some dialling operations
* During variable setting
* During some bridging operations

And more.

This patch does the following:
 - It removes a number of fields from channel snapshots. These fields were rarely used, were expensive to have on the snapshot, and hurt performance. This included formats, translation paths, Log Call ID, callgroup, pickup group, and all channel variables. As a result, AMI Status, "core show channel", "core show channelvar", and "pjsip show channel" were modified to either hit the live channel or not show certain pieces of data. While this is unfortunate, the performance gain from this patch is worth the loss in behaviour.
 - It adds a mechanism to publish a cached snapshot + blob. A large number of publications were changed to use this, including:
   - During Dial begin
   - During Variable assignment (if no AMI variables are emitted - if AMI variables are set, we have to make snapshots when a variable is changed)
   - During channel pickup
   - When a channel is put on hold/unhold
   - When a DTMF digit is begun/ended
   - When creating a bridge snapshot
   - When an AOC event is raised
   - During Local channel optimization/Local bridging
   - When endpoint snapshots are generated
   - All AGI events
   - All ARI responses that return a channel
   - Events in the AgentPool, MeetMe, and some in Queue
 - Additionally, some extraneous channel snapshots were being made that were unnecessary. These were removed.
 - The result of ast_hashtab_hash_string is now cached in stasis_cache. This reduces a large number of calls to ast_hashtab_hash_string, which reduced the amount of time spent in this function in gprof by around 50%.


Diffs
-----

  /branches/12/res/res_pjsip/pjsip_configuration.c 414971 
  /branches/12/res/res_agi.c 414971 
  /branches/12/res/ari/resource_channels.c 414971 
  /branches/12/main/stasis_channels.c 414971 
  /branches/12/main/stasis_cache.c 414971 
  /branches/12/main/stasis_bridges.c 414971 
  /branches/12/main/pickup.c 414971 
  /branches/12/main/pbx.c 414971 
  /branches/12/main/manager.c 414971 
  /branches/12/main/endpoints.c 414971 
  /branches/12/main/dial.c 414971 
  /branches/12/main/core_local.c 414971 
  /branches/12/main/cli.c 414971 
  /branches/12/main/channel_internal_api.c 414971 
  /branches/12/main/channel.c 414971 
  /branches/12/main/cel.c 414971 
  /branches/12/main/bridge_channel.c 414971 
  /branches/12/main/aoc.c 414971 
  /branches/12/include/asterisk/stasis_channels.h 414971 
  /branches/12/include/asterisk/channel.h 414971 
  /branches/12/apps/app_queue.c 414971 
  /branches/12/apps/app_meetme.c 414971 
  /branches/12/apps/app_agent_pool.c 414971 

Diff: https://reviewboard.asterisk.org/r/3568/diff/


Testing
-------

All sorts of different testing was done. For automated tests that cover these changes, see https://bamboo.asterisk.org/bamboo/browse/ATB-TEAMSP-6. Note that the test failures on that branch are currently also failing on the 12 branch/trunk. A few new tests were written and two Parking test's CDR results were tweaked (but not in a fashion that violates the spec or the intent of the records) - for those changes, see: https://reviewboard.asterisk.org/r/3578/

The first round of testing was profiling Asterisk with gprof to see if I could reduce the amount of time spent creating channel snapshots. This was tested using an ARI script that created a Local channel, subscribed to it, then released that channel to the dialplan. The released Local channel would Dial 10 or so Local channels, which went back into the Stasis application. This increased the number of Local channels rapidly on the system - letting this run for awhile (until hilarity ensued) was a good mechanism to test the affects of removing channel snapshots.

Once snapshot creation was no longer easily moved down the gprof list, I went ahead and did a bit of investigation on the effects of this patch on CPU using just the dialplan. The fact that snapshots were getting made a lot during AGI/ARI was clearly causing problems, but since dialplan execution still has to create a snapshot, I was curious what effect this patch would have on more "traditional" scenarios. For each of these rounds of tests, I let Asterisk run with SIPp for about 3000 channels. Watching top, I recorded the min/max that showed up during that period of time. Not exactly scientific - but it gives a general impression of the effect of the changes.

** Note that wherever I reference CPU, it should be noted that this was run on my laptop (which is rather old) while running Firefox, Pidgin, and a whole host of other stuff. I wasn't attempting to run a true performance analysis; instead, I was just trying to get a relative measurement of things with and without these changes. **

** ROUND 1 **

The first used an inbound PJSIP channel that had a variety of variables set on it, and then got hung up:

Dialplan:

exten => test_sip,1,NoOp()
 same => n,Set(var1=foo)
 same => n,Set(var2=bar)
 same => n,Set(var3=yackity)
 same => n,Set(var4=schmackity)
 same => n,Answer()
 same => n,Echo()

SIPp:

sipp 127.0.0.1:5060 -sn uac -i 127.0.0.1 -p 5061 -s test_sip -d 5000

With this patch, CPU was around 9-10%; without this patch, around 10-11%. So not much difference. This isn't too surprising: there's no dialling, there's no interfaces being used, and very few changes being made to the channel. Dialplan execution is the same with/without this patch - the largest 'savings' here is the usage of cached snapshots during variable assignment.

** ROUND 2 **

This round threw in a Local channel and Dial operation. There's a slightly larger improvement here with/without the patch:

Dialplan 2:

exten => test_dial_sip,1,NoOp()
 same => n,Set(var1=foo)
 same => n,Set(var2=bar)
 same => n,Set(var3=yackity)
 same => n,Set(var4=schmackity)
 same => n,Dial(Local/term at default)
 same => n,Echo()

exten => term,1,NoOp()
 same => n,Set(var1=foo)
 same => n,Set(var2=bar)
 same => n,Set(var3=yackity)
 same => n,Set(var4=schmackity)
 same => n,Answer()
 same => n,Echo()

With this patch, CPU was around 14-18%; without this patch, around 16-21%. Saving a few snapshots during Local channel "bridging", dialling, and setting variables on them shaves 2-3% off the CPU.

** ROUND 3 **

Same as round 2, only with a parallel Dial of multiple Local channels. Improvement was actually about the same as Round 2; with the patch, around 19-23% CPU, without the patch, around 22-26% CPU.

Dialplan 3:

exten => test_dial_sip,1,NoOp()
 same => n,Set(var1=foo)
 same => n,Set(var2=bar)
 same => n,Set(var3=yackity)
 same => n,Set(var4=schmackity)
 same => n,Dial(Local/term at default&Local/term at default&Local/term at default&Local/term at default&Local/term at default)
 same => n,Echo()

exten => term,1,NoOp()
 same => n,Set(var1=foo)
 same => n,Set(var2=bar)
 same => n,Set(var3=yackity)
 same => n,Set(var4=schmackity)
 same => n,Answer()
 same => n,Echo()


There's a few things to draw from all of this:
(1) Snapshots aren't free. If you can use a cached version, use it. If you can't, stage the snapshot until all attributes have been changed (which we do quite frequently; this patch did not need to modify any of that behaviour).
(2) Snapshots should be lean. Don't put something on there that doesn't need to be there. While there is benefit in not hitting live data to get information (reduces the interference of inspection commands in APIs on real time operations), it isn't worth hurting the performance of the overall system.
(3) We should pay attention to the cost of things in astobj2. Shifting the burden from creating snapshots to cached ones means the Stasis cache - which uses ao2 heavily - has to have good performance.
(4) It would be worthwhile to see if we can't optimize creation of snapshots further. Often, when we create a snapshot, we only modify one or two properties on it. Usage of memory pools, fixed sized fields with memcpy, and other techniques _may_ improve performance further.


Thanks,

Matt Jordan

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-dev/attachments/20140613/e2681b5d/attachment-0001.html>


More information about the asterisk-dev mailing list