[asterisk-bugs] [JIRA] (ASTERISK-25905) Memory leak during perf testing

Richard Mudgett (JIRA) noreply at issues.asterisk.org
Wed Apr 13 22:21:56 CDT 2016


    [ https://issues.asterisk.org/jira/browse/ASTERISK-25905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=230240#comment-230240 ] 

Richard Mudgett commented on ASTERISK-25905:
--------------------------------------------

I was able to reproduce what appears to be the same behavior in my setup. The output of “core show taskprocessors” shows that the CDR engine is working off a backlog of tasks caused by the continuous stream of calls from the stress test. Working off that backlog can take quite some time after the calls stop. Unfortunately, “core show taskprocessors” only shows meaningful task processor names in newer Asterisk versions; in v13.1-cert the names are mostly meaningless UUIDs.

Once the CDR engine has worked off the backlog, the memory reported by MALLOC_DEBUG goes back to a normal level.  However, glibc's allocator does not give that memory back to the operating system, because it does not normally return freed memory to the OS.  So when Asterisk gets into this situation and consumes a large portion of system memory, the memory is not returned to the OS until Asterisk is restarted.  glibc provides a malloc_trim() function that tells the allocator to return free memory at the end of the heap to the OS.  For testing purposes, I created a temporary CLI command to invoke malloc_trim().  My test machine has 3GB of memory.  After a stress run that grabbed 62% of available memory, malloc_trim() gave back all but 2.6% to the OS once all the channels and the taskprocessor backlog were gone.  That 2.6% is about the expected size of the heap before the backlog started to grow.  The amount of memory released shows that there isn't a systemic memory leak, because a leak like that would cause severe heap fragmentation and prevent the release of that much memory.
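
For illustration only, here is a minimal standalone sketch of the malloc_trim() behavior described above. It is not the temporary CLI command mentioned in this comment and uses no Asterisk APIs. The helper names (resident_kib, simulate_backlog, trim_heap) are hypothetical, and the sketch assumes Linux with glibc, since it reads /proc/self/statm and calls the glibc-specific malloc_trim().

```c
#include <assert.h>
#include <malloc.h>   /* malloc_trim() is a glibc extension */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Resident set size of this process in KiB, read from /proc/self/statm.
 * Linux only (assumption). Returns -1 if the value cannot be read. */
static long resident_kib(void)
{
    long total_pages, resident_pages;
    FILE *f = fopen("/proc/self/statm", "r");
    if (!f) {
        return -1;
    }
    int fields = fscanf(f, "%ld %ld", &total_pages, &resident_pages);
    fclose(f);
    if (fields != 2) {
        return -1;
    }
    return resident_pages * (sysconf(_SC_PAGESIZE) / 1024);
}

/* Hypothetical stand-in for a task backlog: allocate and touch `count`
 * blocks of `size` bytes, then free them all. Returns 0 on success,
 * -1 if an allocation failed. */
static int simulate_backlog(size_t count, size_t size)
{
    assert(size > 0);
    void **blocks = calloc(count, sizeof(*blocks));
    if (!blocks) {
        return -1;
    }
    size_t made = 0;
    for (; made < count; made++) {
        blocks[made] = malloc(size);
        if (!blocks[made]) {
            break;
        }
        memset(blocks[made], 0xA5, size);  /* touch pages so they become resident */
    }
    for (size_t i = 0; i < made; i++) {
        free(blocks[i]);
    }
    free(blocks);
    return made == count ? 0 : -1;
}

/* Ask glibc to hand free heap memory back to the OS and print the RSS
 * before and after. Returns the RSS reduction in KiB, or 0 if RSS could
 * not be read. The result may be zero or negative if nothing was released. */
static long trim_heap(void)
{
    long before = resident_kib();
    int released = malloc_trim(0);  /* returns 1 if memory was released, else 0 */
    long after = resident_kib();
    printf("malloc_trim(0) returned %d; RSS %ld KiB -> %ld KiB\n",
           released, before, after);
    return (before >= 0 && after >= 0) ? before - after : 0;
}
```

Calling simulate_backlog(100000, 256) followed by trim_heap() from a small main() prints the before and after figures. How much memory is actually released depends on the glibc version and on the heap layout at the time of the call, so the numbers will not match the percentages quoted above.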

Please note that MALLOC_DEBUG does have a performance penalty.  The backlog shown in the task processor queues will go down faster when MALLOC_DEBUG is disabled.

# There is no memory leak.  All that extra memory is being consumed by the backlog of tasks in the task processors.  Once the backlog clears, the memory usage goes back to normal.  But the memory isn't returned to the OS by the glibc allocator.
# It is certain, though, that Asterisk is being sent more calls than it can handle for a sustained period. This is not a fault in Asterisk itself.

> Memory leak during perf testing
> -------------------------------
>
>                 Key: ASTERISK-25905
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-25905
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Applications/app_confbridge, pjproject/pjsip
>    Affects Versions: 13.8.0
>         Environment: Red Hat Enterprise Linux Server release 7.2 (Maipo)
> Linux ykt1cfbprd1 3.10.0-327.13.1.el7.x86_64
> certified/13.8-cert1-rc1
> pjproject-2.4.5
>            Reporter: Robert McGilvray
>            Assignee: Richard Mudgett
>            Severity: Minor
>         Attachments: loadtest.txt, memory-summary.txt
>
>
> ** I've been testing against the certified branch, last cloned yesterday with certified/13.8-cert1-rc1. It would not allow me to select that as a version however ** 
> While using sipp as a generator to load test Asterisk I've come across a memory leak that very quickly exhausts the host of resources. 
> The testing methodology is pretty simple: use sipp to launch 1500 concurrent calls to Asterisk with a call rate of 25/sec. On the Asterisk side use the RAND function to generate two numbers, one of which is the confbridge number and the other (either 0 or 1) is to determine whether to use the moderator profile or participant. The call is then dropped into a ConfBridge for 60s and hung up. 
> After a few thousand completed calls the memory usage grows and eventually exhausts the host resources. I recompiled with MALLOC_DEBUG enabled, the output of memory show allocations is attached. It looks like the allocations are in stasis_channels, well after all channels have been disconnected. 
> {noformat}
> ykt1cfbprd1:/home/netops# ps -C asterisk u
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> asterisk 20927  113  9.2 6292308 3029204 ?     Sl   16:32  24:47 /home/asterisk/asterisk-cert-13.8/sbin/asterisk -f -C /home/asterisk/asterisk-cert
> root     32052  0.0  0.0  47428  2840 pts/0    S+   16:43   0:00 rasterisk risk/asterisk-cert-13.8/sbin/asterisk -r
> {noformat}
> Please let me know if you need any further information.
> Thanks!!





