[asterisk-bugs] [JIRA] (ASTERISK-24555) Memory usage increases constantly (frame.c memory cache)

Mark Michelson (JIRA) noreply at issues.asterisk.org
Mon Dec 1 09:53:29 CST 2014


    [ https://issues.asterisk.org/jira/browse/ASTERISK-24555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=223778#comment-223778 ] 

Mark Michelson commented on ASTERISK-24555:
-------------------------------------------

I have to admit that frame cache internals and chan_iax2 are not two of my strongest points in the Asterisk code-base, but I may be able to shed a bit of light on things here.

The frame code keeps a thread-local cache of up to 10 frames (unless a compile-time constant is changed to a different value). This way, when a thread wants to duplicate a frame, it may be able to grab a cached structure instead of having to allocate a new one. Similarly, when a frame is freed and there is room in the cache, the frame is not actually freed but is added to the cache instead. Because the cache is capped at 10 frames, any thread that processes frames can appear to "leak" up to 10 frames, since those frames are deliberately kept rather than released. Since the cache is thread-local, a destructor function frees the cached frames when the thread is reclaimed.
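To make the caching behaviour concrete, here is a minimal Python model of the logic described above. It is only a sketch and not the Asterisk C code: the names Frame, frame_dup, frame_free, cached_frame_count and FRAME_CACHE_MAX are made up for illustration, and the destructor step is not modelled because Python releases a thread's local data when the thread exits.

```python
import threading

FRAME_CACHE_MAX = 10  # assumed cap, mirroring the "up to 10 frames" described above

_local = threading.local()  # each thread sees its own attributes on this object


class Frame:
    """Hypothetical stand-in for a frame structure; it only carries a payload."""

    def __init__(self, payload=b""):
        self.payload = payload


def _cache():
    # Lazily create the calling thread's private list of cached frames.
    if not hasattr(_local, "frames"):
        _local.frames = []
    return _local.frames


def frame_dup(src):
    """Copy src, reusing a cached Frame object when one is available."""
    cache = _cache()
    frame = cache.pop() if cache else Frame()
    frame.payload = src.payload
    return frame


def frame_free(frame):
    """Give a frame back.

    Returns True if the frame was kept in this thread's cache,
    False if the cache was full and the frame was really released.
    """
    cache = _cache()
    if len(cache) < FRAME_CACHE_MAX:
        frame.payload = b""
        cache.append(frame)
        return True
    return False  # dropped; the garbage collector reclaims it


def cached_frame_count():
    """Number of frames currently held by the calling thread's cache."""
    return len(_cache())
```

In this model, duplicating and then freeing 15 frames on one thread leaves 10 of them in that thread's cache and releases the other 5, while a different thread still sees an empty cache of its own.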

Interestingly, taking a glance at chan_iax2 code, it appears that the iax2_process_threads are only reclaimed when the module is unloaded. By default, the max thread count is 100 (and can be configured up to 256 if desired). If each of these threads is queuing frames, then you could end up with 10 * 100 = 1000 cached frames, or 10 * 256 = 2560 cached frames. The problem is that even these worst-case figures fall far short of what you're seeing: 2560 cached frames would account for much less memory than your output reports. Plus, as your "memory show summary" output shows, there are apparently over 5 million allocations in frame.c that are contributing to the count.
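The mismatch can be checked with a little arithmetic. The sketch below uses only the thread limits quoted above and the frame.c line from the reporter's "memory show summary" output; the helper name max_cached_frames is made up for illustration, and the average allocation size is simply total bytes divided by allocation count, not a measured frame size.

```python
FRAMES_PER_THREAD_CACHE = 10  # cache cap quoted above
DEFAULT_IAX2_THREADS = 100    # default max thread count quoted above
MAX_IAX2_THREADS = 256        # configurable ceiling quoted above

# Figures from the frame.c line of the reporter's "memory show summary" output.
FRAME_C_BYTES = 3922949255
FRAME_C_ALLOCATIONS = 5184358


def max_cached_frames(threads, per_thread=FRAMES_PER_THREAD_CACHE):
    """Upper bound on cached frames if every thread's cache is full."""
    return threads * per_thread


# Average size of one frame.c allocation in the report.
avg_bytes = FRAME_C_BYTES / FRAME_C_ALLOCATIONS

# Worst case for the chan_iax2 threads alone.
worst_case_frames = max_cached_frames(MAX_IAX2_THREADS)

# How many times more allocations were observed than that worst case allows.
excess_factor = FRAME_C_ALLOCATIONS / worst_case_frames

print(f"default worst case: {max_cached_frames(DEFAULT_IAX2_THREADS)} frames")
print(f"maximum worst case: {worst_case_frames} frames")
print(f"average allocation: {avg_bytes:.1f} bytes")
print(f"observed / worst case: {excess_factor:.0f}x")
```

The average works out to roughly 757 bytes per allocation, and the observed allocation count is roughly 2000 times the 2560-frame worst case, so full per-thread caches in chan_iax2 alone cannot explain the numbers.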

I can think of two possible explanations:
1) There are lots of threads that are queuing frames beyond just the chan_iax2 threads, and these threads are not being reclaimed. If you connect to your running Asterisk process with gdb and issue a "thread apply all backtrace" command, then you should see a list of all running threads. If the number of threads is ridiculous, then this may be a reasonable explanation of what's happening, and looking into why threads are not being reclaimed would be worthwhile.
2) Somewhere in the code path, a frame is being duplicated without a corresponding call to free it. In this case, the frame cache code isn't really the issue. The problem is that there's a straight memory leak that needs to be fixed up.
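Before attaching gdb, a quick way to see whether the thread count is out of line is to count the entries under /proc/<pid>/task, which on Linux holds one directory per thread. The following is a small sketch that assumes a Linux system and a known Asterisk process ID; count_threads is a made-up helper name.

```python
import os


def count_threads(pid):
    """Return the number of threads in process `pid` (Linux only).

    Each thread of the process appears as one directory under /proc/<pid>/task.
    """
    return len(os.listdir(f"/proc/{pid}/task"))
```

Calling count_threads with the Asterisk process ID a few times over several hours would show whether the number of threads keeps growing, which would point towards explanation 1 rather than 2.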

Of the two, I think number 2 is the more likely explanation, especially given the rate at which the memory is being allocated. As you've noted on the -dev list though, discovering this sort of thing can be tricky.

> Memory usage increases constantly (frame.c memory cache)
> --------------------------------------------------------
>
>                 Key: ASTERISK-24555
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-24555
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Channels/chan_iax2, Core/General
>    Affects Versions: 11.14.0
>         Environment: Ubuntu 11.10 x64
> Kernel 3.0.0-32
> $ cat /proc/cpuinfo
> processor	: 0
> vendor_id	: GenuineIntel
> cpu family	: 6
> model		: 15
> model name	: Intel(R) Xeon(R) CPU           X3220  @ 2.40GHz
> stepping	: 7
> cpu MHz		: 2400.074
> cache size	: 4096 KB
> physical id	: 0
> siblings	: 4
> core id		: 0
> cpu cores	: 4
> apicid		: 0
> initial apicid	: 0
> fpu		: yes
> fpu_exception	: yes
> cpuid level	: 10
> wp		: yes
> flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm dtherm tpr_shadow
> bogomips	: 4800.14
> clflush size	: 64
> cache_alignment	: 64
> address sizes	: 36 bits physical, 48 bits virtual
> power management:
> (processors 1, 2 and 3 report the same values as processor 0, except for core id, apicid and initial apicid of 1, 2 and 3 respectively, and bogomips of 4800.18)
>            Reporter: James Lamanna
>
> The memory allocation of Asterisk 11 seems to constantly increase.
> Specifically, the cache allocated in frame.c seems to increase without bound. I haven't let it go this far, but it seems it would likely max out system memory and be killed by the OOM killer.
> $ asterisk -rx "memory show summary" | sort -rn
> 3950471103 bytes allocated (3923172359 in caches) in 5289609 allocations
> 3922949255 bytes (3922882279 cache) in    5184358 allocations in file frame.c
> ..
> $ asterisk -rx "core show channels"
> 8 active channels
> 4 active calls
> 14745 calls processed
> $ asterisk -rx "core show uptime"
> System uptime: 2 days, 14 hours, 17 minutes, 32 seconds 
> Last reload: 2 days, 14 hours, 17 minutes, 32 seconds 





