[asterisk-users] RTCP + Stasis causing high memory consumption

Patrick Wakano pwakano at gmail.com
Mon Nov 13 23:42:44 CST 2017

Hello Asterisk list,

I've facing a memory allocation issue that happens occasionally but on a
consistent basis.
The problem happens as follow, suddenly Asterisk starts consuming a lot of
memory, in a rate of more than 1GB per hour. Kernel will eventually kill it
via the OOM killer when memory is really exausted... This situation does
not generate backtrace because Asterisk is responsive during the whole time
and can be stopped/started.
After compiling Asterisk with the MALLOC_DEBUG flag, when the problem
happened we could see this output with the most consuming files, using
"memory show summary" command:
# cat memory_show_summary_201711101144.log | sort -nr | head
3060904798 bytes allocated (431496 in caches) in 16185877 allocations
1467180225 bytes in    1647238 allocations in file stasis_channels.c
1217035109 bytes in   10229320 allocations in file json.c
 240064732 bytes in    1765287 allocations in file stasis_message.c
  56402000 bytes in    1762250 allocations in file taskprocessor.c
  26125344 bytes in     203171 allocations in file rtp_engine.c
  17308827 bytes in     307848 allocations in file stasis_cache.c
   9548128 bytes in      35482 allocations in file stasis_bridges.c
   3923172 bytes in      92169 allocations in file res_rtp_asterisk.c
   3099771 bytes in      29185 allocations in file strings.c

Next minute it was already like this:
# cat memory_show_summary_201711101145.log | sort -nr | head
3091009032 bytes allocated (375561 in caches) in 16345842 allocations
1482476997 bytes in    1663899 allocations in file stasis_channels.c
1228978956 bytes in   10329516 allocations in file json.c
 242483220 bytes in    1783070 allocations in file stasis_message.c
  57063120 bytes in    1782910 allocations in file taskprocessor.c
  26354360 bytes in     205090 allocations in file rtp_engine.c
  17481272 bytes in     310935 allocations in file stasis_cache.c
   9628576 bytes in      35782 allocations in file stasis_bridges.c
   3828232 bytes in      93040 allocations in file res_rtp_asterisk.c
   3125118 bytes in      29427 allocations in file strings.c

So as we can see, total allocations increased 159965 or 30MB in one
A regular "memory show summary" during the day would be like this:
# cat memory_show_summary_201711140400.log | sort -nr | head
30595537 bytes allocated (406915 in caches) in 77906 allocations
11021007 bytes in       5081 allocations in file stasis_channels.c
 3825007 bytes in      12643 allocations in file cdr.c
 1855344 bytes in      15837 allocations in file pbx.c
 1766744 bytes in        183 allocations in file translate.c
 1467681 bytes in       1451 allocations in file chan_sip.c
 1400000 bytes in        280 allocations in file res_rtp_asterisk.c
  980474 bytes in       4740 allocations in file xmldoc.c
  895369 bytes in        167 allocations in file file.c
  800900 bytes in       2397 allocations in file channel.c
  280728 bytes in        613 allocations in file rtp_engine.c
  243147 bytes in       2556 allocations in file stasis.c
  130066 bytes in       1082 allocations in file stasis_cache.c
   24928 bytes in         82 allocations in file stasis_bridges.c
   51796 bytes in        486 allocations in file stasis_message.c

We also tried to get a "memory show allocations" when the problem was
happening. Unsurprisingly it halted Asterisk and after we cancelled it, it
had generated a 1GB file. Both memory logs hinted us to some problem in the
RTP module, specifically in the RTCP handling. After investigating the
code, looks like for some reason, the res_rtp_asterisk.c (line 4116) when
doing the ast_rtcp_read() ended up entering in a loop, causing lots of
allocations in the rtp_engine.c (line 2047 - ast_rtp_publish_rtcp_message).
Also apparently, another loop (or maybe the same, not sure...) happened
calling ast_channel_publish_blob() because we got lots of
create_channel_blob_message and ast_channel_snapshot_create allocations too.
Json, taskprocessor, stasis_message and stasis_cache allocations seems to
be side effect of so many stasis messages being created and published.
We could check memory logs 3 times for this problem and they are similar so
it seems that a bug exists somewhere leading to this problem.
So anyone has any idea of what could be happening or if it may be related
to some known bug?
We are running Asterisk 13.6.0 in CentOS 6.6.

Thanks very much!
Best regards,
Patrick Wakano
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-users/attachments/20171114/84018d5c/attachment.html>

More information about the asterisk-users mailing list