[asterisk-bugs] [JIRA] (ASTERISK-29232) Memory Leak since 16.13.0

Luke Escude (JIRA) noreply at issues.asterisk.org
Mon Jan 11 10:14:16 CST 2021


    [ https://issues.asterisk.org/jira/browse/ASTERISK-29232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=253352#comment-253352 ] 

Luke Escude commented on ASTERISK-29232:
----------------------------------------

1. Off the top of my head:
 - Calls come in and are immediately Answer()'d. 
 - CallerID Number is sanitized by a local PHP script called via SHELL(): all symbols and non-digits are stripped, and the number is capped at 20 digits in length.
 - CallerID Name is fetched via a local PHP script via SHELL() and is also returned in a sanitized format
 - Number is run through a Blacklist check script (also PHP) via SHELL(), returns a 1 if the number is blocked by the global blacklist.
 - Number is run through a spam score check script (also PHP) via SHELL(), which returns a spam probability score of 0-100; the call is then handled according to rules the user sets up.
 - Fax detection is enabled if the user chooses. Detected faxes are sent to the fax destination; otherwise the call does a Goto() to the standard destination (IVRs, ring groups, etc.).
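
For illustration only, here is a minimal sketch of the CallerID Number sanitization rule listed above, written in Python rather than the PHP actually used. The function name `sanitize_caller_id` is hypothetical, and keeping the first 20 digits (rather than the last 20) is an assumption, since only the cap itself is stated.

```python
import re

MAX_DIGITS = 20  # cap taken from the description above


def sanitize_caller_id(raw: str, max_digits: int = MAX_DIGITS) -> str:
    """Strip everything except ASCII digits, then cap the length.

    Assumption: the cap keeps the leading digits; the real script may differ.
    """
    digits = re.sub(r"[^0-9]", "", raw)
    return digits[:max_digits]
```

Called with something like `sanitize_caller_id("+1 (305) 555-0100")`, this returns only the digits, which is the form the later blacklist and spam score checks would receive.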

Those are the steps that ALL inbound calls go through. Next, as far as functionality is concerned:

1. PHP scripts are used throughout the dialplan to get various states from the database, like queue agent login/logout, call forwarding settings, etc.
2. We are not using any ARI stuff whatsoever. 
3. We are using a lot of AMI stuff - One-directional AMI. Basically every time extensions register, or calls are processed, we receive those AMI events and stick them into the database so users can monitor their phone systems in real-time. We don't write to the AMI socket much, except maybe for a click-to-call command here and there.
4. Voicemail is configured using the conf file, but uses ODBC. The voicemail database node is always less than 5ms away from wherever the Asterisk instance might be running. I intend to replace this with Amazon S3 in an upcoming update.
5. CDR is also stored via ODBC. This will also be replaced with in-dialplan PHP scripts instead of using Asterisk's built-in CDR engine in a future release.
6. Essentially everything is configured via the Conf files - No "realtime" stuff.
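
To make the one-directional AMI usage in item 3 concrete, here is a rough sketch of how buffered AMI text could be split into events and written to a database. It relies only on the AMI wire format (`Key: Value` lines ending in CRLF, with a blank line closing each message). The function names, the `ami_events` table, and the use of SQLite are all assumptions for the sake of a self-contained example, not the actual setup described here.

```python
import sqlite3


def parse_ami_messages(buffer: str):
    """Split buffered AMI text into complete messages plus the unparsed tail."""
    # Everything after the last blank line is an incomplete message.
    *complete, remainder = buffer.split("\r\n\r\n")
    messages = []
    for block in complete:
        fields = {}
        for line in block.split("\r\n"):
            key, sep, value = line.partition(":")
            if sep:
                # A repeated key overwrites the earlier value in this sketch.
                fields[key.strip()] = value.strip()
        if fields:
            messages.append(fields)
    return messages, remainder


def store_events(conn: sqlite3.Connection, messages) -> int:
    """Insert parsed events into a hypothetical ami_events table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ami_events (event TEXT, peer TEXT, status TEXT)"
    )
    rows = [
        (m["Event"], m.get("Peer"), m.get("PeerStatus"))
        for m in messages
        if "Event" in m
    ]
    conn.executemany("INSERT INTO ami_events VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

A reader loop would append each chunk received from the AMI socket to the returned remainder and call `parse_ami_messages` again, so partial events are never written to the database.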

2. It's possible to give someone access, but it would have to be on-demand - Since these Asterisk instances can hop around different datacenters, there's no way for us to know where it might be in a week. (Example: It may be running in our Miami datacenter right now, but if Miami goes down for maintenance or whatever, it could end up in Chicago, or NYC, or Dallas, etc.)

I am going to keep trying to track it down. The instance in question is currently hovering around 796MB of RAM, but if it doesn't increase over the next 2 days, then there may not be an issue.
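
One way to judge whether the instance is still growing over those 2 days is to sample its resident memory periodically and fit a line through the samples. The sketch below is an assumption about how that could be done on Linux, not something already in place: `read_rss_mb` reads `VmRSS` from `/proc/<pid>/status`, and `growth_mb_per_day` is a plain least-squares slope. Both names are hypothetical.

```python
def read_rss_mb(pid: int) -> float:
    """Return the resident set size of a process in MB (Linux only)."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) / 1024  # the kernel reports kB
    raise ValueError(f"no VmRSS entry for pid {pid}")


def growth_mb_per_day(samples) -> float:
    """Least-squares slope of (unix_seconds, rss_mb) samples, in MB per day."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    cov = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    return cov / var_t * 86400  # seconds per day
```

A slope near zero over a couple of days would support the "may not be an issue" reading, while a steady positive slope would give a rough estimate of how long the instance has before it reaches the 3GB ceiling.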

> Memory Leak since 16.13.0
> -------------------------
>
>                 Key: ASTERISK-29232
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-29232
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Core/PBX
>    Affects Versions: 16.15.0
>         Environment: CentOS 7 x64
>            Reporter: Luke Escude
>            Assignee: Luke Escude
>              Labels: fax
>         Attachments: Apex-Analysis.xlsx, cw1-memchart.png, Jan6-1401.csv, nw1-memchart.png, PW3-Memchart.png
>
>
> So we have around 100 instances of Asterisk 16.13.0 that have been running for over 2 months, normal load (small businesses with less than 30 users each), without issue.
> We have another 350 instances of Asterisk 16.15.0 on which we've started seeing a very linear increase in memory consumption over time. Specifically, we see higher-load instances (150+ users) last only a few days before hitting our artificial 3GB ceiling and getting restarted by the OOM killer.
> There are very few differences in our implementation of the 16.13 and 16.15 versions. All versions are set up as the following:
> - CentOS 7 64-bit
> - Voicemail over ODBC
> - unixODBC 2.3.1
> - MariaDB Connector (instead of the crappy mysql connector)
> - CDR over MySQL
> - SIP Trunks are registered every 2 minutes, qualified every 15 seconds.
> - User devices register every 10 minutes, qualified every 15 seconds.
> - User devices connect via TCP more often than UDP.
> - I have NO pjsip threadpool configuration options defined. I think the default is 50 threads?
> Here is what I am about to test within the next week:
> 1. unixODBC updated to 2.3.9
> 2. Longer SIP Trunk Registration period - Maybe PJSIP is working too hard?
> 3. Longer qualify timeout - Maybe PJSIP is working too hard?
> One of my first questions: Is it SAFE to compile asterisk with MALLOC_DEBUG and just leave it on permanently? I am scared to enable it, and suddenly have a bunch of users that are experiencing issues because I've enabled something that should only be enabled in Dev.
> Sorry for the length of the post, trying to cover as much ground as possible.


