[asterisk-bugs] [JIRA] (ASTERISK-24837) chan_sip calls to Asterisk result in file descriptors growing exponentially while channels remain up

Tue Nov 10 18:08:33 CST 2015

     [ https://issues.asterisk.org/jira/browse/ASTERISK-24837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rusty Newton updated ASTERISK-24837:
------------------------------------

    Assignee: Private Name  (was: Rusty Newton)
      Status: Waiting for Feedback  (was: Triage)

I've had some time to attempt reproduction of your issue. I'm unable to reproduce it by following your instructions (though the instructions are not very clear).

{quote}
I can set up a test bed, it has to be two machines back to back, and use a script to start the channels. I already did provide the script, however, I am uploading it again.
If you agree, then contact me for the credentials and you can see it.
As I said before, there is no dialplan. Just a simple script that generates calls and when they connect, calls Echo().
Once the channels are up, check the issue with lsof.
I cannot imagine a simpler description than this, but maybe I am wrong.
In case you want to try again, please set up two machines, one sends and one receives calls and plays music on hold, two lines of code.
{quote}

I set up two machines:

Asterisk A : Asterisk 13.6.0
Asterisk B : Asterisk 11.20.0

On A I used your script to generate calls with the PJSIP channel driver.
B receives the calls and uses the sip.conf you provided with only a change to the host address for the demo peer.

I tested in two cases.

Case one: Calls received by B are answered and put into music on hold using dialplan as you described.
{noformat}
[reject]
exten = s,1,Answer()
exten = s,2,Musiconhold()
{noformat}
Case two: B has an empty extensions.conf, so calls received by B are rejected by Asterisk because the context 'reject' does not exist.

In both cases, with hundreds of calls up and established, and thousands of calls processed, the file handles and UDP ports allocated both appear sane and *do not* increase exponentially. As the calls hang up, the UDP ports and file handles decrease as expected.

I viewed them using the same commands you use to be sure we are measuring the same way.
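
For anyone repeating the measurement, the sketch below is one possible cross-check that counts descriptors directly instead of counting lines of lsof output. It is not the command used in the tests above. It assumes Linux, a readable /proc/<pid>/fd for the Asterisk process, and a hypothetical helper name, count_fds.

```python
import os
import stat


def count_fds(pid):
    """Return (total, fifo) open descriptor counts for one process.

    Reads /proc/<pid>/fd, so it is Linux-only and needs permission
    to inspect the target process.
    """
    fd_dir = f"/proc/{pid}/fd"
    total = 0
    fifo = 0
    for name in os.listdir(fd_dir):
        try:
            mode = os.stat(os.path.join(fd_dir, name)).st_mode
        except OSError:
            # The descriptor was closed between listdir() and stat().
            continue
        total += 1
        if stat.S_ISFIFO(mode):
            fifo += 1
    return total, fifo


if __name__ == "__main__":
    # Demonstrated on the current process; pass the Asterisk PID instead.
    total, fifo = count_fds(os.getpid())
    print(f"total={total} fifo={fifo}")
```

Running it once per added call, alongside the lsof pipeline, would show whether both methods report the same trend.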

Since no one else is reporting these issues, and we cannot reproduce them by following your instructions, the burden is on you to produce a step-by-step guide demonstrating how to reproduce the issue from scratch. There is likely a missing element in your environment or configuration that is required for the problem to occur. It'll be up to you to figure this out and demonstrate what that element is.

I'll leave these issues in Waiting for Feedback for a couple more weeks to give you a chance to hunt down this missing element and communicate to us what it is.

I'll also repeat that we do not provide tech support via this issue tracker. We will not be logging onto your machine to look at the issue. We don't have the time or resources to do that. If you need help locating the issue I encourage you to hire a developer or consultant to assist you in tracking down the crucial details needed to reproduce the problem and demonstrate the bug in a way where we can go fix it.

Please discuss the file handle leak issue *only* on this issue ASTERISK-24837.

Please discuss the UDP port leak issue *only* on ASTERISK-25460.

> chan_sip calls to Asterisk result in file descriptors growing exponentially while channels remain up
> ----------------------------------------------------------------------------------------------------
>
>                 Key: ASTERISK-24837
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-24837
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Resources/res_rtp_asterisk
>    Affects Versions: 13.2.0
>         Environment: Linux 64
>            Reporter: Private Name
>            Assignee: Private Name
>         Attachments: asterisktest.sh, core_show_fd_120_channels.txt, sip.conf, trace.txt, trace.txt, trace.txt, valgrind.txt, valgrind.txt, valgrind.txt, valgrind.txt
>
>
> [Edit by Rusty - Please use the wiki formatting available to make reports easy to read - I'm going to clean this up for now.]
> When I originate several hundred calls using a call file, no dialplan, using app Echo() or app MusicOnHold, the calls connect, but when the other side starts to send media, after some 200+ calls I get several errors:
> {noformat}
> "ERROR[2433] res_rtp_asterisk.c: RTCP SR transmission error to 208.78.162.174:34443, rtcp halted Operation not permitted"
> {noformat}
> and the handle count explodes, measured at the end
> {noformat}
> lsof | grep asterisk | grep FIFO | wc -l
> 1025046
> {noformat}
> Yes, one million and change. The handle count never decreases as long as the channels are open. There are no active calls, only 500 channels.
> The call files are all identical:
> {noformat}
> Channel: SIP/0000000000@demo
> CallerID: "0000000000" <>
> WaitTime: 45
> MaxRetries: 0
> RetryTime: 0
> Application: Echo
> Data:
> Archive: no
> {noformat}
> where demo is a simple peer like this
> {noformat}
> [demo]
> host=xxx.xxx.xxx.xx 
> type=peer
> insecure=port,invite
> context=reject
> disallow=all
> allow=ulaw
> allow=g729
> session-timers=accept
> port=5060
> faxdetect=no
> transport=udp
> directmedia=yes
> {noformat}
> *Note 1:* The caller ID may vary call by call; it makes no difference. The issue here is the very high handle count, which slows down and kills the machine, and the errors, which show that many calls do not connect.
> *Note 2:* The calls do not go across the internet; they go to a local box.
> I need to send this many calls in real life, with media, so this is a killer for my business model.
> If I set debug=10 and verbose=20, I get thousands of identical lines like these:
> {noformat}
> [Feb 26 23:47:13] DEBUG[2433]: acl.c:963 ast_find_ourip: Attached to given IP address
> [Feb 26 23:47:13] DEBUG[5763]: res_rtp_asterisk.c:3958 ast_rtcp_read: Got RTCP report of 64 bytes
> [Feb 26 23:47:13] DEBUG[5763]: acl.c:958 ast_find_ourip: Not an IPv4 nor IPv6 address, cannot get port.
> [Feb 26 23:47:13] DEBUG[5763]: netsock2.c:172 ast_sockaddr_split_hostport: Splitting 'dasaro' into...
> [Feb 26 23:47:13] DEBUG[5763]: netsock2.c:226 ast_sockaddr_split_hostport: ...host 'dasaro' and port ''.
> [Feb 26 23:47:13] DEBUG[5763]: acl.c:958 ast_find_ourip: Not an IPv4 nor IPv6 address, cannot get port.
> [Feb 26 23:47:13] DEBUG[5763]: acl.c:963 ast_find_ourip: Attached to given IP address
> {noformat}
> {noformat}
>  ulimit -a
> core file size          (blocks, -c) unlimited
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 1048576
> max locked memory       (kbytes, -l) unlimited
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 1048576
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 8192
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) unlimited
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
> {noformat}
> *Note 3:* Please note how the handles start to grow with each additional call
> {noformat}
> Count idle
> 46
> 1 call
> lsof | grep asterisk | grep FIFO | wc -l
> 104
> 2 calls
> lsof | grep asterisk | grep FIFO | wc -l
> 168
> 3 calls
> lsof | grep asterisk | grep FIFO | wc -l
> 240
> 4 calls
> lsof | grep asterisk | grep FIFO | wc -l
> 320
> {noformat}
> As you can see, the handles do not grow linearly, but close to exponentially.
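
The figures in Note 3 can be checked with a few lines of arithmetic. The sketch below is only an illustration using the five counts quoted in the report; the variable and function names are hypothetical. It prints the increase per added call, the change in that increase, and the ratio between successive counts.

```python
# FIFO counts from Note 3: idle, then 1, 2, 3 and 4 calls.
counts = [46, 104, 168, 240, 320]


def differences(values):
    """Return the differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


first = differences(counts)    # handles added by each new call
second = differences(first)    # how much that increment itself grows
ratios = [round(b / a, 2) for a, b in zip(counts, counts[1:])]

print(first)   # [58, 64, 72, 80]
print(second)  # [6, 8, 8]
print(ratios)  # [2.26, 1.62, 1.43, 1.33]
```

On these samples each new call adds more handles than the previous one, so the growth is faster than linear. The ratio between successive counts falls rather than staying constant, which on so few points looks nearer to quadratic than to strictly exponential growth, though five measurements are too few to be sure.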
