[asterisk-bugs] [JIRA] Commented: (ASTERISK-20375) Asterisk channel reference leak when attempting to transfer a call originated to a local channel running the Echo application

Mark Michelson (JIRA) noreply at issues.asterisk.org
Tue Sep 11 18:01:07 CDT 2012


    [ https://issues.asterisk.org/jira/browse/ASTERISK-20375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=196960#comment-196960 ] 

Mark Michelson commented on ASTERISK-20375:
-------------------------------------------

I decided to look into the transferdigittimeout to see if that might be getting screwed up on my system. I applied the following diff on my system:

{noformat}
root at levin:/usr/src/asterisk/tags/1.8.16.0-rc1# svn diff
Index: main/channel.c
===================================================================
--- main/channel.c    (revision 372336)
+++ main/channel.c    (working copy)
@@ -3544,6 +3544,7 @@
 
 int ast_waitfordigit_full(struct ast_channel *c, int ms, int audiofd, int cmdfd)
 {
+    int saved_ms = ms;
     /* Stop if we're a zombie or need a soft hangup */
     if (ast_test_flag(c, AST_FLAG_ZOMBIE) || ast_check_hangup(c))
         return -1;
@@ -3557,6 +3558,10 @@
         struct ast_channel *rchan;
         int outfd=-1;
 
+        if (saved_ms != ms) {
+            ast_log(LOG_NOTICE, "In waitfordigitfull, ms is %d\n", ms);
+            saved_ms = ms;
+        }
         errno = 0;
         rchan = ast_waitfor_nandfds(&c, 1, &cmdfd, (cmdfd > -1) ? 1 : 0, NULL, &outfd, &ms);
        
Index: main/features.c
===================================================================
--- main/features.c    (revision 372336)
+++ main/features.c    (working copy)
@@ -2509,7 +2509,9 @@
     }
 
     /* this is specific of atxfer */
+    ast_log(LOG_NOTICE, "ast_app_dtget with transferdigittimeout of %d\n", transferdigittimeout);
     res = ast_app_dtget(transferer, transferer_real_context, xferto, sizeof(xferto), 100, transferdigittimeout);
+    ast_log(LOG_NOTICE, "ast_app_dtget returned with %d\n", res);
     if (res < 0) {  /* hangup or error, (would be 0 for invalid and 1 for valid) */
         finishup(transferee);
         return -1;
{noformat}
And here is the resulting output that I see on the CLI when I run my test:
{noformat}
*CLI> originate SIP/2000 extension 2301 at default
*CLI> [Sep 11 17:25:44] NOTICE[12863]: features.c:2514 builtin_atxfer: ast_app_dtget with transferdigittimeout of 3000
[Sep 11 17:25:44] NOTICE[12863]: channel.c:3562 ast_waitfordigit_full: In waitfordigitfull, ms is 2980
[Sep 11 17:26:08] NOTICE[12863]: channel.c:3562 ast_waitfordigit_full: In waitfordigitfull, ms is 2978
[Sep 11 17:27:08] NOTICE[12863]: channel.c:3562 ast_waitfordigit_full: In waitfordigitfull, ms is 2976
[Sep 11 17:35:09] NOTICE[12863]: channel.c:3562 ast_waitfordigit_full: In waitfordigitfull, ms is 2974
[Sep 11 17:37:34] NOTICE[12863]: channel.c:3562 ast_waitfordigit_full: In waitfordigitfull, ms is 2973
[Sep 11 17:37:38] NOTICE[12863]: channel.c:3562 ast_waitfordigit_full: In waitfordigitfull, ms is 2972
[Sep 11 17:37:38] NOTICE[12863]: channel.c:3562 ast_waitfordigit_full: In waitfordigitfull, ms is 2970
[Sep 11 17:37:41] NOTICE[12863]: channel.c:3562 ast_waitfordigit_full: In waitfordigitfull, ms is 2967
[Sep 11 17:37:41] NOTICE[12863]: channel.c:3562 ast_waitfordigit_full: In waitfordigitfull, ms is 2966
[Sep 11 17:41:09] NOTICE[12863]: channel.c:3562 ast_waitfordigit_full: In waitfordigitfull, ms is 2964
{noformat}
As you can see, the proper timeout is being passed into {{ast_app_dtget()}}, but ... stuff is clearly broken here. For whatever reason, after about 15 minutes, the number of milliseconds has only decreased by 35. Furthermore, it's difficult to see a definite pattern to when the number decreases. There are two other interesting data points to bring up.

1) The change to main/channel.c initially did not have the {{saved_ms}} variable. Instead, I printed the {{ms}} value on every iteration of the while loop. This resulted in a huge flood of messages, most of which did not show any sort of change to {{ms}}. I added the saved_ms variable so that I could see when the value of ms actually changed. So for whatever reason, the underlying {{poll()}} call is constantly waking up but the {{ms}} value is not decreasing as it should.

2) If I attach to the running Asterisk process with gdb, and then exit out of gdb, this results in {{ast_app_dtget()}} returning immediately. I suspect this is because the gdb session stopped the Asterisk process for longer than three seconds. When gdb exited, the SIGCONT (or whatever signal is used to get the process going again) woke up the poll(), which properly dropped {{ms}} down by the time that it was supposed to.

I'm starting to suspect that the problem could be due to whatever high-resolution timer is being used on my system. {{poll()}} returns after a sub-millisecond time frame, thus resulting in the {{ms}} value not decreasing as necessary. This problem could be due to how {{ast_waitfor_nandfds()}} handles its timeouts, or it could be a fault of the {{poll()}} system call itself.

I wonder if perhaps the timing module in use might be playing a role in this. I'm using the {{timerfd}} module myself. If you run {{timing test}} from the CLI, which module does it say is in use?

For reference, my operating environment is an Ubuntu 10.04 system running 2.6.32-38-generic kernel. gcc is version 4.4.3. No idea if that makes a difference here or not.

> Asterisk channel reference leak when attempting to transfer a call originated to a local channel running the Echo application
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: ASTERISK-20375
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-20375
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Channels/chan_local, Channels/chan_sip/Interoperability
>    Affects Versions: 1.8.16.0
>            Reporter: Mark Michelson
>
> While working on a separate issue, I managed to reliably create a situation that results in a channel reference leak.
> Use the following extensions.conf:
> {noformat}
> [default]
> exten => 11,1,Answer()
> same => n,Echo()
> exten => 2301,1,Dial(Local/11 at default,,t)
> {noformat}
> 1) From the CLI issue the following command:
> {noformat}
> originate SIP/999 extension 2301 at default
> {noformat}
> where SIP/999 is a SIP phone.
> 2) Pick up SIP/999 when it starts ringing.
> 3) Press the configured attended transfer DTMF in features.conf. You will start to hear music on hold (odd).
> 4) Hang up SIP/999.
> 5) Issue a {{core show channels}} CLI command.
> You will notice that SIP/999 is still around, as are the two Local/2301 channels. Every so often, a warning message will appear
> {noformat}
> *CLI> [Sep  6 16:37:27] WARNING[610]: chan_sip.c:3918 __sip_autodestruct: Autodestruct on dialog '54200e235d36cebb6182df3d0b9ddf5f at 10.24.20.249:5060' with owner in place (Method: BYE). Rescheduling destruction for 10000 ms
> {noformat}
> Yay, you have leaking channel refs!

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



More information about the asterisk-bugs mailing list