[asterisk-dev] [Code Review]: Fix timeout antipattern throughout the code

rmudgett reviewboard at asterisk.org
Thu Oct 11 12:21:33 CDT 2012



> On Oct. 3, 2012, 2:36 p.m., rmudgett wrote:
> > /branches/1.8/apps/app_waitforring.c, lines 106-113
> > <https://reviewboard.asterisk.org/r/2135/diff/2/?file=31683#file31683line106>
> >
> >     Revert this change.  It was not broken and now is broken.
> >
> 
> Mark Michelson wrote:
>     Why is it broken?
>     
>     The only change here is that I've changed a waitfor that lasted for ~28 hours to an indefinite one. It seems quite clear that the intent was to create an unlimited waitfor here but the writer was unaware of how to do it.

It was broken because ast_waitfor() with a -1 timeout could not report an error condition correctly.  It would return a negative value because the timeout was negative.  Reverting ast_waitfor() removes that issue.


> On Oct. 3, 2012, 2:36 p.m., rmudgett wrote:
> > /branches/1.8/channels/chan_dahdi.c, line 6059
> > <https://reviewboard.asterisk.org/r/2135/diff/2/?file=31685#file31685line6059>
> >
> >     Revert this.
> 
> Mark Michelson wrote:
>     Why? It's within an if that already checks that res is less than 0.

Just passing up the negative value returned by ast_waitfor() instead of setting res = -1 here could break the return conditions of this function.  What if a change to ast_waitfor() returns -2 instead?  This function would now return -2 when it would have returned -1 before.  Callers of this function may now be broken because of it.  In fact, this function itself will break.  Right after the while loop it checks if res == -1.


> On Oct. 3, 2012, 2:36 p.m., rmudgett wrote:
> > /branches/1.8/channels/chan_dahdi.c, line 6065
> > <https://reviewboard.asterisk.org/r/2135/diff/2/?file=31685#file31685line6065>
> >
> >     Revert this.
> 
> Mark Michelson wrote:
>     I can't revert this without making other changes. res is 0 prior to entering the loop. Consider that upon entering the loop, ast_waitfor() returns a positive value. A frame is read in ast_read() and it's not a hangup frame. The loop will then get broken out of with res set to some positive number.
>     
>     Prior to my changes, this set of conditions would break out of the loop with res set to zero.
>     
>     The idea here is to use res for the ast_waitfor() and then to revert it back to its pre-while loop state afterward. The alternative would be to use a different variable for testing ast_waitfor() instead of res.

OK.  That makes sense and I think fixes a bug because a nonzero value returned by this function is going to hangup the call.  This function is a dialplan application.


- rmudgett


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/2135/#review7207
-----------------------------------------------------------


On Oct. 8, 2012, 5:40 p.m., Mark Michelson wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/2135/
> -----------------------------------------------------------
> 
> (Updated Oct. 8, 2012, 5:40 p.m.)
> 
> 
> Review request for Asterisk Developers.
> 
> 
> Summary
> -------
> 
> This is an interesting one. I filed issue ASTERISK-20375 a while back and David Lee fixed it. The problem was that in my setup, a call to ast_waitfor_nandfds() returned in less than a millisecond. The result was that the input-output value passed into ast_waitfor_nandfds() would never decrement the value properly. Thus a 3000 ms timeout would take *days* to actually time out.
> 
> David discovered that a similar pattern for determining the passage of time had been used throughout the code. This patch is an attempt to fix all of those. I've created a function in time.h called ast_remaining_ms() that will tell the amount of time remaining given a starting timestamp and the duration that the original timeout was. This is useful for determining the timeout to pass to an ast_waitfor() or related function.
> 
> Most of the changes here were straightforward, but there were some tricky bits. Focus strongly on the following changes:
> 
> 1) wait_for_answer() in app_dial.c
> 2) wait_for_answer() in app_queue.c
> 3) generic_fax_exec() in res_fax.c
> 4) places where I have started setting "res" instead of "ms" or something similar in ast_waitfor() calls. Ensure that I have not tainted a return value. I found a few of these places while testing and fixed what I found on my own, but I could have missed some places.
> 5) places where a timeout is supposed to restart. I've tried to find these places, but I may have missed some.
> 
> In addition, there has been a change made to the return of ast_waitfor(). Prior to this patch, if a negative timeout were passed to ast_waitfor(), it would be impossible to return a negative value. This meant that it was impossible to detect if an error had occurred. I've changed ast_waitfor() to be able to return a negative value if a negative timeout is passed in, but only if the underlying call to ast_waitfor_nandfds() returns NULL.
> 
> I did this because I found many places in the code with this construct:
> 
> if (ast_waitfor(chan, -1) < 0) {
>     ast_log(LOG_ERROR, "ERMAHGERD!\n");
>     return HORRIBLE_ERROR_CONDITION;
> }
> 
> The problem is that the error return could never actually be reached. It should be possible with my change. Please let me know if my change could cause problems though.
> 
> 
> This addresses bug ASTERISK-20414.
>     https://issues.asterisk.org/jira/browse/ASTERISK-20414
> 
> 
> Diffs
> -----
> 
>   /branches/1.8/apps/app_dial.c 373847 
>   /branches/1.8/apps/app_jack.c 373847 
>   /branches/1.8/apps/app_meetme.c 373847 
>   /branches/1.8/apps/app_queue.c 373847 
>   /branches/1.8/apps/app_record.c 373847 
>   /branches/1.8/apps/app_waitforring.c 373847 
>   /branches/1.8/channels/chan_agent.c 373847 
>   /branches/1.8/channels/chan_dahdi.c 373847 
>   /branches/1.8/channels/chan_iax2.c 373847 
>   /branches/1.8/channels/sig_analog.c 373847 
>   /branches/1.8/channels/sig_pri.c 373847 
>   /branches/1.8/include/asterisk/time.h 373847 
>   /branches/1.8/main/channel.c 373847 
>   /branches/1.8/main/pbx.c 373847 
>   /branches/1.8/main/rtp_engine.c 373847 
>   /branches/1.8/main/utils.c 373847 
>   /branches/1.8/res/res_fax.c 373847 
> 
> Diff: https://reviewboard.asterisk.org/r/2135/diff
> 
> 
> Testing
> -------
> 
> I've placed many test calls through items that used timeouts like Dial and Queue with a timeout. I've also ensured that other apps that use ast_waitfor() such as Answer() and Echo() still function properly.
> 
> 
> Thanks,
> 
> Mark
> 
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.digium.com/pipermail/asterisk-dev/attachments/20121011/e5e30a48/attachment-0001.htm>


More information about the asterisk-dev mailing list