[asterisk-dev] [Code Review] Return errors from ast_timer_ack and add some defensive coding surrounding the use of timers

Matt Jordan reviewboard at asterisk.org
Thu Nov 1 17:18:37 CDT 2012


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/2178/
-----------------------------------------------------------

(Updated Nov. 1, 2012, 5:18 p.m.)


Review request for Asterisk Developers, Mark Michelson and rmudgett.


Changes
-------

Addressed Mark's findings.


Summary
-------

This is a slightly cleaned up version of Jeremiah Gowdy's patch on ASTERISK-20032.

I'll quote Jeremiah from the issue, as it explains the motivation behind the patch:

"[Jun 21 12:39:18] ERROR[30608] res_timing_timerfd.c: Call to timerfd_gettime() error: Bad file descriptor
[Jun 21 12:39:18] ERROR[30608] res_timing_timerfd.c: Call to timerfd_gettime() error: Bad file descriptor
[Jun 21 12:39:18] ERROR[30608] res_timing_timerfd.c: Call to timerfd_gettime() error: Bad file descriptor
[Jun 21 12:39:18] ERROR[30608] res_timing_timerfd.c: Call to timerfd_gettime() error: Bad file descriptor

The log only seems to contain a single thread that is stuck in this state.  Other calls on other channels seem to continue to function.  I am working to provide better context surrounding when this error occurs. 

---

I've changed the timer interface to make the timer ack function(s) in all of the timer implementations return 0 on success / -1 on failure, and I've changed all the places that call ast_timer_ack check that return code and return error themselves if it fails.  This stops the streaming of the timer errors and allowed me to determine which call is getting the error:

[Jun 22 10:46:09] VERBOSE[20352] pbx.c:     -- Executing [s@originate:1] Answer("Local/s@originate-3722;2", "") in new stack
[Jun 22 10:46:09] ERROR[20352] res_timing_timerfd.c: Call to timerfd_gettime() using handle 257 error: Bad file descriptor
[Jun 22 10:46:09] ERROR[20352] channel.c: Timer failed in ast_read
[Jun 22 10:46:09] VERBOSE[20352] pbx.c:   == Spawn extension (originate, s, 1) exited non-zero on 'Local/s@originate-3722;2'

It seems that the issue happens when channel.c calls ast_timer_ack from ast_read."

This patch doesn't resolve the failure he's seeing, where the timer fails in ast_read. However, by propagating timer errors up from the timing layer to the timer consumers, and by making those consumers bail out when they see an error, it stops Asterisk from repeatedly logging the same error.  It doesn't, of course, explain why the Local channel had a bad timer file descriptor in this particular code path, but it seems like a good first step toward making the error condition manageable.
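
For readers who want the shape of the change without opening the diff, here is a minimal sketch of the pattern described above. It is not taken from the patch: sketch_timer_ack() and sketch_read_frame() are hypothetical stand-ins for a timing backend's ack function and for a consumer such as ast_read, and the sketch assumes the timer handle is a plain file descriptor that delivers an 8-byte expiration count when read, as a timerfd does.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for a timing backend's ack function.
 * Follows the convention described in the review:
 * 0 on success, -1 on failure.
 */
static int sketch_timer_ack(int fd)
{
	uint64_t expirations = 0;

	/* Consume the pending expiration count from the descriptor. */
	if (read(fd, &expirations, sizeof(expirations)) != (ssize_t) sizeof(expirations)) {
		fprintf(stderr, "Timer ack on handle %d failed: %s\n", fd, strerror(errno));
		return -1;
	}

	return 0;
}

/*
 * Hypothetical consumer. Rather than ignoring the ack result and
 * coming back to the same broken timer, it reports the failure once
 * and returns an error to its own caller.
 */
static int sketch_read_frame(int timer_fd)
{
	if (sketch_timer_ack(timer_fd) < 0) {
		fprintf(stderr, "Timer failed in sketch_read_frame\n");
		return -1;
	}

	/* Normal frame processing would continue here. */
	return 0;
}
```

Called with an invalid descriptor, sketch_read_frame() logs the failure and returns -1, which gives its caller the chance to stop using the timer instead of retrying it.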

Note that a good portion of this patch will apply to 1.8; however, since some of it won't (func_jitterbuffer) and since Jeremiah initially wrote it against 10, I kept it against that branch.

 


This addresses bug ASTERISK-20032.
    https://issues.asterisk.org/jira/browse/ASTERISK-20032


Diffs (updated)
-----

  /branches/10/res/res_timing_timerfd.c 375450 
  /branches/10/res/res_timing_pthread.c 375450 
  /branches/10/res/res_timing_kqueue.c 375450 
  /branches/10/res/res_timing_dahdi.c 375450 
  /branches/10/res/res_musiconhold.c 375450 
  /branches/10/res/res_fax_spandsp.c 375450 
  /branches/10/main/timing.c 375450 
  /branches/10/funcs/func_jitterbuffer.c 375450 
  /branches/10/include/asterisk/timing.h 375450 
  /branches/10/main/channel.c 375450 
  /branches/10/bridges/bridge_softmix.c 375450 
  /branches/10/channels/chan_iax2.c 375450 

Diff: https://reviewboard.asterisk.org/r/2178/diff


Testing
-------


Thanks,

Matt
