[asterisk-dev] [Code Review]: Read on disabled timerfd results in hangs, Take 2

kobaz reviewboard at asterisk.org
Mon Aug 22 11:45:57 CDT 2011



> On Aug. 12, 2011, 9:44 a.m., kobaz wrote:
> > Would any of the take 2 changes be a fix for ASTERISK-18250?
> 
> Terry Wilson wrote:
>     Yes. The previous patch leaked a reference to the timer, so it would never be deleted.

I found a very consistent reproduction of the problem.

Dual-span Sangoma T1 card, with the two spans set up as master/slave (net/cpe) and joined by a crossover cable.

chan_dahdi.conf
------------------
[trunkgroups]

[channels]
context=default
usecallerid=yes
hidecallerid=no
callwaiting=yes
usecallingpres=yes
callwaitingcallerid=yes
threewaycalling=yes
transfer=yes
canpark=yes
cancallforward=yes
callreturn=yes
;echocancel=yes
;echocancelwhenbridged=yes
relaxdtmf=yes
rxgain=0.0
txgain=0.0
group=1
callgroup=1
pickupgroup=1
immediate=no

;Sangoma A102 port 1 [slot:7 bus:3 span:1] <wanpipe1>
switchtype=national
context=from-pstn
group=0
echocancel=no
signalling=pri_net
channel =>1-23

;Sangoma A102 port 2 [slot:7 bus:3 span:2] <wanpipe2>
switchtype=national
context=from-pstn
group=0
echocancel=no
signalling=pri_cpe
channel =>25-47
------------------

wanpipe1
-----------

FE_MEDIA        = T1
FE_LCODE        = B8ZS
FE_FRAME        = ESF
FE_LINE         = 1
TE_CLOCK        = MASTER
TE_REF_CLOCK    = 0
--------

wanpipe2
-----------

FE_MEDIA        = T1
FE_LCODE        = B8ZS
FE_FRAME        = ESF
FE_LINE         = 2
TE_CLOCK        = NORMAL
TE_REF_CLOCK    = 2
-------------


[Aug 22 12:29:28] VERBOSE[8409] pbx.c:     -- <SIP/201-00000001> Executing [81234@_cos_internal+local+ld+intl:1] Dial(DAHDI/g0/1234)
[Aug 22 12:29:28] VERBOSE[8409] app_dial.c:     -- <SIP/201-00000001> calling DAHDI/i1/1234-2 (callee proceeding, passing it to caller)
[Aug 22 12:29:28] VERBOSE[8409] app_dial.c:     -- <SIP/201-00000001> calling DAHDI/i1/1234-2 (callee answered)
[Aug 22 12:29:28] WARNING[8410] res_timing_timerfd.c: Reading attempt on idle timerfd. This would have caused a deadlock.
[Aug 22 12:29:28] WARNING[8410] res_timing_timerfd.c: Reading attempt on idle timerfd. This would have caused a deadlock.
[Aug 22 12:29:28] WARNING[8410] res_timing_timerfd.c: Reading attempt on idle timerfd. This would have caused a deadlock.
[Aug 22 12:29:28] WARNING[8410] res_timing_timerfd.c: Reading attempt on idle timerfd. This would have caused a deadlock.
[Aug 22 12:29:28] WARNING[8410] res_timing_timerfd.c: Reading attempt on idle timerfd. This would have caused a deadlock.
....repeating


- kobaz


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/1361/#review4048
-----------------------------------------------------------


On Aug. 12, 2011, 9:41 a.m., Terry Wilson wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/1361/
> -----------------------------------------------------------
> 
> (Updated Aug. 12, 2011, 9:41 a.m.)
> 
> 
> Review request for Asterisk Developers, David Vossel, kobaz, irroot, and jrose.
> 
> 
> Summary
> -------
> 
> This was originally reviewed as https://reviewboard.asterisk.org/r/1255 and then committed and reverted when a performance issue was found.
> 
> There were three problems with this patch. 1) There was no locking done around the timer, 2) We never released the reference to our_timer, and 3) The saved_timer value does not necessarily equal the actual timer value; it exists solely for setting things back the way they were after enabling/disabling continuous mode. We should not be using it to decide whether or not to do a read().
> 
> This patch adds locking, releases the reference from the ao2_find(), removes the troublesome if() block, and uses ast_debug instead of ast_log(LOG_DEBUG, ...). In all other ways it is identical to the patch that was originally committed.
> 
> 
> Diffs
> -----
> 
>   /branches/1.8/res/res_timing_timerfd.c 331573 
> 
> Diff: https://reviewboard.asterisk.org/r/1361/diff
> 
> 
> Testing
> -------
> 
> None, yet. The other patch worked, but caused a regression with analog phones. jrose, if you still have a setup available to test this patch to see if the regression is gone, the patch is very much like the original so it should work. If anyone has a good example of how to reliably reproduce the original issue I would be happy to test that as well.
> 
> 
> Thanks,
> 
> Terry
> 
>
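
The first two problems listed in the summary (no locking, and a lookup reference that was never released) can be illustrated with a small sketch. This is not Asterisk's astobj2 API and not the code in the diff; struct timer_obj and the timer_* functions are hypothetical stand-ins, assuming a lookup that hands back the object with one extra reference, which is how the summary describes ao2_find().

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical reference-counted timer record (not astobj2). */
struct timer_obj {
    pthread_mutex_t lock; /* guards refcount and armed */
    int refcount;
    int armed;
};

/* Allocate a timer holding the owner's single reference. */
static struct timer_obj *timer_alloc(void)
{
    struct timer_obj *t = calloc(1, sizeof(*t));

    if (!t) {
        return NULL;
    }
    pthread_mutex_init(&t->lock, NULL);
    t->refcount = 1;
    return t;
}

/* Stand-in for a container lookup: returns the object with one
 * additional reference that the caller must release. */
static struct timer_obj *timer_find(struct timer_obj *t)
{
    pthread_mutex_lock(&t->lock);
    t->refcount++;
    pthread_mutex_unlock(&t->lock);
    return t;
}

/* Drop one reference. Returns 1 if this was the last reference and the
 * object was destroyed, 0 otherwise. */
static int timer_unref(struct timer_obj *t)
{
    int remaining;

    pthread_mutex_lock(&t->lock);
    remaining = --t->refcount;
    pthread_mutex_unlock(&t->lock);

    if (remaining == 0) {
        pthread_mutex_destroy(&t->lock);
        free(t);
        return 1;
    }
    return 0;
}

/* Change timer state under the lock, then release the lookup reference.
 * Leaving out the final timer_unref() is the kind of leak that keeps the
 * object from ever being deleted. */
static void timer_set_armed(struct timer_obj *t, int armed)
{
    struct timer_obj *found = timer_find(t);

    pthread_mutex_lock(&found->lock);
    found->armed = armed;
    pthread_mutex_unlock(&found->lock);

    timer_unref(found);
}
```

After any number of timer_set_armed() calls the reference count is back to the owner's single reference, so the owner's final timer_unref() actually frees the object.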
