[asterisk-bugs] [JIRA] (ASTERISK-21378) chan_sip completely blocks on DNS lookups

Jaco Kroon (JIRA) noreply at issues.asterisk.org
Mon May 20 09:20:02 CDT 2013


    [ https://issues.asterisk.org/jira/browse/ASTERISK-21378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=206624#comment-206624 ] 

Jaco Kroon commented on ASTERISK-21378:
---------------------------------------

Olle is dealing with SRV records.  His work will in a manner depends on this being fixed.  Or will need to be fixed separately.
                
> chan_sip completely blocks on DNS lookups
> -----------------------------------------
>
>                 Key: ASTERISK-21378
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-21378
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Channels/chan_sip/General
>    Affects Versions: 11.3.0
>         Environment: Gentoo Linux, asterisk 11.3.0
>            Reporter: Jaco Kroon
>            Assignee: Matt Jordan
>            Severity: Critical
>
> One of the bigger ISPs in South Africa decided to blow up their entire network today.  Our setup has quite a number (16 to be exact) of register lines of the form:
> {noformat}
> register => 2787....:secret at sip.iburst.co.za/087....
> {noformat}
> As soon as they decided to press the big red button to take down their network and we hit a SIP reload ... *boom* - we could for the live of us not get asterisk back up into a working state.  We have a sip peer looking like this:
> {noformat}
> [iburst]
> host = sip.iburst.co.za
> type=friend
> qualify=yes
> disallow=all
> allow=g729
> context=inbound-iburst
> directmedia=no
> dtmfmode=rfc2833
> accountcode=iBurst
> jbforce=no
> {noformat}
> Knowing that iBurst went down, and spotting this log entry brought up the theory:
> {noformat}
> [Apr  3 19:36:12] ERROR[27636] netsock2.c: getaddrinfo("sip.iburst.co.za", "(null)", ...): Name or service not known
> {noformat}
> so, commented out the register lines, and behove and behold, it takes about 20 seconds longer than usual for asterisk to start servicing the :5060 udp socket (normally a watch netstat -nulp won't ever show the Recv-Q being anything other than 0, currently it'll keep climbing for around 20 seconds before dropping back down to zero).
> With the register lines uncommented you can forget about sane operation.  It will not happen.  In fact, the only way for me to recover is to kill -9 asterisk.
> I currently have dnsmgr disabled, even though I can see (from the code) that the handling differs with dnsmgr enabled, and it does make more sense for me to have it enabled anyway.
> I'm not sure what the best way would be to handle this, but I suspect that registrations needs to happen in a separate thread, DNS lookups should probably happen without any locks held in chan_sip.
> For the moment (since none of peers I need to peer with use SRV records, and their DNS should not change that often) I might be better off to perform the DNS lookups outside of asterisk and just hard-code the IPs into the config.  From a rudementary test this seems to work quite well (asterisk is back to normal behaviour of starting up chan_sip in a VERY short time frame).
> A quick test with dnsmgr enabled, but utilizing DNS names again instead of IP addresses results in completely broken behaviour again.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.asterisk.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



More information about the asterisk-bugs mailing list