[asterisk-bugs] [JIRA] (ASTERISK-30381) Using unbound, queries do not try all available nameservers, and contacts will flap

Asterisk Team (JIRA) noreply at issues.asterisk.org
Wed Dec 28 20:57:06 CST 2022


    [ https://issues.asterisk.org/jira/browse/ASTERISK-30381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=261084#comment-261084 ] 

Asterisk Team commented on ASTERISK-30381:
------------------------------------------

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. Please note that log messages and other files should not be sent to the Sangoma Asterisk Team unless explicitly asked for. All files should be placed on this issue in a sanitized fashion as needed.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

Please note that by submitting data, code, or documentation to Sangoma through JIRA, you accept the Terms of Use present at [https://www.asterisk.org/terms-of-use/|https://www.asterisk.org/terms-of-use/].

> Using unbound, queries do not try all available nameservers, and contacts will flap
> -----------------------------------------------------------------------------------
>
>                 Key: ASTERISK-30381
>                 URL: https://issues.asterisk.org/jira/browse/ASTERISK-30381
>             Project: Asterisk
>          Issue Type: Bug
>      Security Level: None
>          Components: Resources/res_resolver_unbound
>    Affects Versions: 18.15.1, 19.7.1, 20.0.1
>            Reporter: Mark Murawski
>
> With a fairly standard DNS server list containing a local DNS server and some backups, using the unbound DNS resolver results in non-deterministic lookup failures.
> Given resolv.conf:
> {{monospaced}}
> options attempts:3 timeout:1
> nameserver 192.168.5.2
> nameserver 4.2.2.2
> nameserver 8.8.8.8
> {{monospaced}}
> Given resolver_unbound.conf:
> {{monospaced}}
> [general]
> hosts = /etc/hosts
> resolv = /etc/resolv.conf
> {{monospaced}}
> Given pjsip_wizard.conf:
> {{monospaced}}
> [foo]
> type = wizard
> remote_hosts = foo.vpn.lan
> ---snip--- ... other settings here
> {{monospaced}}
> Contacts then flap between Reachable and Unreachable because of intermittent DNS failures, not because SIP OPTIONS go unanswered (the foo.vpn.lan host responded to SIP OPTIONS the entire time):
> {{monospaced}}
> Contact wombat/sip:foo.vpn.lan is now Reachable.  RTT: 37.946 msec
> Contact wombat/sip:foo.vpn.lan is now Unreachable.  RTT: 0.000 msec
> Contact wombat/sip:foo.vpn.lan is now Reachable.  RTT: 37.946 msec
> Contact wombat/sip:foo.vpn.lan is now Unreachable.  RTT: 0.000 msec
> Contact wombat/sip:foo.vpn.lan is now Reachable.  RTT: 37.946 msec
> Contact wombat/sip:foo.vpn.lan is now Unreachable.  RTT: 0.000 msec
> {{monospaced}}
> The reason for this is twofold:
> 1) Unbound does not query more than one DNS server to get the result for a given request.
> 2) Unbound does not respect the order of DNS servers in /etc/resolv.conf.
> Unbound debug logging (captured with strace) shows the DNS server order, which is reversed relative to /etc/resolv.conf:
> {{monospaced}}
> [pid 10346] write(2, "[1672280502] libunbound[8890:0] info: DelegationPoint<.>: 0 names (0 missing), 3 addrs (0 result, 3 avail) parentNS\n", 116) = 116
> [pid 10346] getpid()                    = 8890
> [pid 10346] write(2, "[1672280502] libunbound[8890:0] debug:    ip4 8.8.8.8 port 53 (len 16)\n", 71) = 71
> [pid 10346] getpid()                    = 8890
> [pid 10346] write(2, "[1672280502] libunbound[8890:0] debug:    ip4 4.2.2.2 port 53 (len 16)\n", 71) = 71
> [pid 10346] getpid()                    = 8890
> [pid 10346] write(2, "[1672280502] libunbound[8890:0] debug:    ip4 192.168.5.2 port 53 (len 16)\n", 75) = 75
> [pid 10346] getpid()                    = 8890
> [pid 10346] write(2, "[1672280502] libunbound[8890:0] debug: attempt to get extra 3 targets\n", 70) = 70
> {{monospaced}}
> Take this example:
> {{monospaced}}
> Timestamp 12:00:00: DNS lookup of foo.vpn.lan using 8.8.8.8 .. fails because vpn.lan only exists on 192.168.5.2... locally cached DNS for the endpoint contact is deleted, host marked unreachable
> Timestamp 12:01:00: DNS lookup of foo.vpn.lan using 4.2.2.2 .. fails because vpn.lan only exists on 192.168.5.2... locally cached DNS for the endpoint contact is deleted, host marked unreachable
> Timestamp 12:02:00: DNS lookup of foo.vpn.lan using 192.168.5.2 .. success! endpoint DNS is stored, host is marked reachable
> Timestamp 12:03:00: DNS lookup of foo.vpn.lan using 4.2.2.2 .. fails because vpn.lan only exists on 192.168.5.2... locally cached DNS for the endpoint contact is deleted, host marked unreachable
> Timestamp 12:04:00: DNS lookup of foo.vpn.lan using 8.8.8.8 .. fails because vpn.lan only exists on 192.168.5.2... locally cached DNS for the endpoint contact is deleted, host marked unreachable
> {{monospaced}}
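
The timeline above can be reproduced with a small simulation. This is a sketch only, under two assumptions that are not confirmed by the report: each lookup is sent to exactly one nameserver picked at random, and only 192.168.5.2 can answer for vpn.lan. The names `query_one`, `simulate`, and `ZONE_DATA`, as well as the placeholder address 192.0.2.10, are hypothetical and are not part of Asterisk or libunbound.

```python
import random

# Nameservers taken from the resolv.conf in this report.
NAMESERVERS = ["192.168.5.2", "4.2.2.2", "8.8.8.8"]

# Assumption: only the local server knows vpn.lan.
# 192.0.2.10 is a placeholder address from the documentation range.
ZONE_DATA = {"192.168.5.2": {"foo.vpn.lan": "192.0.2.10"}}


def query_one(server, name):
    """Return the address `server` holds for `name`, or None on failure."""
    return ZONE_DATA.get(server, {}).get(name)


def simulate(lookups, seed=0):
    """Model one-server-per-lookup resolution.

    Returns a list of (server, state) tuples, one per lookup, where state
    is "Reachable" if that single server answered and "Unreachable" if not.
    """
    rng = random.Random(seed)
    history = []
    for _ in range(lookups):
        # Assumption: a single server is chosen and no other is tried.
        server = rng.choice(NAMESERVERS)
        answered = query_one(server, "foo.vpn.lan") is not None
        history.append((server, "Reachable" if answered else "Unreachable"))
    return history


if __name__ == "__main__":
    for server, state in simulate(lookups=6):
        print(f"lookup via {server}: contact is now {state}")
```

Under these assumptions the contact is reported as Reachable only when the lookup happens to land on 192.168.5.2, which matches the flapping pattern in the log even though the host itself never stopped answering.
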
> If you change resolver_unbound.conf to the following:
> {{monospaced}}
> [general]
> hosts = /etc/hosts
> nameserver = 192.168.5.2
> {{monospaced}}
> This does not fix the issue. Unbound does not treat this setting as the full nameserver list and still uses the 3 nameservers specified in /etc/resolv.conf.
> The ideal behavior here would be:
> 1) Don't treat a contact as unreachable if DNS suddenly fails but SIP OPTIONS is still working to the last-known IP.
> 2) Try all DNS servers until we get a successful lookup, or until all servers have failed lookups.
> The only workaround for this is to noload res_resolver_unbound.so in modules.conf.
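
The two ideal behaviors listed above can be sketched as follows. This is an illustration of the requested logic only, not a patch against res_resolver_unbound and not a description of how Asterisk or libunbound work today. `resolve_with_fallback`, `ContactState`, and the injected `query` callable are hypothetical names, and the sketch assumes a failed lookup is reported as `None`.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# Hypothetical query hook: (server, name) -> address, or None on failure.
Query = Callable[[str, str], Optional[str]]


def resolve_with_fallback(name: str, servers: Sequence[str],
                          query: Query) -> Optional[str]:
    """Behavior 2: ask each server in order until one answers.

    Returns the first successful answer, or None once every server
    has failed.
    """
    for server in servers:
        answer = query(server, name)
        if answer is not None:
            return answer
    return None


@dataclass
class ContactState:
    """Behavior 1: keep the last-known address across DNS failures."""
    last_known_address: Optional[str] = None

    def refresh(self, name: str, servers: Sequence[str],
                query: Query) -> Optional[str]:
        answer = resolve_with_fallback(name, servers, query)
        if answer is not None:
            # Only a successful lookup replaces the stored address.
            self.last_known_address = answer
        # A failed lookup leaves the previous address in place, so
        # SIP OPTIONS can keep being sent to it.
        return self.last_known_address


if __name__ == "__main__":
    # Placeholder data: only the local server knows the name.
    records = {("192.168.5.2", "foo.vpn.lan"): "192.0.2.10"}

    def fake_query(server: str, name: str) -> Optional[str]:
        return records.get((server, name))

    contact = ContactState()
    order = ["8.8.8.8", "4.2.2.2", "192.168.5.2"]
    print(contact.refresh("foo.vpn.lan", order, fake_query))
```

Injecting `query` keeps the sketch free of any real DNS library, so the fallback order and the last-known-address rule can be checked without network access.
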



--
This message was sent by Atlassian JIRA
(v6.2#6252)


