dig ds c10r.facebook.com returns SERVFAIL

Discussion:

Laurent Bigonville

2018-09-03 18:12:27 UTC

Hello,

With a bind9 server (I can reproduce this on RHEL7 with 9.9.4, Debian
stable with 9.10.3 and also Debian unstable with 9.11.4), when doing "dig
ds c10r.facebook.com @10.122.17.186", I get a SERVFAIL.

I get this whether bind is acting as a forwarder or as a recursive
server. In the forwarder case, if I send the query directly to the
servers it forwards to, it works fine.

I get the same result if I'm doing the query on the server at
https://www.dns-oarc.net/oarc/services/odvr

I've attached a capture of the network traffic when bind is acting as a
forwarder (the "forward first" option is set, which is why you also see
a query to the Facebook server).

Any idea?

Kind regards,

Laurent Bigonville

Tony Finch

2018-09-03 19:03:50 UTC

With bind9 server (I can reproduce that on RHEL7 with 9.9.4, debian stable
with 9.10.3 and also debian unstable with 9.11.4) when doing "dig ds

This is because the authoritative servers for facebook.com do not
implement any DNSSEC, so they don't know that DS records are found on the
parent side of a zone cut, so they return a referral instead of a negative
answer. BIND treats this as a server failure, and does not attempt to work
around the antediluvian ignorance of the auth servers. In practice it
shouldn't matter since there shouldn't be any signed zones underneath a
server that doesn't know about DNSSEC.
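
The difference between a negative answer and a referral can be shown with a minimal Python sketch. The `Reply` type and `classify_ds_reply` function are hypothetical names made up for this illustration and are not BIND internals; the sketch assumes the reply has already been parsed into an rcode, the RR types in the answer section, and (owner, type) pairs from the authority section.

```python
from dataclasses import dataclass, field


@dataclass
class Reply:
    """Hypothetical, simplified view of a parsed DNS reply."""
    rcode: str                                        # e.g. "NOERROR", "NXDOMAIN"
    answer_types: list = field(default_factory=list)  # RR types in the answer section
    authority: list = field(default_factory=list)     # (owner, type) pairs


def classify_ds_reply(qname: str, reply: Reply) -> str:
    """Classify the reply to a DS query for qname."""
    qname = qname.rstrip(".").lower()
    if reply.rcode == "NXDOMAIN":
        return "nxdomain"
    if reply.rcode != "NOERROR":
        return "error"
    if "DS" in reply.answer_types:
        return "answer"
    auth = [(owner.rstrip(".").lower(), rtype) for owner, rtype in reply.authority]
    # Negative answer: empty answer section plus a SOA in the authority section.
    if any(rtype == "SOA" for _, rtype in auth):
        return "nodata"
    # Referral to the child: NS records owned by the very name we asked about.
    # A DNSSEC-aware parent-side server would not send this for a DS query.
    if any(owner == qname and rtype == "NS" for owner, rtype in auth):
        return "unexpected-referral"
    return "unknown"
```

Under these assumptions, a reply to the c10r.facebook.com DS query that carries only NS records for c10r.facebook.com classifies as "unexpected-referral", which corresponds to the case that ends in SERVFAIL here.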

Tony.
--
f.anthony.n.finch <***@dotat.at> http://dotat.at/
Viking, North Utsire: Cyclonic, becoming northerly, 3 or 4, occasionally 5 at
first. Slight or moderate. Rain until later. Moderate or poor, occasionally
good later.
_______________________________________________
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

bind-users mailing list
bind-***@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users

Laurent Bigonville

2018-09-03 20:26:35 UTC

Post by Tony Finch

With bind9 server (I can reproduce that on RHEL7 with 9.9.4, debian stable
with 9.10.3 and also debian unstable with 9.11.4) when doing "dig ds

The problem is that systemd-resolved (maybe other software does the
same?) asks for the DS record to check whether the record is supposed to
be signed (well, I think) before trying to do DNSSEC validation on the
client side.

I'm also wondering (and pardon my ignorance), why does bind try all the
forwarders and then the auth server if the first server already replied
with an empty answer and NOERROR?

Tony Finch

2018-09-03 21:38:50 UTC

I am shocked (appalled!) and surprised (gobsmacked!) to learn that systemd-resolved is using an unwise and brittle validation algorithm.

(The Right Thing for a validating stub like systemd-resolved is to concurrently send DS+DNSKEY queries for all the possible zone cuts above the qname, at the same time as the original query, then validate top-down. This costs 1 RTT (or 2 RTT if there are CNAMEs or SRVs etc.) and avoids backwards compatibility problems at the lower levels of the tree. Minimal latency overhead, tho you need to validate as the answers arrive so you can bail out early for insecure domains in order to avoid getting stuck due to servers that drop unknown QTYPEs or other brokenness like c10r. The extra bonus for DNS-over-TCP/TLS/HTTPS is that the concurrent queries can be sent in one go: one TLS record, one write() syscall, one TCP segment.)
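
The query fan-out described above can be sketched in Python. This is an illustration only: `validation_queries` is a hypothetical helper, it sends nothing on the network, and it assumes that every label boundary from the qname up to the root (including the qname itself, which may be a zone apex) is a candidate zone cut.

```python
def validation_queries(qname: str, qtype: str) -> list:
    """Return the (name, type) pairs to send together: the original query
    plus DS and DNSKEY for every candidate zone cut up to the root."""
    labels = [label for label in qname.rstrip(".").lower().split(".") if label]
    queries = [(".".join(labels) + ".", qtype)]
    for i in range(len(labels) + 1):
        name = ".".join(labels[i:]) + "."
        # The root has no parent zone, so it has a DNSKEY RRset but no DS.
        if name != ".":
            queries.append((name, "DS"))
        queries.append((name, "DNSKEY"))
    # Drop duplicates (e.g. when qtype is itself DS) while keeping order.
    return list(dict.fromkeys(queries))
```

For `validation_queries("www.facebook.com", "A")` this yields the A query plus DS and DNSKEY queries for www.facebook.com., facebook.com. and com., and a DNSKEY query for the root. A validator would then process the answers top-down as they arrive and stop as soon as a zone is shown to be insecure.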

I'm also wondering (and pardon my ignorance), why does bind try all the forwarders and then the auth server if the first server already replied with an empty answer and NOERROR?

When BIND is forwarding it just gets SERVFAIL from the upstream resolver, and it can’t tell if that is because the upstream resolver is broken or the authoritative servers, so it tries the next resolver. When BIND is doing iterative resolution talking directly to auth servers, it gives up immediately when it gets the unexpected referral. I don’t know why it doesn’t retry in this case; possibly because the DS logic is a bit special. There aren’t any NOERROR/NODATA responses in this scenario.

Tony.
--
f.anthony.n.finch <***@dotat.at> http://dotat.at

Laurent Bigonville

2018-09-03 22:40:29 UTC

Post by Tony Finch

I am shocked (appalled!) and surprised (gobsmacked!) to learn that systemd-resolved is using an unwise and brittle validation algorithm.

Don't take what I said about the internal working of systemd-resolved
for granted :)

Looking at the log that I initially provided
(https://github.com/systemd/systemd/issues/8897), it seems to revalidate
the complete chain.

Any idea what should be done to fix this then?

Mark Andrews

2018-09-03 23:03:41 UTC

Post by Tony Finch

I am shocked (appalled!) and surprised (gobsmacked!) to learn that systemd-resolved is using an unwise and brittle validation algorithm.

Upgrade Facebook’s DNS servers to ones that are DNSSEC-aware. DS lives in the parent
zone at a zone cut. STD 13 servers don’t know this; DNSSEC-aware ones do.
DS lookups will work where all the recursive servers are DNSSEC-aware and the delegating
servers are DNSSEC-aware. DS lookups will also work for non-delegation points (NODATA
responses are expected) and for nonexistent names (NXDOMAIN).

DS lookups will fail at a delegation point if the delegating server is not DNSSEC aware.
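
The cases listed above can be summarised as a small decision function. This is only a restatement of the rules in this message as a Python sketch; `expected_ds_result` and its parameter names are hypothetical, and it assumes the recursive servers on the path are themselves DNSSEC-aware.

```python
def expected_ds_result(name_exists: bool,
                       is_delegation_point: bool,
                       delegating_server_dnssec_aware: bool) -> str:
    """Expected outcome of a DS lookup, following the rules stated above."""
    if not name_exists:
        return "NXDOMAIN"
    if not is_delegation_point:
        # No zone cut at this name, so there can be no DS RRset.
        return "NODATA"
    if delegating_server_dnssec_aware:
        # Answered from the parent side of the cut: a DS RRset if the
        # child is signed, otherwise a NODATA response.
        return "DS or NODATA"
    # A STD 13 server treats the query as belonging to the child zone.
    return "failure"
```

With these inputs, c10r.facebook.com corresponds to `expected_ds_result(True, True, False)`, the failing case.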

Mark

--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: ***@isc.org

Tony Finch

2018-09-04 09:06:32 UTC

Yes, you are right, I shouldn't have immediately gone for the full blast
of sarcasm without verifying that systemd-resolved deserves it. So I
looked at the log - details below. (Spoiler: my prejudices have been
confirmed.)

Any idea what should be done to fix this then?

Well, the good options are to fix Facebook (as Mark rightly said) and to
fix systemd-resolved. Alternatively you can add negative trust anchors for
broken domains like Facebook.

OK, logs. After a lot of setup faff we have:

16:24:21 Switching to system DNS server 10.200.0.200.

16:24:23 Cache miss for www.facebook.com IN A
16:24:23 Transaction 41850 for <www.facebook.com IN A> scope dns on */*.
16:24:23 Using DNS server 10.200.0.200 for transaction 41850.
16:24:23 Timeout reached on transaction 41850.

That's a remarkably hair-trigger timeout.

16:24:23 Switching to system DNS server 10.122.17.186.
16:24:23 Transaction 41850 for <www.facebook.com IN A> scope dns on */*.
16:24:23 Processing incoming packet on transaction 41850. (rcode=SUCCESS)
16:24:23 Verified we get a response at feature level UDP+EDNS0+DO from DNS server 10.122.17.186.

OK so we know at this point that systemd-resolved is not designed for fast
validation, because it hasn't sent the queries for the validation chain
yet. A big shame for new code.

16:24:23 Requesting parent SOA to validate transaction 41850 (www.facebook.com, unsigned CNAME/DNAME/DS RRset).
16:24:23 Transaction 60936 for <facebook.com IN SOA> scope dns on */*.

Wat? How does a SOA query help anything? There's no point wasting time
looking for zone cuts before you request DNSKEY and DS records, because
the DNSKEY and DS responses tell you where the zone cuts are as a side
effect. This is just a waste of time.

16:24:23 Requesting DS to validate transaction 41850 (c10r.facebook.com, unsigned SOA/NS RRset).
16:24:23 Transaction 36881 for <c10r.facebook.com IN DS> scope dns on */*.
16:24:23 Requesting DS to validate transaction 41850 (c10r.facebook.com, unsigned SOA/NS RRset).

Twice??

16:24:23 Processing incoming packet on transaction 60936. (rcode=SUCCESS)
16:24:23 Requesting DS to validate transaction 60936 (facebook.com, unsigned SOA/NS RRset).
16:24:23 Transaction 35625 for <facebook.com IN DS> scope dns on */*.
16:24:23 Processing incoming packet on transaction 35625. (rcode=SUCCESS)
16:24:23 Requesting DNSKEY to validate transaction 35625 (com, RRSIG with key tag: 36707).

Then there's a lot of upwards validation faff for com and root zones.

16:24:23 Found verdict for lookup facebook.com IN DS: insecure
16:24:23 Added NODATA cache entry for facebook.com IN DS 105s
16:24:23 Transaction 35625 for <facebook.com IN DS> on scope dns on */* now complete with <success> from network (unsigned).
16:24:23 Transaction 60936 for <facebook.com IN SOA> on scope dns on */* now complete with <success> from network (unsigned).

OK so far.

16:24:24 Timeout reached on transaction 36881.
16:24:24 Retrying transaction 36881.

At this point systemd-resolved should have abandoned transaction 36881:
facebook.com is insecure so the c10r DS is immaterial.
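
A minimal Python sketch of that bail-out, under the assumption that auxiliary queries are tracked as (name, type) pairs. The helper names `is_at_or_below` and `abandonable` are hypothetical and do not come from systemd-resolved.

```python
def _labels(name: str) -> list:
    return [label for label in name.rstrip(".").lower().split(".") if label]


def is_at_or_below(name: str, zone: str) -> bool:
    """True if name equals zone or is a descendant of it."""
    n, z = _labels(name), _labels(zone)
    return len(z) <= len(n) and n[len(n) - len(z):] == z


def abandonable(pending: list, insecure_zone: str) -> list:
    """Return the pending DS/DNSKEY queries that no longer matter once
    insecure_zone has been proven insecure (no DS at its parent)."""
    zone = _labels(insecure_zone)
    result = []
    for name, rtype in pending:
        if rtype not in ("DS", "DNSKEY") or not is_at_or_below(name, insecure_zone):
            continue
        # The DS query for the zone itself is what produced the verdict.
        if rtype == "DS" and _labels(name) == zone:
            continue
        result.append((name, rtype))
    return result
```

Once facebook.com is known to be insecure, `abandonable([("c10r.facebook.com", "DS"), ("com", "DNSKEY")], "facebook.com")` returns only the c10r.facebook.com DS query, which is the transaction that should have been dropped here.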

It then spends another 1.5 minutes (!!!) retrying 36881. If you get a
SERVFAIL from one recursive server, it's reasonable to retry on
alternative recursive servers if you have them, but it's almost always
futile to retry against the same server. systemd-resolved needs to give up
way faster.
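
The retry policy suggested above might look like the following Python sketch. It is not how systemd-resolved or BIND is implemented: `query_with_failover` is a hypothetical function, `send_query` is an assumed caller-supplied callable that returns an rcode string for a given server address, and timeouts are left out for brevity.

```python
def query_with_failover(servers, send_query):
    """Ask each configured recursive server at most once.

    Returns (rcode, servers_tried). A SERVFAIL moves on to the next
    server instead of retrying the one that just failed.
    """
    tried = []
    for server in servers:
        if server in tried:
            continue  # never repeat a query against the same server
        tried.append(server)
        rcode = send_query(server)
        if rcode != "SERVFAIL":
            return rcode, tried
    # Every server failed: give up straight away rather than looping.
    return "SERVFAIL", tried
```

With two servers that both return SERVFAIL, this gives up after two queries in total.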

It seems to be using SERVFAIL as a feature negotiation signal. Weirdly,
it doesn't reduce the LARGE buffer size feature on timeout (which would
make sense) but only after it gets the first SERVFAIL response (which
doesn't make sense). It also tries to make a DS query with DO=0 which is
nonsense.

16:25:52 Transaction 36881 for <c10r.facebook.com IN DS> on scope dns on */* now complete with <attempts-max-reached> from network (unsigned).
16:25:52 Auxiliary DNSSEC RR query failed with attempts-max-reached

Sheesh. At long last!

16:25:52 DNSSEC validation failed for question www.facebook.com IN A: failed-auxiliary
16:25:52 Transaction 41850 for <www.facebook.com IN A> on scope dns on */* now complete with <dnssec-failed> from network (unsigned).

WRONG. You already validated it insecure! Good grief.

Tony.
--
f.anthony.n.finch <***@dotat.at> http://dotat.at/
Shannon: Northerly or northwesterly 3 or 4, backing westerly or southwesterly
4 or 5 in northwest. Moderate. Rain later in northwest. Good, occasionally
moderate later in northwest.