While writing a post on the Certificate Authority Authorization (CAA) DNS record, I learned about another DNS thing: a neat hack that makes cache poisoning attacks harder.
First, let’s look at the problem.
Recursive DNS resolvers, like Google’s 8.8.8.8 or the one in your home router, remember the answers to queries they’ve served for some time. In doing so, they make processing much faster, at least for popular domains: when your computer asks for the IP of google.com, most of the time your ISP’s DNS won’t have to go search for an answer, since someone else has probably asked for it already. Combined with the communication mechanisms of DNS, this creates a problem.
DNS communication goes over the UDP protocol, which means that the client doesn’t establish a “session” with the server. It just sends its query into the void of the Internet in the server’s general direction and then hopes for an answer. (I think I’ve heard of DNS-over-TCP but I know nothing about it except that it exists. Probably.)
When the answer finally arrives, there is a bit of a guessing game involved to figure out if it’s trustworthy:
- of course, the client knows which domains it expects answers for and will drop unexpected ones, but large public DNS resolvers re-query the most popular domains as soon as the cache for them expires, so there is almost always a query in flight to answer
- the client can’t trust the source IP of the answer, since an attacker can replace it with the legitimate server’s address (spoof it)
- both request and response contain a transaction ID, but it is only 16 random bits long, which is pretty small and, statistically, easily guessable
And so, we could continuously bombard the client with bogus answers
11:00:00.000 hey, the IPv4 of google.com is 127.0.0.1 [transactionId:0x42]
11:00:00.100 hey, the IPv4 of google.com is 127.0.0.1 [transactionId:0x156]
11:00:00.200 hey, the IPv4 of google.com is 127.0.0.1 [transactionId:0x1]
11:00:00.300 hey, the IPv4 of google.com is 127.0.0.1 [transactionId:0x5211]
…
in hopes that one of them arrives at the exact moment when a query for google.com has been sent but the legitimate answer has not come back yet. This way, we make the resolver remember our malicious answer and serve it to its clients. We’re poisoning its cache.
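To get a feel for how small 16 bits is, here is a back-of-the-envelope sketch. It assumes a simplified model that is mine, not the draft's: the resolver has exactly one outstanding query, every forged answer carries a different transaction ID guess, and nothing else (such as the source port) has to be guessed. The function name `hit_probability` is made up for this illustration.

```python
# Rough odds that a blind spoofer matches the 16-bit transaction ID.
# Simplifying assumptions: one outstanding query, every forged answer
# uses a distinct guess, and the ID is the only thing to guess.
ID_BITS = 16
ID_SPACE = 2 ** ID_BITS  # 65536 possible transaction IDs


def hit_probability(forged_answers: int) -> float:
    """Chance that at least one of `forged_answers` distinct guesses matches."""
    return min(forged_answers, ID_SPACE) / ID_SPACE


for n in (1, 100, 10_000, 65_536):
    print(f"{n:>6} forged answers -> {hit_probability(n):.4%}")
```

Under this model the odds grow linearly with the number of forged answers that land inside the window, which is why a fast attacker and a slow legitimate server make a bad combination.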
One of the ways to make DNS poisoning harder is to add more identifying information to the legitimate answer that is hard for an adversary to predict. That’s what the Use of Bit 0x20 in DNS Labels to Improve Transaction Identity RFC draft is about.
Bit 0x20 Improvement
The core idea is simple and based on two facts:
- while it’s not required by the specification, most DNS servers copy the domain name from the query into the answer as-is
- domain names are case-insensitive, so queries for google.com, googlE.COM, and gOOgle.cOm should return the same result
The authors’ insight was to combine these two facts: a client can send itself information via the server, encoded in the casing of the domain’s letters. For example, instead of
what's the IPv6 of google.com? [transactionId:0x3424]
the client would use a randomly capitalised version of google.com
what's the IPv6 of GOglE.com? [transactionId:0x4385]
so it can verify that the casing in the answer it receives matches the casing in its query.
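The client side of this idea can be sketched in a few lines. This is only an illustration of the principle, not code from the draft or from any real resolver; `randomize_case` and `answer_matches` are hypothetical helper names, and a real implementation would work on the wire-format question section rather than on Python strings.

```python
import secrets


def randomize_case(name: str) -> str:
    """Return `name` with the case of every ASCII letter chosen at random."""
    out = []
    for ch in name:
        if ch.isascii() and ch.isalpha():
            # One random bit per letter decides upper vs. lower case.
            out.append(ch.upper() if secrets.randbits(1) else ch.lower())
        else:
            # Digits, dots and hyphens carry no case, so they pass through.
            out.append(ch)
    return "".join(out)


def answer_matches(sent_name: str, echoed_name: str) -> bool:
    """Accept an answer only if the echoed name matches the query exactly."""
    return sent_name == echoed_name


query_name = randomize_case("google.com")
print(query_name)
# A server that echoes the name as-is passes the check.
print(answer_matches(query_name, query_name))
# A spoofer who guessed the casing wrong does not.
print(answer_matches(query_name, query_name.swapcase()))
```

Each letter contributes one bit the attacker has to guess, so google.com with its 9 letters adds 9 bits on top of the 16-bit transaction ID. Short or digit-heavy names benefit less.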
The difference between lowercase and uppercase in ASCII is in the 6th bit, counting from the least significant one: it is cleared (0) for uppercase and set (1) for lowercase letters. 0x20 is the number with all bits set to 0 except the 6th. That’s why this technique became known as “DNS 0x20.” Apparently, it is also called “mixed-case queries.”
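A quick way to see the bit in question is to print a few letters in binary and flip that bit by hand. The snippet below is a small illustration; `CASE_BIT` and `flip_case` are names I made up for it.

```python
CASE_BIT = 0x20  # 0b0010_0000: only the 6th bit (from the right) is set

# Upper- and lowercase ASCII letters differ in exactly this one bit.
for upper, lower in (("A", "a"), ("G", "g"), ("Z", "z")):
    print(f"{upper} = {ord(upper):08b}   {lower} = {ord(lower):08b}")


def flip_case(ch: str) -> str:
    """Toggle the case of a single ASCII letter by XOR-ing bit 0x20."""
    if not (ch.isascii() and ch.isalpha()):
        return ch  # leave dots, digits and hyphens alone
    return chr(ord(ch) ^ CASE_BIT)


print("".join(flip_case(c) for c in "GOglE.com"))  # goGLe.COM
```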
In The Present
The draft that proposed this, as far as I understand, never became an adopted RFC. So, strictly speaking, servers are not required to preserve the casing from the query in their answers. But anyway, it looks like most do: Google says that more than 70% of their Public DNS’s traffic goes to servers that support mixed-casing.
It’s kind of ironic that 0x20 complicates the implementation of another DNS feature created to enhance its security: DNSSEC. Authoritative servers are supposed to sign their answers, but for some reason, they are required to construct and sign the answer with the domain in lowercase and then return the mixed-case answer plus that signature. Bugs related to this even got mentioned in the recent draft on CAA.
So here you go. Now, when you see a weird hypoTHetIcAl.me DNS query in your Wireshark logs, you’ll know what’s up. 😎