Measuring the Accessibility of Domain Name Encryption and its Impact on Internet Filtering

Nguyen Phong Hoang, Michalis Polychronakis, Phillipa Gill

March 2022

PDF Slides Preprint [PAM '22]

Abstract

Most online communications rely on DNS to map domain names to their hosting IP address(es). Previous work has shown that DNS-based network interference is widespread due to the unencrypted and unauthenticated nature of the original DNS protocol. In addition to DNS, accessed domain names can also be monitored by on-path observers during the TLS handshake when the SNI extension is used. These lingering issues with exposed plaintext domain names have led to the development of a new generation of protocols that keep accessed domain names hidden. DNS-over-TLS (DoT) and DNS-over-HTTPS (DoH) hide the domain names of DNS queries, while Encrypted Server Name Indication (ESNI) encrypts the domain name in the SNI extension. We present DNEye, a measurement system built on top of a network of distributed vantage points, which we used to study the accessibility of DoT/DoH and ESNI, and to investigate whether these protocols are tampered with by network providers (e.g., for censorship). Moreover, we evaluate the efficacy of these protocols in circumventing network interference when accessing content blocked by traditional DNS manipulation. We find evidence of blocking efforts against domain name encryption technologies in several countries, including China, Russia, and Saudi Arabia. At the same time, we discover that domain name encryption can help with unblocking more than 55% and 95% of censored domains in China and other countries where DNS-based filtering is heavily employed.

Type

Conference paper

Publication

The International Conference on Passive and Active Network Measurement

This blog post describes the dataset used in our paper. A README.md file and the dataset can be downloaded from this Google Drive.

0. Vantage Points

DNEye has access to about 20.7K vantage points (VPs) in 85 countries. The cc_asn_vp.csv file lists the number of vantage points we have access to per ASN in each country. We also append each country's Freedom House classification (i.e., Free, Partly Free, or Not Free).

CC, ASN, total_IPs, Freedomhouse_class
AE, AS5384, 176, Not Free

The above sample line shows that we have access to 176 unique vantage point IPs in AS5384 in the United Arab Emirates (AE).
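As a rough illustration of how this file could be consumed, the sketch below sums total_IPs per country code. It assumes the file keeps the header row and the comma-plus-space delimiter shown in the sample above; the function name vps_per_country is made up for this example.

```python
import csv
import io
from collections import defaultdict


def vps_per_country(csv_text):
    """Sum the total_IPs column per country code (CC)."""
    totals = defaultdict(int)
    # skipinitialspace drops the blank that follows each comma in the sample.
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    for row in reader:
        totals[row["CC"]] += int(row["total_IPs"])
    return dict(totals)


# The header and the single sample row shown above.
sample = "CC, ASN, total_IPs, Freedomhouse_class\nAE, AS5384, 176, Not Free\n"
print(vps_per_country(sample))  # {'AE': 176}
```

For the real file, the same function can be called on the contents of cc_asn_vp.csv read with open(...).read().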

1. Test List

We first look for domains reported as censored in the past 30 days by three active censorship measurement platforms (ICLab, OONI, and Censored Planet), and visit them to confirm that they are online. We only consider domains that are censored in at least two ASes per country and reported by at least two platforms. This helps eliminate unreliable data points that could have been caused by generic network errors rather than network interference. This process yields 1.5K domains that are commonly reported as censored by these measurement platforms in 77 countries where we have VPs. The total_urls_tested.csv file contains the list of all unique URLs on which our platform performed measurements.
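The selection rule above can be sketched as follows. This is only one reading of the rule (both thresholds are applied per country and domain), and the input format, a list of (platform, country, ASN, domain) tuples, is an assumption made for this example rather than the format used by ICLab, OONI, or Censored Planet.

```python
from collections import defaultdict


def select_test_domains(reports, min_ases=2, min_platforms=2):
    """Keep domains seen censored in >= min_ases ASes of a country
    and reported by >= min_platforms platforms for that country."""
    ases = defaultdict(set)
    platforms = defaultdict(set)
    for platform, cc, asn, domain in reports:
        ases[(cc, domain)].add(asn)
        platforms[(cc, domain)].add(platform)
    selected = {
        domain
        for (cc, domain) in ases
        if len(ases[(cc, domain)]) >= min_ases
        and len(platforms[(cc, domain)]) >= min_platforms
    }
    return sorted(selected)


# Hypothetical reports, for illustration only.
reports = [
    ("OONI", "CN", "AS4837", "example.org"),
    ("ICLab", "CN", "AS4134", "example.org"),
    ("OONI", "CN", "AS4837", "example.net"),
    ("OONI", "CN", "AS4134", "example.net"),
]
print(select_test_domains(reports))  # ['example.org']
```

Here example.net is dropped because only one platform reports it, even though it appears in two ASes.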

2. Measurements

2.1 DNS Based Measurements

DNEye issues DNS queries for the domains in our test list. These measurements allow us not only to obtain an updated view of DNS-based network interference across network locations, but also to determine whether any filtering systems block DoT/DoH (hereafter DoTH) resolvers via DNS tampering. DNEye sends DNS queries to both public DNS resolvers (e.g., Google and Cloudflare) and the local DNS resolver configured by each VP’s network provider.

We provide a dns_YYYY-MM-DD.json file for each day of our measurements, containing the detailed information parsed from the packet captures of the DNS measurements. The following is an example entry of this file.

{
  "domain": "ordns.he.net",
  "qtype": "A",
  "cc": "CN",
  "asn": "4837",
  "agg_asn": {
    "6939": [
      21735,
      63,
      383
    ],
    "3356": [
      83,
      1,
      3
    ],
    "4768": [
      66,
      1,
      3
    ],
    "RESERVED_243.185.187.39": [
      74,
      1,
      3
    ],
    "4766": [
      76,
      1,
      3
    ],
    "16509": [
      10,
      1,
      1
    ],
    "ServFail": [
      393,
      1,
      25
    ],
    "PRIVATE_10.101.101.100": [
      4,
      1,
      1
    ]
  },
  "true_asns": [
    "6939"
  ],
  "control_res": [
    "6939"
  ],
  "resolver_results": {
    "local1": {
      "port1": {
        "censored": true,
        "reason": "Aggregation",
        "ip_asn": {
          "4766": 1
        }
      },
      "port2": {
        "censored": true,
        "reason": "Aggregation",
        "ip_asn": {
          "4766": 1
        }
      },
      "port3": {
        "censored": true,
        "reason": "Aggregation",
        "ip_asn": {
          "3356": 1
        }
      }
    }
  }
}

Each line has a dictionary with the following keys:

domain specifies the domain name from the DNS request.
qtype specifies the query type (i.e., A, AAAA).
cc and asn specify the country and ASN of the vantage point.
agg_asn is itself a dictionary that aggregates the response ASNs observed across all our vantage points. Its keys are the response ASes (the ASNs of the response IPs) we observe when resolving this domain from all our vantage points. Each value is a three-element list: [the number of measurements that returned an IP in that ASN, the number of countries in which we received an IP in that ASN, the number of (vantage point, ASN) pairs for which we received an IP in that ASN]. For example, “4768”: [66, 1, 3] shows that 66 measurements returned a response IP in AS4768 as a result of our DNS query, that these vantage points were in 1 country, and that they covered a total of 3 (vantage point, ASN) pairs.
true_asns is the list of response ASNs that we believe are legitimate for that domain name. We derive it from agg_asn: a response ASN seen from more than one country is an indication that it is a legitimate ASN for that domain.
control_res specifies the response ASN we received from our control node when resolving the domain name.
resolver_results is the final key in this dictionary. Its keys are the resolvers we issue the queries to (these can be local resolvers, whose IPs are anonymized, or public resolvers such as 8.8.8.8). For each resolver, there is one entry per DNS query, keyed by port. censored specifies whether we believe the DNS query was tampered with, reason states why we flagged the DNS response as censored, and ip_asn maps each response AS to the number of response IPs we observed in it.
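To make the field descriptions concrete, here is a minimal sketch that applies the "seen from more than one country" rule to agg_asn and lists the queries flagged as censored. It simplifies the heuristic described above and is not the exact logic DNEye uses; skipping non-numeric keys such as ServFail or RESERVED_... is an assumption, and the function names are invented for this example.

```python
def infer_true_asns(agg_asn, min_countries=2):
    """Return ASNs whose responses were seen in at least min_countries countries."""
    return sorted(
        asn
        for asn, (hits, countries, pairs) in agg_asn.items()
        # Assumption: labels like "ServFail" or "PRIVATE_..." are not real ASNs.
        if asn.isdigit() and countries >= min_countries
    )


def censored_queries(entry):
    """Map (resolver, port) -> reason for every query flagged as censored."""
    flagged = {}
    for resolver, ports in entry["resolver_results"].items():
        for port, result in ports.items():
            if result["censored"]:
                flagged[(resolver, port)] = result["reason"]
    return flagged


# Abbreviated version of the example entry shown above.
entry = {
    "domain": "ordns.he.net",
    "agg_asn": {
        "6939": [21735, 63, 383],
        "3356": [83, 1, 3],
        "RESERVED_243.185.187.39": [74, 1, 3],
        "ServFail": [393, 1, 25],
    },
    "resolver_results": {
        "local1": {
            "port1": {"censored": True, "reason": "Aggregation", "ip_asn": {"4766": 1}},
        }
    },
}
print(infer_true_asns(entry["agg_asn"]))  # ['6939']
print(censored_queries(entry))  # {('local1', 'port1'): 'Aggregation'}
```

If the files store one JSON dictionary per line, as described above, each line can be turned into such an entry with json.loads.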

2.2 DoT/DoH Based Measurements

DNEye then uses the kdig command to send encrypted DNS queries to 71 DoTH resolvers to resolve a control domain for which we know the correct answer. This test checks whether each DoTH server returns the correct IP of our control domain. We provide a doth_YYYY-MM-DD.json file for each day of our measurements, containing detailed information about the DoT/DoH measurements. If a DoT/DoH query succeeded and returned the correct IP of our control domain, we store an entry like the following:

{
  "vp_cc": "BY",
  "vp_asn": "42772",
  "doth_server": "dnses.alekberg.net",
  "port": 443,
  "success": true
}

vp_cc and vp_asn contain the country and AS information of the vantage point.
doth_server contains the domain name of the DoT/DoH server we use to issue our query and port indicates the port on which we connected to the DoT/DoH server.

If our query was not successful, we analyze the packet capture to identify the cause of the failure. In that case we store the following line:

{
  "vp_cc": "CN",
  "vp_asn": "37963",
  "doth_server": "jp.tiarap.org",
  "port": 443,
  "success": false,
  "kdig_err": ";; WARNING: TLS, handshake failed (Error in the\n pull function.)",
  "packets": {
    "104.21.65.60:443|48": {
      "doth_ip": "104.21.65.60",
      "tcp_stream": "48",
      "tcp_hs": true,
      "tls_hs": false,
      "packet_state": [
        [
          "C:TCP:S",
          1
        ],
        [
          "S:TCP:AS",
          1
        ],
        [
          "C:TCP:A",
          1
        ],
        [
          "C:TLSv1.2:Handshake/CLIENT_HELLO",
          1
        ],
        [
          "S:TCP:AR",
          4
        ],
        [
          "S:TCP:AF",
          1
        ],
        [
          "S:TCP:AR",
          1
        ],
        [
          "C:TCP:R",
          1
        ],
        [
          "S:TCP:AR",
          1
        ],
        [
          "S:TCP:R",
          3
        ]
      ],
      "tcp_anomaly": [
        false,
        null
      ],
      "tls_anomaly": [
        true,
        "S:TCP:AR"
      ],
      "sni": "doh.tiarap.org"
    }
  }
}

As in the previous case, vp_cc and vp_asn contain the country and AS information of the vantage point, and doth_server and port contain the DoTH server's domain name and port.

kdig_err contains the error printed by the kdig command. packets is a dictionary whose keys have the form [IP of the DoTH server:port|TCP stream] and whose values contain the following fields:

doth_ip is the IP of the DoTH server, tcp_stream is the TCP stream number of this connection, and tcp_hs and tls_hs indicate whether the TCP and TLS handshakes were successful. Finally, packet_state holds the detailed packet information of the connection as a list. Each element is a pair of (descriptor, number of packets), where the descriptor is a colon-separated string with three fields. The first field indicates whether the packet originated from the client (C) or the server (S). The second field indicates whether the packet is a TCP-level packet, such as those exchanged before the TLS handshake (shown as TCP), or belongs to the TLS connection (shown as the TLS version, e.g., TLSv1.2). The last field contains the TCP flags associated with the packet (e.g., AR indicates that the ACK and RST flags are set) or, for TLS packets, the message type (e.g., Handshake/CLIENT_HELLO).

tcp_anomaly and tls_anomaly show whether we observed an anomaly in the TCP or TLS handshake. Each is a two-element list whose first element is a boolean; in the example above, the second element of tls_anomaly is the descriptor S:TCP:AR. Finally, sni contains the Server Name Indication value sent in the TLS connection to the DoTH server.
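The packet_state encoding can be decoded with a few lines of code. The sketch below splits each descriptor into its three fields and looks for the first server-side TCP packet with the RST flag set after the client's ClientHello, which is the pattern visible in the failed example above. This is one illustrative heuristic, not necessarily how tls_anomaly is computed in DNEye, and the function names are made up for this example.

```python
def decode_state(state):
    """Split 'C:TLSv1.2:Handshake/CLIENT_HELLO' into (side, layer, detail)."""
    side, layer, detail = state.split(":", 2)
    return side, layer, detail


def server_reset_after_client_hello(packet_state):
    """Return the first server TCP descriptor carrying RST after the ClientHello."""
    seen_hello = False
    for state, count in packet_state:
        side, layer, detail = decode_state(state)
        if side == "C" and detail.endswith("CLIENT_HELLO"):
            seen_hello = True
        elif seen_hello and side == "S" and layer == "TCP" and "R" in detail:
            return state
    return None


# First packets of the failed example shown above.
packet_state = [
    ["C:TCP:S", 1],
    ["S:TCP:AS", 1],
    ["C:TCP:A", 1],
    ["C:TLSv1.2:Handshake/CLIENT_HELLO", 1],
    ["S:TCP:AR", 4],
    ["S:TCP:AF", 1],
]
print(decode_state(packet_state[3][0]))  # ('C', 'TLSv1.2', 'Handshake/CLIENT_HELLO')
print(server_reset_after_client_hello(packet_state))  # S:TCP:AR
```

For this example the result matches the descriptor stored in tls_anomaly, but that agreement should not be assumed for every entry.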

2.3 ESNI adoption

We provide the list of domains that support ESNI in esni_supported_domaints.txt. These are the domains from the TLD zone files that respond with their ESNI key to our TXT query for _esni.domain.

2.4 ESNI filtering measurements

In this measurement, we attempt to connect from our vantage points to our control domain, which supports ESNI. Our control domain returns the IP of the client, and we use this information to determine whether ESNI is blocked from that VP. We provide an esni_YYYY-MM-DD.json file for each day of our measurements, containing the information about the ESNI measurements. The following are example entries of this file.

{"cc": "ID", "asn": "9341", "date": "2020-11-12", "success": false}
{"cc": "ID", "asn": "17451", "date": "2020-11-12", "success": true}

cc and asn contain the country and AS information of the vantage point, date contains the date of the measurement, and success indicates whether the ESNI connection was successful.
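As a small usage sketch, the snippet below computes the fraction of successful ESNI connections per country. It assumes the file stores one JSON object per line, as in the two example entries; the function name esni_success_by_country is made up for this example.

```python
import json
from collections import defaultdict


def esni_success_by_country(lines):
    """Return {cc: fraction of records with success == true}."""
    counts = defaultdict(lambda: [0, 0])  # cc -> [successes, total]
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        counts[record["cc"]][1] += 1
        if record["success"]:
            counts[record["cc"]][0] += 1
    return {cc: ok / total for cc, (ok, total) in counts.items()}


# The two example entries shown above.
sample = [
    '{"cc": "ID", "asn": "9341", "date": "2020-11-12", "success": false}',
    '{"cc": "ID", "asn": "17451", "date": "2020-11-12", "success": true}',
]
print(esni_success_by_country(sample))  # {'ID': 0.5}
```

Grouping by (cc, asn) instead of cc alone would show whether blocking differs between networks within the same country.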