Most online communications rely on DNS to map domain names to their hosting IP address(es). Previous work has shown that DNS-based network interference is widespread due to the unencrypted and unauthenticated nature of the original DNS protocol. In addition to DNS, accessed domain names can also be monitored by on-path observers during the TLS handshake when the SNI extension is used. These lingering issues with exposed plaintext domain names have led to the development of a new generation of protocols that keep accessed domain names hidden. DNS-over-TLS (DoT) and DNS-over-HTTPS (DoH) hide the domain names of DNS queries, while Encrypted Server Name Indication (ESNI) encrypts the domain name in the SNI extension. We present DNEye, a measurement system built on top of a network of distributed vantage points, which we used to study the accessibility of DoT/DoH and ESNI, and to investigate whether these protocols are tampered with by network providers (e.g., for censorship). Moreover, we evaluate the efficacy of these protocols in circumventing network interference when accessing content blocked by traditional DNS manipulation. We find evidence of blocking efforts against domain name encryption technologies in several countries, including China, Russia, and Saudi Arabia. At the same time, we discover that domain name encryption can help unblock more than 55% and 95% of censored domains in China and in other countries where DNS-based filtering is heavily employed, respectively.
This blog post describes the dataset used in our paper. A README.md file and the dataset can be downloaded from this Google Drive.
DNEye has access to about 20.7K vantage points (VPs) in 85 countries. The cc_asn_vp.csv file lists the number of vantage points we have access to per ASN in each country. We also append each country's Freedom House classification (i.e., Free, Partly Free, or Not Free).
CC, ASN, total_IPs, Freedomhouse_class
AE, AS5384, 176, Not Free
The above sample line shows that we have access to 176 unique vantage point IPs in AS5384 in the United Arab Emirates (AE).
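As a minimal sketch of how this file might be consumed, the following Python snippet sums the vantage point IPs per country. It assumes the column names shown in the sample header and the comma-plus-space separator of the sample row; the function name vps_per_country is ours and is not part of the dataset.

```python
import csv
import io
from collections import defaultdict


def vps_per_country(csv_file):
    """Sum the total_IPs column per country code (CC)."""
    totals = defaultdict(int)
    # skipinitialspace drops the blank that follows each comma in the sample.
    reader = csv.DictReader(csv_file, skipinitialspace=True)
    for row in reader:
        totals[row["CC"]] += int(row["total_IPs"])
    return dict(totals)


# The header and sample row shown above, used here as stand-in input.
sample = io.StringIO(
    "CC, ASN, total_IPs, Freedomhouse_class\n"
    "AE, AS5384, 176, Not Free\n"
)
print(vps_per_country(sample))
```

To run it on the real file, pass an open file handle instead, e.g. open("cc_asn_vp.csv", newline="").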
We first look for domains reported as censored by three active censorship
measurement platforms, namely, ICLab, OONI, and Censored Planet, in the past 30
days and visit them to confirm their online status. We consider domains that are
censored in at least two ASes per country, and reported by at least two
platforms. This helps in eliminating unreliable data points that could have been
caused by generic network errors instead of network interference. To that end,
we obtain 1.5K domains that are commonly reported as censored by these
measurement platforms in 77 countries where we have VPs. The total_urls_tested.csv file contains the list of all unique URLs on which our platform performed measurements.
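The selection rule described above (at least two ASes per country and at least two platforms) can be sketched as follows. This is an illustration only, not the code we ran: the (platform, cc, asn, domain) record layout is hypothetical, the ASN labels are placeholders, and we assume here that the two-platform requirement is checked per country.

```python
from collections import defaultdict


def select_test_domains(reports, min_ases=2, min_platforms=2):
    """Return sorted (cc, domain) pairs that pass both thresholds.

    `reports` is an iterable of (platform, cc, asn, domain) tuples;
    this layout is an assumption made for the example.
    """
    ases = defaultdict(set)       # (cc, domain) -> ASNs reporting censorship
    platforms = defaultdict(set)  # (cc, domain) -> platforms reporting it
    for platform, cc, asn, domain in reports:
        ases[(cc, domain)].add(asn)
        platforms[(cc, domain)].add(platform)
    return sorted(
        key for key in ases
        if len(ases[key]) >= min_ases and len(platforms[key]) >= min_platforms
    )


# Made-up reports: only example.com meets both thresholds.
reports = [
    ("OONI", "CN", "AS-A", "example.com"),
    ("ICLab", "CN", "AS-B", "example.com"),
    ("OONI", "CN", "AS-A", "example.org"),
    ("OONI", "CN", "AS-B", "example.org"),
]
print(select_test_domains(reports))
```

Requiring agreement across ASes and platforms is what filters out one-off failures caused by generic network errors.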
DNEye issues DNS queries for the domains in our test list. These measurements allow us not only to obtain an updated view of DNS-based network interference across network locations, but also to determine whether any filtering systems block DoT/DoH (DoTH) resolvers via DNS tampering. DNEye sends DNS queries to both public DNS resolvers (e.g., Google and Cloudflare) and the local DNS resolver configured by each VP’s network provider.
We provide a dns_YYYY-MM-DD.json file for each day of our measurements, which contains the detailed information parsed from the packet captures taken during the DNS measurements. The following is an example entry from this file.
{
"domain": "ordns.he.net",
"qtype": "A",
"cc": "CN",
"asn": "4837",
"agg_asn": {
"6939": [
21735,
63,
383
],
"3356": [
83,
1,
3
],
"4768": [
66,
1,
3
],
"RESERVED_243.185.187.39": [
74,
1,
3
],
"4766": [
76,
1,
3
],
"16509": [
10,
1,
1
],
"ServFail": [
393,
1,
25
],
"PRIVATE_10.101.101.100": [
4,
1,
1
]
},
"true_asns": [
"6939"
],
"control_res": [
"6939"
],
"resolver_results": {
"local1": {
"port1": {
"censored": true,
"reason": "Aggregation",
"ip_asn": {
"4766": 1
}
},
"port2": {
"censored": true,
"reason": "Aggregation",
"ip_asn": {
"4766": 1
}
},
"port3": {
"censored": true,
"reason": "Aggregation",
"ip_asn": {
"3356": 1
}
}
}
}
}
Each line holds a dictionary with the following keys:
domain specifies the domain name from the DNS request.
qtype specifies the query type (i.e., A or AAAA).
cc and asn specify the country and ASN of the vantage point.
agg_asn is itself a dictionary that aggregates the response ASNs from all our vantage points. Its keys are the different response ASes (the ASNs of the response IPs) that we observe when resolving this domain from all our vantage points. Each value is a list of three numbers: [the number of measurements that returned an IP in that ASN, the number of countries in which we received an IP in that ASN, the number of (vantage point, ASN) pairs for which we received an IP in that ASN]. For example, "4768": [66, 1, 3] shows that 66 measurements returned a response IP in AS4768, that the vantage points involved were all in 1 country, and that there were 3 (vantage point, ASN) pairs in total.
true_asns is the list of response ASNs that we believe are legitimate for that domain name. We derive it from agg_asn: a response ASN seen from more than one country is an indication that it is a legitimate ASN for that domain.
control_res specifies the response ASN we received from our control node when resolving the domain name.
resolver_results is the final key in this dictionary. Its keys are the resolvers to which we issue the queries (either local resolvers, whose IPs are anonymized, or public resolvers such as 8.8.8.8). For each resolver, there is one entry per DNS query, keyed by port. Within each entry, censored specifies whether we believe the DNS query was tampered with, reason states why we flagged the DNS response as censored, and ip_asn gives the response ASes together with the number of response IPs we observed in each.
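The sketch below shows one way to work with a DNS entry in Python. It assumes each line of the file is one JSON object, re-applies the "seen from more than one country" heuristic to agg_asn, and lists the queries flagged as censored. The function names are ours, the sample is a trimmed copy of the entry above, and skipping non-numeric agg_asn keys (such as ServFail or the RESERVED_ and PRIVATE_ labels) is our assumption about how such keys should be treated.

```python
import json


def multi_country_asns(agg_asn):
    """Return response ASNs that were observed from more than one country."""
    return sorted(
        asn for asn, (_hits, n_countries, _pairs) in agg_asn.items()
        # Assumption: only purely numeric keys are real AS numbers.
        if asn.isdigit() and n_countries > 1
    )


def censored_queries(entry):
    """List (resolver, port, reason, ip_asn) for every flagged query."""
    flagged = []
    for resolver, ports in entry["resolver_results"].items():
        for port, result in ports.items():
            if result["censored"]:
                flagged.append(
                    (resolver, port, result["reason"], result["ip_asn"])
                )
    return flagged


# A trimmed copy of the example entry, serialized as a single line.
line = json.dumps({
    "domain": "ordns.he.net", "qtype": "A", "cc": "CN", "asn": "4837",
    "agg_asn": {"6939": [21735, 63, 383], "4768": [66, 1, 3],
                "ServFail": [393, 1, 25]},
    "true_asns": ["6939"], "control_res": ["6939"],
    "resolver_results": {"local1": {
        "port1": {"censored": True, "reason": "Aggregation",
                  "ip_asn": {"4766": 1}},
        "port3": {"censored": True, "reason": "Aggregation",
                  "ip_asn": {"3356": 1}},
    }},
})

entry = json.loads(line)
print(multi_country_asns(entry["agg_asn"]))
for flagged in censored_queries(entry):
    print(flagged)
```

On this sample the heuristic recovers the same ASN that the entry lists under true_asns.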
DNEye then uses the kdig command to send encrypted DNS queries to 71 DoTH resolvers to resolve a control domain for which we know the correct answer. This test checks whether each DoTH server returns the correct IP of our control domain. We provide a doth_YYYY-MM-DD.json file for each day of our measurements, which contains the detailed information about the DoT/DoH measurements. If a DoT/DoH query succeeded and we received the correct IP of our control domain, we store an entry like the following:
{
"vp_cc": "BY",
"vp_asn": "42772",
"doth_server": "dnses.alekberg.net",
"port": 443,
"success": true
}
vp_cc and vp_asn contain the country and AS information of the vantage point. doth_server contains the domain name of the DoT/DoH server to which we issue our query, and port indicates the port on which we connected to that server.
If our query was not successful, we analyze the packet capture to identify the cause of the failure. In that case we store an entry like the following:
{
"vp_cc": "CN",
"vp_asn": "37963",
"doth_server": "jp.tiarap.org",
"port": 443,
"success": false,
"kdig_err": ";; WARNING: TLS, handshake failed (Error in the\n pull function.)",
"packets": {
"104.21.65.60:443|48": {
"doth_ip": "104.21.65.60",
"tcp_stream": "48",
"tcp_hs": true,
"tls_hs": false,
"packet_state": [
[
"C:TCP:S",
1
],
[
"S:TCP:AS",
1
],
[
"C:TCP:A",
1
],
[
"C:TLSv1.2:Handshake/CLIENT_HELLO",
1
],
[
"S:TCP:AR",
4
],
[
"S:TCP:AF",
1
],
[
"S:TCP:AR",
1
],
[
"C:TCP:R",
1
],
[
"S:TCP:AR",
1
],
[
"S:TCP:R",
3
]
],
"tcp_anomaly": [
false,
null
],
"tls_anomaly": [
true,
"S:TCP:AR"
],
"sni": "doh.tiarap.org"
}
}
}
As in the previous case, vp_cc and vp_asn contain the country and AS information of the vantage point, and doth_server and port contain the DoTH server's domain name and port.
kdig_err records the error that was printed when running the kdig command.
packets is a dictionary whose keys have the form [IP of the DoTH server:port|TCP stream]. Each value contains the following fields:
doth_ip is the IP of the DoTH server, and tcp_stream is the TCP stream number of this connection.
tcp_hs and tls_hs indicate whether the TCP and TLS handshakes were successful.
packet_state holds the detailed packet information of the connection in the form of a list. Each item is a pair of (packet descriptor, number of packets), where the descriptor is a colon-separated three-tuple. The first field indicates whether the packet originated from the client (C) or the server (S). The second field indicates whether the packet precedes the TLS handshake (shown as TCP) or belongs to the TLS connection (shown as TLS). The last field contains the TCP flags associated with that packet (e.g., AR indicates that the ACK and RST flags are set); for the TLS packet in the example above, it holds the TLS message type instead (Handshake/CLIENT_HELLO).
tcp_anomaly and tls_anomaly show whether we observed an anomaly in the TCP or TLS handshake.
sni contains the Server Name Indication field of the TLS connection to the DoTH server.
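To make the packet_state encoding concrete, here is a small Python sketch that splits each descriptor into its three fields and locates the first server-side TCP packet carrying a reset flag. The helper names are hypothetical, and treating any R in the flag string as a reset is our reading of the flag notation rather than something the dataset guarantees.

```python
def parse_packet_state(label):
    """Split a descriptor such as 'S:TCP:AR' into its three fields."""
    sender, protocol, detail = label.split(":", 2)
    return {
        "from_client": sender == "C",  # 'C' = client, 'S' = server
        "protocol": protocol,          # 'TCP' or e.g. 'TLSv1.2'
        "detail": detail,              # TCP flags, or the TLS message type
    }


def first_server_reset(packet_state):
    """Return the index of the first server TCP packet with RST set."""
    for index, (label, _count) in enumerate(packet_state):
        packet = parse_packet_state(label)
        if (not packet["from_client"]
                and packet["protocol"] == "TCP"
                and "R" in packet["detail"]):
            return index
    return None


# The first five items of packet_state from the failed entry above.
packet_state = [
    ["C:TCP:S", 1],
    ["S:TCP:AS", 1],
    ["C:TCP:A", 1],
    ["C:TLSv1.2:Handshake/CLIENT_HELLO", 1],
    ["S:TCP:AR", 4],
]
index = first_server_reset(packet_state)
print(index, packet_state[index - 1][0])
```

In this example the reset arrives directly after the client's ClientHello, which matches the entry's tls_hs value of false and the S:TCP:AR packet recorded under tls_anomaly.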
We provide the list of domains that support ESNI in esni_supported_domaints.txt. This is the set of domains from the TLD zone files that respond to our ESNI TXT query for _esni.domain with their ESNI key.
In this measurement, we attempt to connect from our vantage points to our control domain, which supports ESNI. The control domain prints the IP of the client, and we use this information to determine whether ESNI is blocked from that VP. We provide an esni_YYYY-MM-DD.json file for each day of our measurements, which contains the information about the ESNI measurements. The following are example entries from this file.
{"cc": "ID", "asn": "9341", "date": "2020-11-12", "success": false}
{"cc": "ID", "asn": "17451", "date": "2020-11-12", "success": true}
cc and asn contain the country and AS information of the vantage point, date contains the date of the measurement, and success indicates whether the ESNI connection was successful.
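A minimal sketch for summarizing these entries, assuming one JSON object per line as in the example above: it computes the fraction of successful ESNI connections per (country, ASN) pair. The function name esni_success_rate is ours.

```python
import json
from collections import defaultdict


def esni_success_rate(lines):
    """Map (cc, asn) to the fraction of successful ESNI connections."""
    counts = defaultdict(lambda: [0, 0])  # (cc, asn) -> [successes, total]
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        entry = json.loads(line)
        tally = counts[(entry["cc"], entry["asn"])]
        tally[0] += int(entry["success"])
        tally[1] += 1
    return {key: ok / total for key, (ok, total) in counts.items()}


# The two example entries shown above.
sample_lines = [
    '{"cc": "ID", "asn": "9341", "date": "2020-11-12", "success": false}',
    '{"cc": "ID", "asn": "17451", "date": "2020-11-12", "success": true}',
]
for key, rate in sorted(esni_success_rate(sample_lines).items()):
    print(key, rate)
```

The same function accepts an open file handle for a daily file, since iterating over a file yields its lines.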