Machine Learning | Nguyen Phong Hoang

Toward Automated DNS Tampering Detection Using Machine Learning

Abstract: DNS manipulation is one of the most prevalent and effective techniques for censoring Internet access and interfering with users' online activities worldwide. Reliable detection of DNS tampering is crucial, but challenging due to evolving censorship tactics and the lack of complete ground truth data. In this paper, we demonstrate the power of machine learning (ML) in addressing these challenges by applying supervised and unsupervised models to recent global DNS measurement data collected by the Open Observatory of Network Interference (OONI).

DeResistor: Toward Detection-Resistant Probing for Evasion of Internet Censorship

Abstract: The arms race between Internet freedom advocates and censors has catalyzed the emergence of sophisticated blocking techniques and directed significant research emphasis toward the development of automated censorship measurement and evasion tools based on packet manipulation. However, we observe that when state-of-the-art evasion tools probe censorship middleboxes, the probing process can be easily fingerprinted by censors, necessitating detection-resilient probing techniques. We validate our hypothesis by developing a real-time detection approach that utilizes machine learning (ML) to detect flow-level packet manipulation and an algorithm for IP-level detection based on Threshold Random Walk (TRW).
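For readers unfamiliar with Threshold Random Walk, the following is a minimal sketch of the generic sequential hypothesis test behind it, not the DeResistor authors' implementation. It assumes each flow from an IP has already been labeled benign or suspicious (for example by a flow-level classifier), and it walks a log-likelihood ratio until it crosses one of two thresholds. The function name trw_decide, the probabilities theta0 and theta1, and the error targets alpha and beta are all illustrative assumptions.

```python
import math


def trw_decide(observations, theta0=0.8, theta1=0.2, alpha=0.01, beta=0.99):
    """Generic Threshold Random Walk over per-flow labels for one IP.

    observations: iterable of bools, True meaning the flow looked benign.
    theta0: assumed P(benign-looking flow | IP is benign).
    theta1: assumed P(benign-looking flow | IP is a prober).
    alpha:  target false-positive rate; beta: target detection rate.
    Returns (decision, number_of_flows_consumed).
    """
    upper = math.log(beta / alpha)              # accept "prober" at or above this
    lower = math.log((1 - beta) / (1 - alpha))  # accept "benign" at or below this
    step_benign = math.log(theta1 / theta0)                  # negative step
    step_suspicious = math.log((1 - theta1) / (1 - theta0))  # positive step

    llr = 0.0  # running log-likelihood ratio
    seen = 0
    for looks_benign in observations:
        seen += 1
        llr += step_benign if looks_benign else step_suspicious
        if llr >= upper:
            return "prober", seen
        if llr <= lower:
            return "benign", seen
    return "pending", seen  # not enough evidence either way


if __name__ == "__main__":
    print(trw_decide([False] * 10))      # a run of suspicious flows
    print(trw_decide([True] * 10))       # a run of benign-looking flows
    print(trw_decide([True, False] * 3)) # mixed evidence
```

With these assumed parameters, four consecutive flows of the same kind are enough to reach a decision, while evenly mixed evidence keeps the walk between the two thresholds.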

Augmenting Rule-based DNS Censorship Detection at Scale with Machine Learning

Abstract: The proliferation of global censorship has led to the development of a plethora of measurement platforms to monitor and expose it. Censorship of the domain name system (DNS) is a key mechanism used across different countries. It is currently detected by applying heuristics to samples of DNS queries and responses (probes) for specific destinations. These heuristics, however, are platform-specific and have been found to be brittle when censors change their blocking behavior, necessitating a more reliable automated process for detecting censorship.