
Research Article
Unsupervised and Adaptive Tor Website Fingerprinting
@INPROCEEDINGS{10.1007/978-3-031-64954-7_11,
    author={Guoqiang Zhang and Jiahao Cao and Mingwei Xu and Xinhao Deng},
    title={Unsupervised and Adaptive Tor Website Fingerprinting},
    proceedings={Security and Privacy in Communication Networks. 19th EAI International Conference, SecureComm 2023, Hong Kong, China, October 19-21, 2023, Proceedings, Part II},
    proceedings_a={SECURECOMM PART 2},
    year={2024},
    month={10},
    keywords={Tor, Multi-source domain adaptation, Website fingerprinting, Unsupervised, Transfer learning},
    doi={10.1007/978-3-031-64954-7_11}
}
- Guoqiang Zhang
- Jiahao Cao
- Mingwei Xu
- Xinhao Deng
Year: 2024
SECURECOMM PART 2
Springer
DOI: 10.1007/978-3-031-64954-7_11
Abstract
Over the past few years, deep-learning based approaches for Tor website fingerprinting have experienced a significant breakthrough in prediction accuracy. However, many of these approaches suppose that their training and testing datasets share similar distributions, i.e., they belong to the same domain. Unfortunately, this assumption is unrealistic since Tor users’ distinctive environmental settings have exerted diverse influence on website trace generation. Although several recent methods attempt to address this problem by utilizing transfer learning techniques, they assume that the adversary has some of the trace labels for each website class in the testing dataset, which is typically irrational in real-world scenarios. In this paper, we propose a novel Tor website fingerprinting framework called Unsupervised and Adaptive Tor Website Fingerprinting (UAF), which minimizes the distribution discrepancies between the training (denoted as the source domain) and testing (denoted as the target domain) datasets by training a “domain-invariant” feature extractor in an unsupervised manner. UAF employs three trace representations on raw Tor traffic to retain discriminative information for classification and combines multiple source-specific classifiers based on their trace length distributions. The experimental results show that UAF outperforms multiple state-of-the-art Tor website fingerprinting approaches in identifying shifted and unlabeled target domains.
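The abstract does not say how the source-specific classifiers are combined. As an illustration only, the Python sketch below shows one plausible reading: each source domain is weighted by how closely its trace length histogram overlaps the target domain's, and the class probabilities of the per-source classifiers are then averaged with those weights. The function names, the bin width, the number of bins, and the use of histogram intersection are all assumptions made for this example and are not taken from the paper.

```python
from collections import Counter


def length_histogram(lengths, bin_width, num_bins):
    """Normalized histogram of trace lengths; the last bin absorbs overflow."""
    counts = Counter(min(n // bin_width, num_bins - 1) for n in lengths)
    total = len(lengths)
    return [counts.get(b, 0) / total for b in range(num_bins)]


def histogram_overlap(p, q):
    """Histogram intersection in [0, 1]; 1 means identical histograms."""
    return sum(min(a, b) for a, b in zip(p, q))


def source_weights(source_lengths, target_lengths, bin_width=1000, num_bins=10):
    """One weight per source domain, proportional to its overlap with the target.

    bin_width and num_bins are hypothetical defaults chosen for the example.
    """
    target_hist = length_histogram(target_lengths, bin_width, num_bins)
    overlaps = [
        histogram_overlap(length_histogram(s, bin_width, num_bins), target_hist)
        for s in source_lengths
    ]
    total = sum(overlaps)
    if total == 0:
        # No source overlaps the target at all: fall back to uniform weights.
        return [1.0 / len(overlaps)] * len(overlaps)
    return [o / total for o in overlaps]


def combine_predictions(per_source_probs, weights):
    """Weighted average of the class-probability vectors, one per classifier."""
    num_classes = len(per_source_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, per_source_probs))
        for c in range(num_classes)
    ]


# Toy data: two source domains and one unlabeled target domain (made-up lengths).
source_lengths = [[1200, 1500, 1800, 2100], [5200, 5600, 6100, 6400]]
target_lengths = [1300, 1700, 1900, 5500]
weights = source_weights(source_lengths, target_lengths)

# Made-up class probabilities from the two source-specific classifiers.
per_source_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
combined = combine_predictions(per_source_probs, weights)
print(weights, combined)
```

In this toy setup the first source domain receives the larger weight because most target traces fall in the same length bin as its traces. Only trace lengths of the target domain are used, so the weighting needs no target labels, which matches the unsupervised setting described above.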