Security and Privacy in Communication Networks. 13th International Conference, SecureComm 2017, Niagara Falls, ON, Canada, October 22–25, 2017, Proceedings

Research Article

A Deep Learning Based Online Malicious URL and DNS Detection Scheme

  • @INPROCEEDINGS{10.1007/978-3-319-78813-5_22,
        author={Jianguo Jiang and Jiuming Chen and Kim-Kwang Choo and Chao Liu and Kunying Liu and Min Yu and Yongjian Wang},
        title={A Deep Learning Based Online Malicious URL and DNS Detection Scheme},
        proceedings={Security and Privacy in Communication Networks. 13th International Conference, SecureComm 2017, Niagara Falls, ON, Canada, October 22--25, 2017, Proceedings},
        proceedings_a={SECURECOMM},
        year={2018},
        month={4},
        keywords={Network security, Malicious URL detection, Online detection, CNN},
        doi={10.1007/978-3-319-78813-5_22}
    }
    
  • Jianguo Jiang
    Jiuming Chen
    Kim-Kwang Choo
    Chao Liu
    Kunying Liu
    Min Yu
    Yongjian Wang
    Year: 2018
    A Deep Learning Based Online Malicious URL and DNS Detection Scheme
    SECURECOMM
    Springer
    DOI: 10.1007/978-3-319-78813-5_22
Jianguo Jiang¹, Jiuming Chen, Kim-Kwang Choo², Chao Liu¹, Kunying Liu¹, Min Yu*, Yongjian Wang³*
  • 1: Chinese Academy of Sciences
  • 2: University of Texas at San Antonio
  • 3: The Third Research Institute of Ministry of Public Security
*Contact email: yumin@iie.ac.cn, wangyongjian@stars.org.cn

Abstract

URL and DNS are two common attack vectors in malicious network activities; thus, detecting malicious URL and DNS strings is crucial to network security. In this paper, we propose an online detection scheme based on character-level deep neural networks. Specifically, the scheme maps URL and DNS strings into vector form using natural language processing methods. A CNN (Convolutional Neural Network) framework is then designed to automatically extract malicious features and train the classification model. Experimental results on real-world URL and DNS datasets show that the proposed method outperforms several state-of-the-art baseline methods in terms of efficiency and scalability.
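
The abstract does not specify the encoding or the network layout, so the following is only a minimal sketch of the general idea of character-level vectorization followed by a convolutional filter, not the authors' implementation. The alphabet, the `max_len` value, and the helper names `encode`, `one_hot`, `conv1d_max`, and `bigram_filter` are all assumptions made for illustration. In a trained CNN the filter weights would be learned from data; here one filter is built by hand so its behaviour is easy to follow.

```python
import string

# Hypothetical alphabet (an assumption, not taken from the paper):
# lowercase letters, digits, and punctuation.
# Index 0 is reserved for padding and unknown characters.
ALPHABET = string.ascii_lowercase + string.digits + string.punctuation
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(ALPHABET)}
VOCAB_SIZE = len(ALPHABET) + 1


def encode(text, max_len=32):
    """Map a URL or domain string to a fixed-length list of integer indices."""
    indices = [CHAR_TO_INDEX.get(ch, 0) for ch in text.lower()[:max_len]]
    return indices + [0] * (max_len - len(indices))


def one_hot(indices):
    """Expand indices into one-hot rows (sequence_length x VOCAB_SIZE)."""
    return [[1.0 if j == i else 0.0 for j in range(VOCAB_SIZE)] for i in indices]


def conv1d_max(matrix, kernel):
    """Slide one filter (width x VOCAB_SIZE) along the sequence and keep the
    largest non-negative response (ReLU followed by max-over-time pooling)."""
    width = len(kernel)
    best = 0.0  # starting at 0 makes the ReLU implicit
    for start in range(len(matrix) - width + 1):
        window = matrix[start:start + width]
        score = sum(
            w * x
            for k_row, m_row in zip(kernel, window)
            for w, x in zip(k_row, m_row)
        )
        best = max(best, score)
    return best


def bigram_filter(first, second):
    """Hand-built width-2 filter that responds to one character pair.
    A real model would learn such weights during training."""
    return [
        [1.0 if j == CHAR_TO_INDEX[ch] else 0.0 for j in range(VOCAB_SIZE)]
        for ch in (first, second)
    ]


if __name__ == "__main__":
    kernel = bigram_filter(".", "c")
    for name in ("example.com", "example.org"):
        features = one_hot(encode(name))
        print(name, conv1d_max(features, kernel))
```

The filter responds most strongly when both characters of the pair appear in order, which is the kind of local pattern a convolutional layer can pick up without hand-crafted features. A full classifier would stack many learned filters of several widths and feed the pooled responses to a dense output layer.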