Machine Learning and Intelligent Communications. 4th International Conference, MLICOM 2019, Nanjing, China, August 24–25, 2019, Proceedings

Research Article

Robustness Analysis on Natural Language Processing Based AI Q&A Robots

@INPROCEEDINGS{10.1007/978-3-030-32388-2_57,
    author={Chengxiang Yuan and Mingfu Xue and Lingling Zhang and Heyi Wu},
    title={Robustness Analysis on Natural Language Processing Based AI Q\&A Robots},
    proceedings={Machine Learning and Intelligent Communications. 4th International Conference, MLICOM 2019, Nanjing, China, August 24--25, 2019, Proceedings},
    proceedings_a={MLICOM},
    year={2019},
    month={10},
    keywords={AI security, Question and answer robots, Robustness, Natural language processing},
    doi={10.1007/978-3-030-32388-2_57}
}
    
Chengxiang Yuan1,*, Mingfu Xue1,*, Lingling Zhang1,*, Heyi Wu2,*
  • 1: Nanjing University of Aeronautics and Astronautics
  • 2: Southeast University
*Contact email: yuancx@nuaa.edu.cn, mingfu.xue@nuaa.edu.cn, bluezhll@126.com, why1988seu@126.com

Abstract

Recently, natural language processing (NLP) based intelligent question and answering (Q&A) robots have been used in a wide range of applications, such as smart assistants, smart customer service, and government services. However, the robustness and security issues of these NLP based artificial intelligence (AI) Q&A robots have not been studied yet. In this paper, we analyze the robustness problems in current Q&A robots, which involve four aspects: (1) semantic slot settings are incomplete; (2) sensitive words are not filtered efficiently and completely; (3) Q&A robots return search results directly; (4) matching algorithms are unsatisfactory and matching thresholds are set inappropriately. Then, we design and implement two types of evaluation tests, bad language and users' typos, to evaluate the robustness of several state-of-the-art Q&A robots. Experimental results show that these common inputs (bad language and users' typos) can successfully make these Q&A robots malfunction, deny service, or respond with dirty words. Besides, we also propose possible countermeasures to enhance the robustness of these Q&A robots. To the best of the authors' knowledge, this is the first work to analyze the robustness and security problems of intelligent Q&A robots. We hope this work can help provide guidelines for designing robust and secure Q&A robots.
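
As a rough illustration of the typo-based evaluation test mentioned in the abstract, the sketch below perturbs clean queries with single-character typos and checks whether a Q&A robot's answer changes or fails. The ask() stub, the perturbation rules, and all names used here are illustrative assumptions rather than the authors' actual test harness; ask() would need to be wired to the real Q&A service under test.

    # Illustrative typo-robustness probe for a Q&A robot (hypothetical harness).
    import random

    def perturb_with_typo(query: str, rng: random.Random) -> str:
        """Introduce one character-level typo: swap, drop, or duplicate a character."""
        if len(query) < 2:
            return query
        i = rng.randrange(len(query) - 1)
        op = rng.choice(["swap", "drop", "dup"])
        if op == "swap":
            return query[:i] + query[i + 1] + query[i] + query[i + 2:]
        if op == "drop":
            return query[:i] + query[i + 1:]
        return query[:i] + query[i] + query[i:]  # duplicate the character at position i

    def ask(query: str) -> str:
        """Placeholder for the Q&A robot under test; replace with a real API call."""
        return "canned answer for: " + query.strip().lower()

    def probe(queries, trials_per_query=5, seed=0):
        """Compare answers to clean queries with answers to typo-perturbed variants."""
        rng = random.Random(seed)
        report = []
        for q in queries:
            baseline = ask(q)
            for _ in range(trials_per_query):
                noisy = perturb_with_typo(q, rng)
                try:
                    answer = ask(noisy)
                    changed = answer != baseline
                except Exception as err:  # treat crashes as denial of service
                    answer, changed = "<error: %s>" % err, True
                report.append((q, noisy, changed, answer))
        return report

    if __name__ == "__main__":
        for row in probe(["What are your opening hours?"], trials_per_query=3):
            print(row)

A similar loop could inject bad-language tokens instead of typos to cover the other evaluation test; in both cases the signal of interest is how often the robot's answer diverges from the clean baseline or fails outright.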