
Research Article
Adversarial Attack on Scene Text Recognition Based on Adversarial Networks
@INPROCEEDINGS{10.1007/978-3-031-50580-5_5,
  author    = {Liu, Yanju and Yi, Xinhai and Li, Yange and Wang, Bing and Zhang, Huiyu and Liu, Yanzhong},
  title     = {Adversarial Attack on Scene Text Recognition Based on Adversarial Networks},
  booktitle = {Multimedia Technology and Enhanced Learning. 5th EAI International Conference, ICMTEL 2023, Leicester, UK, April 28--29, 2023, Proceedings, Part IV},
  series    = {ICMTEL PART 4},
  publisher = {Springer},
  year      = {2024},
  month     = {2},
  keywords  = {Deep learning, Text recognition, AdvGAN, Adversarial examples, Natural scene},
  doi       = {10.1007/978-3-031-50580-5_5}
}
Yanju Liu
Xinhai Yi
Yange Li
Bing Wang
Huiyu Zhang
Yanzhong Liu
Year: 2024
Adversarial Attack on Scene Text Recognition Based on Adversarial Networks
ICMTEL PART 4
Springer
DOI: 10.1007/978-3-031-50580-5_5
Abstract
Deep learning has further improved the recognition performance of scene text recognition technology, but such systems still face many challenges, such as complex lighting and blurring. Deep learning models have been shown to be vulnerable to subtle noise, so the distortions above can act as adversarial examples that cause text recognition models to make errors. An effective countermeasure is to add adversarial examples to the training set, which makes the study of adversarial attacks worthwhile. Most current attack models rely on manually designed parameters and require repeated gradient computation on the original samples to generate adversarial examples; moreover, most target non-sequential tasks such as classification, and few target sequential tasks such as scene text recognition. This paper reduces the time complexity of generating adversarial examples to O(1) by using an adversarial network to mount a semi-white-box attack on the scene text recognition model, and proposes a new objective function for sequence models. The attack success rates of the adversarial examples on the IC03 and IC13 datasets were 85.28% and 86.98% respectively, while maintaining a structural similarity of over 90% between the original samples and the adversarial samples.
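The O(1) claim refers to inference time: once the generator is trained, each adversarial example needs only a single forward pass, rather than the iterative per-sample gradient loop of FGSM/PGD-style attacks. The sketch below illustrates that one-pass pattern only; it is not the paper's implementation, and `toy_generator` is a hypothetical stand-in for a trained AdvGAN generator.

```python
import math
import random

def toy_generator(x, seed=0):
    """Hypothetical stand-in for a trained perturbation generator G(x).
    tanh keeps the raw perturbation bounded in (-1, 1)."""
    rnd = random.Random(seed)
    return [math.tanh(xi * rnd.uniform(-1.0, 1.0)) for xi in x]

def generate_adversarial(x, generator, eps=0.03):
    """One forward pass per sample (O(1) model evaluations):
    clip the perturbation to an eps-ball, then keep pixels in [0, 1]."""
    delta = [max(-eps, min(eps, d)) for d in generator(x)]
    return [max(0.0, min(1.0, xi + di)) for xi, di in zip(x, delta)]

rnd = random.Random(1)
x = [rnd.uniform(0.0, 1.0) for _ in range(16)]   # flattened toy "image"
x_adv = generate_adversarial(x, toy_generator)
```

Bounding the perturbation (here with a hard clip at `eps`) is what lets the attack keep the adversarial sample visually close to the original, which is what the reported >90% structural similarity measures.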