Journal Dynamics of Complex Systems - XXI Century, No. 3, 2021
Article in issue:
Knowledge distillation of convolutional neural network models on inaccurately labeled data for automatic text CAPTCHA recognition on mobile devices
Type of article: overview article
DOI: 10.18127/j19997493-202103-01
UDC: 004.931
Authors:

V.I. Terekhov1, D.O. Ishkov2

1, 2 Bauman Moscow State Technical University (Moscow, Russia)

Abstract:

Problem definition: The rapid development of deep learning systems, whose recognition quality has reached the level of human vision, makes CAPTCHA-based protection increasingly ineffective. At the same time, such protection places high demands on the characteristics of the devices on which recognition is performed. Most existing works investigate recognition of fixed-length CAPTCHAs; the authors instead propose using knowledge distillation to imitate the behavior of recurrent-convolutional models, which have proven effective at predicting a dynamic number of characters in an image. The research carried out in this work allowed the authors to propose an effective method of training a CNN on inaccurately labeled data for automatic circumvention of text CAPTCHAs on mobile devices.

Purpose: acquiring a lightweight and high-quality model for text CAPTCHA recognition that can work on mobile devices.

Results: The paper describes a method for training a lightweight model on inaccurate labels obtained from another model. The influence of training-sample size on recognition quality, as well as the model's speed on various end devices, is studied using the CAPTCHA of a popular social network as an example.

Practical significance: The proposed method makes it possible to train convolutional models that bypass the text-CAPTCHA protection of websites while remaining undemanding to device characteristics. Analysis of the model's errors allows recommendations to be made for improving countermeasures against automatic recognition.
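The distillation approach summarized above follows the standard recipe of Hinton et al. (reference 19): the lightweight student is trained to match the teacher's temperature-softened output distribution. A minimal sketch in plain Python; the temperature value, logit vectors, and class count here are illustrative assumptions, not figures from the paper:

```python
import math

def softmax(logits, T=1.0):
    """Temperature-softened softmax: higher T flattens the distribution."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """KL divergence from the student to the teacher at temperature T.

    The T**2 factor rescales gradients so the soft-label term keeps the
    same magnitude as a hard-label cross-entropy term (Hinton et al., 2015).
    """
    p = softmax(teacher_logits, T)  # soft targets produced by the teacher
    q = softmax(student_logits, T)  # student's predicted distribution
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return (T ** 2) * kl

# The loss vanishes when the student reproduces the teacher exactly,
# and grows as the two distributions diverge.
loss_same = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
loss_diff = distillation_loss([0.0, 1.0, 2.0], [2.0, 0.5, -1.0])
```

In the per-character setting described in the article, this loss would be applied at each output position of the CNN, using the recurrent-convolutional teacher's (possibly inaccurate) per-position distributions as soft targets.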

Pages: 5–19
For citation

Terekhov V.I., Ishkov D.O. Knowledge distillation of convolutional neural network models on inaccurately labeled data for automatic text CAPTCHA recognition on mobile devices. Dynamics of complex systems. 2021. V. 15. № 3. P. 5–19. DOI: 10.18127/j19997493-202103-01 (In Russian)

References
  1. Klopfenstein L.C. et al. The rise of bots: A survey of conversational interfaces, patterns, and paradigms. Proceedings of the 2017 conference on designing interactive systems. 2017. P. 555–565.
  2. Von Ahn L. et al. CAPTCHA: Using hard AI problems for security. International conference on the theory and applications of cryptographic techniques. Springer, Berlin, Heidelberg, 2003. P. 294–311.
  3. Zhao B. et al. Towards evaluating the security of real-world deployed image CAPTCHAs. Proceedings of the 11th ACM Workshop on Artificial Intelligence and Security. 2018. P. 85–96.
  4. Wang J. et al. CAPTCHA recognition based on deep convolutional neural network. Math. Biosci. Eng. 2019. V. 16. № 5. P. 5851–5861.
  5. Zi Y. et al. An end-to-end attack on text CAPTCHAs. IEEE Transactions on Information Forensics and Security. 2019. V. 15. P. 753–766.
  6. Bursztein E., Martin M., Mitchell J. Text-based CAPTCHA strengths and weaknesses. Proceedings of the 18th ACM conference on Computer and communications security. 2011. P. 125–138.
  7. Golle P. Machine learning attacks against the Asirra CAPTCHA. Proceedings of the 15th ACM conference on Computer and communications security. 2008. P. 535–542.
  8. Sakkatos P. et al. Analysis of text-based CAPTCHA images using Template Matching Correlation technique. The 4th Joint International Conference on Information and Communication Technology, Electronic and Electrical Engineering (JICTEE). IEEE. 2014. P. 1–5.
  9. Ye G. et al. Yet another text CAPTCHA solver: A generative adversarial network based approach. Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. 2018. P. 332–348.
  10. Parfenov A.S., Sychev P.P. CAPTCHA recognition based on a generative adversarial network. Sistemnyj analiz v nauke i obrazovanii (System Analysis in Science and Education). 2020. № 2. P. 40–48. URL: http://sanse.ru/download/389 (In Russian).
  11. Li C. et al. End-to-end attack on text-based CAPTCHAs based on cycle-consistent generative adversarial network // Neurocomputing. 2021. V. 433. P. 223–236.
  12. Kushchuk D.O. et al. Using domain-adversarial training for text CAPTCHA recognition. Trudy Instituta sistemnogo programmirovaniya RAN (Proceedings of ISP RAS). 2020. V. 32. № 4 (In Russian).
  13. Stark F. et al. CAPTCHA recognition with active deep learning. Workshop new challenges in neural computation. 2015. V. 2015. P. 94.
  14. Shi B., Bai X., Yao C. An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE transactions on pattern analysis and machine intelligence. 2016. V. 39. № 11. P. 2298–2304.
  15. Hussain R., Gao H., Shaikh R.A. Segmentation of connected characters in text-based CAPTCHAs for intelligent character recognition. Multimedia Tools and Applications. 2017. V. 76. № 24. P. 25547–25561.
  16. Thobhani A. et al. CAPTCHA Recognition Using Deep Learning with Attached Binary Images. Electronics. 2020. V. 9. № 9. P. 1522.
  17. Han S., Mao H., Dally W.J. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv preprint arXiv:1510.00149. 2015.
  18. Frankle J., Carbin M. The lottery ticket hypothesis: Finding sparse, trainable neural networks. arXiv preprint arXiv:1803.03635. 2018.
  19. Hinton G., Vinyals O., Dean J. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531. 2015.
  20. Tang R. et al. Distilling task-specific knowledge from bert into simple neural networks. arXiv preprint arXiv:1903.12136. 2019.
  21. Graves A. et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks. Proceedings of the 23rd international conference on Machine learning. 2006. P. 369–376.
  22. Sanh V. et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108. 2019.
  23. Srivastava N. et al. Dropout: a simple way to prevent neural networks from overfitting. The journal of machine learning research. 2014. V. 15. № 1. P. 1929–1958.
  24. Goyal P. et al. Accurate, large minibatch sgd: Training imagenet in 1 hour. arXiv preprint arXiv:1706.02677. 2017.
  25. Hinton G., Srivastava N., Swersky K. Neural networks for machine learning, Lecture 6a: Overview of mini-batch gradient descent. 2012. P. 14.
  26. Zhong Z., Jin L., Xie Z. High performance offline handwritten chinese character recognition using googlenet and directional feature maps. 2015 13th International Conference on Document Analysis and Recognition (ICDAR). IEEE. 2015. P. 846–850.
  27. Glorot X., Bordes A., Bengio Y. Deep sparse rectifier neural networks. Proceedings of the fourteenth international conference on artificial intelligence and statistics. JMLR Workshop and Conference Proceedings. 2011. P. 315–323.
  28. Graves A., Schmidhuber J. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural networks. 2005. V. 18. № 5–6. P. 602–610.
  29. Sandler M. et al. Mobilenetv2: Inverted residuals and linear bottlenecks. Proceedings of the IEEE conference on computer vision and pattern recognition. 2018. P. 4510–4520.
  30. Ma N. et al. Shufflenet v2: Practical guidelines for efficient cnn architecture design. Proceedings of the European conference on computer vision (ECCV). 2018. P. 116–131.
  31. Ba L.J., Caruana R. Do deep nets really need to be deep? arXiv preprint arXiv:1312.6184. 2013.
  32. Watkins C.J.C.H., Dayan P. Q-learning. Machine learning. 1992. V. 8. № 3–4. P. 279–292.
  33. Van Hasselt H., Guez A., Silver D. Deep reinforcement learning with double q-learning. Proceedings of the AAAI Conference on Artificial Intelligence. 2016. V. 30. № 1.
Date of receipt: 3.08.2021
Approved after review: 17.08.2021
Accepted for publication: 27.08.2021