Journal Nonlinear World, No. 3, 2025
Article in issue:
Document recognition in various conditions using machine learning methods
Type of article: scientific article
DOI: https://doi.org/10.18127/j20700970-202503-06
UDC: 004.93
Authors:

E.P. Dogadina1, U.Yu. Sukhanova2, M.A. Ishchenko3, D.I. Veselov4

1–4 Financial University under the Government of the Russian Federation (Moscow, Russia)
1 epdogadina@fa.ru, 2 uysukhanova@fa.ru, 3 maishchenko@fa.ru, 4 diveselov@fa.ru

Abstract:

Document recognition is the process of automatically analyzing, classifying, and extracting information from structured and unstructured materials. Modern machine learning and deep learning technologies can significantly improve the accuracy and speed of document recognition compared to traditional approaches. However, real-world use of such systems is often complicated by several factors: low scan quality, external conditions (lighting, viewing angle, paper deformation), and the use of specialized symbols. These difficulties call for more flexible and adaptive machine learning methods that can operate across a wide range of conditions.

Purpose: to create a system capable of automatically analyzing and recognizing documents in various conditions using a hybrid approach, namely a combination of YOLOv10 with text recognition tools such as Tesseract, KerasOCR, and EasyOCR.
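
The hybrid idea stated above (a detector locates regions, a recognizer reads each one) can be illustrated with a minimal sketch. It is an assumption about the general shape of such a pipeline, not the authors' implementation: `detect` and `recognize` are hypothetical callables standing in for a YOLOv10 model and an OCR engine (Tesseract, KerasOCR, or EasyOCR), and the image is modelled as a plain list of pixel rows so the example stays self-contained.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical types for illustration only.
Box = Tuple[int, int, int, int]      # (x1, y1, x2, y2), x2 and y2 exclusive
Image = Sequence[Sequence[int]]      # rows of grayscale pixel values


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the rectangular region described by `box` out of `image`."""
    x1, y1, x2, y2 = box
    return [list(row[x1:x2]) for row in image[y1:y2]]


def recognize_document(
    image: Image,
    detect: Callable[[Image], List[Box]],
    recognize: Callable[[List[List[int]]], str],
) -> List[Tuple[Box, str]]:
    """Run a two-stage pipeline: detect regions, then read each region.

    `detect` stands in for an object detector and `recognize` for an OCR
    engine; both are placeholders supplied by the caller.
    """
    # Approximate reading order: top to bottom, then left to right.
    boxes = sorted(detect(image), key=lambda b: (b[1], b[0]))
    return [(box, recognize(crop(image, box))) for box in boxes]
```

Passing the detector and recognizer as arguments keeps the two stages independent, so one OCR engine can be swapped for another without touching the detection step.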

The results showed that combining the YOLOv10 extra-large model with EasyOCR, Tesseract, and KerasOCR provides sufficient recognition accuracy on documents in various conditions. The high KerasOCR metrics are attributable to additional tuning of that model. Thus, using YOLOv10 in combination with modern text recognition tools makes it possible to build a universal system for analyzing documents in a wide range of conditions.
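
The abstract does not state which accuracy metric was used. As an assumption for illustration only, the sketch below computes the character error rate (CER), a common way to compare OCR output against a reference transcription: the edit distance between the two strings divided by the reference length. The function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning `a` into `b`."""
    prev = list(range(len(b) + 1))          # distances from "" to b[:j]
    for i, ca in enumerate(a, start=1):
        curr = [i]                          # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,                # deletion
                curr[j - 1] + 1,            # insertion
                prev[j - 1] + cost,         # substitution or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalized by the length of the reference text."""
    if not reference:
        # Convention chosen here: empty reference is perfect only if
        # the hypothesis is empty as well.
        return 0.0 if not hypothesis else 1.0
    return levenshtein(reference, hypothesis) / len(reference)
```

Lower values are better; a CER of 0.0 means the recognized text matches the reference exactly.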

The developed system will reduce the manual labor required to process documents, freeing employees for more complex and valuable tasks. In addition, machine learning and text recognition algorithms will reduce the likelihood of human error, increase the speed of data processing, and improve data accuracy and quality. The system has considerable potential in areas that require fast, high-quality document processing, including accounting and finance, the legal sphere, healthcare, government agencies, and logistics.

Pages: 45-53
For citation

Dogadina E.P., Sukhanova U.Yu., Ishchenko M.A., Veselov D.I. Document recognition in various conditions using machine learning methods. Nonlinear World. 2025. V. 23. № 3. P. 45–53. DOI: https://doi.org/10.18127/j20700970-202503-06
(In Russian)

Date of receipt: 04.06.2025
Approved after review: 16.06.2025
Accepted for publication: 30.06.2025