Development of a method for evaluating the accuracy of audio signal recognition using a neural network for large amounts of data

Journal: Dynamics of Complex Systems - XXI century, No. 2, 2020

DOI: 10.18127/j19997493-202002-07

UDC: 004.032.26

Keywords: neural networks; big data; quality metrics; Levenshtein distance; speech recognition; phonemes

Authors:

B.S. Goryachkin − Ph.D. (Eng.), Associate Professor,

Department of Information Processing and Management Systems, Bauman Moscow State Technical University

E-mail: bsgor@mail.ru

B.I. Bagaviev − Undergraduate Student,

Department of Information Processing and Management Systems, Bauman Moscow State Technical University

E-mail: buba1219@yandex.ru

Abstract:

The article addresses human speech recognition as a way to simplify entering information into a computer through voice input. The advantages and disadvantages of the developed speech recognition method are compared with those of conventional keyboard typing. A speech recognition algorithm that outputs the recognized data to a console or a text file is presented. The developed speech recognition module uses a neural network as its tool. The recognition procedure was evaluated using a standard metric developed during the research. An analysis of this metric for evaluating the quality of the converted data shows its effectiveness, especially for large data volumes. The developed module can be used both for entering data into a computer and for invoking operating system commands.
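
The abstract does not define the metric itself. As a rough illustration only, the sketch below shows one common way to compute an edit-distance-based quality score, assuming the Levenshtein distance named in the keywords is its basis. The function names `levenshtein` and `recognition_accuracy` and the normalization by the longer sequence are assumptions made for this example, not the authors' formulation.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (Wagner-Fischer, two-row variant)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into an empty sequence takes i deletions
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def recognition_accuracy(reference, hypothesis):
    """Hypothetical score in [0, 1]: 1 - distance / length of the longer sequence."""
    longest = max(len(reference), len(hypothesis))
    if longest == 0:
        return 1.0  # two empty sequences match trivially
    return 1.0 - levenshtein(reference, hypothesis) / longest


if __name__ == "__main__":
    # Character-level comparison of a reference transcript and a recognized one.
    print(recognition_accuracy("abcd", "abcf"))
    # Word-level comparison: pass lists of words (or phonemes) instead of strings.
    print(levenshtein("open the file".split(), "open a file".split()))
```

Because both functions accept any sequences, the same sketch can compare transcripts at the character, word, or phoneme level. Which level the article's metric uses is not stated in the abstract.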

Pages: 63-70

Date of receipt: May 5, 2020