A.V. Korennoj 1, S.M. Alshavva 2, D.S. Yudakov 3
1–3 Air Force Academy named after Professor N.E. Zhukovsky and Y.A. Gagarin (Voronezh, Russia)
1 korennoj@mail.ru, 2 yds12345@rambler.ru, 3 ashawasafwan7@gmail.com
Extracting informative features from the speech signal is one of the most important stages in building a voice recognition system. Most known feature extraction methods are computationally expensive, and their performance degrades at low signal-to-noise ratios; this reduces the accuracy with which subscriber voice models are formed and matched, and therefore the accuracy of the recognition system as a whole. The proposed method represents the spectral characteristics of the subscriber’s vocal tract filter (speech apparatus) by the approximation coefficients of the discrete wavelet transform of the logarithm of the speech signal spectrum. This allows the recognition system to operate effectively at low signal-to-noise ratios (down to 3 dB) with low computational requirements.
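The feature extraction idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar wavelet, the Hamming window, the 256-sample frame, the four decomposition levels, and the function names `haar_approximation` and `frame_features` are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np


def haar_approximation(x, levels):
    """Return Haar DWT approximation coefficients of x after `levels` steps.

    The Haar wavelet is an assumption; the abstract does not name a wavelet.
    """
    a = np.asarray(x, dtype=float)
    for _ in range(levels):
        if a.size % 2:
            # Pad odd-length input by repeating the last sample.
            a = np.append(a, a[-1])
        # Haar low-pass step: scaled sum of neighbouring pairs.
        a = (a[0::2] + a[1::2]) / np.sqrt(2.0)
    return a


def frame_features(frame, levels=4, eps=1e-10):
    """Approximation coefficients of the log-magnitude spectrum of one frame.

    `levels` and `eps` are hypothetical parameters, not values from the paper.
    """
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(frame.size)      # window choice is an assumption
    magnitude = np.abs(np.fft.rfft(windowed))
    log_spectrum = np.log(magnitude + eps)         # eps avoids log(0)
    # Keep the first N/2 bins so the length halves cleanly at each level.
    return haar_approximation(log_spectrum[: frame.size // 2], levels)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(256)               # synthetic stand-in for a speech frame
    print(frame_features(frame).shape)
```

Keeping only the approximation (low-pass) branch retains the smooth envelope of the log spectrum, which is the part associated with the vocal tract filter, and discards the fine detail; each level halves the number of coefficients, so the feature vector stays short.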
Korennoj A.V., Alshavva S.M., Yudakov D.S. Extraction of voice features of speech signal based on discrete wavelet transform. Achievements of modern radioelectronics. 2024. V. 78. № 10. P. 10–16. DOI: https://doi.org/10.18127/j20700784-202410-02 [in Russian]