A.V. Korennoj1, D.S. Yudakov2, A.P. Chernyshov3, S. Alshavva4
1–4 Air Force Academy named after Professor N.E. Zhukovsky and Y.A. Gagarin (Voronezh, Russia)
1 korennoj@mail.ru, 2 yds12345@rambler.ru, 3 cherntol19@yandex.ru, 4 ashawasafwan7@gmail.com
One of the most important preprocessing stages in automatic subscriber identification over radio networks is segmenting the speech signal into sections that contain speech and sections that contain pauses. Most segmentation algorithms work only on clean signals or at very high signal-to-noise ratios. The paper proposes an algorithm for detecting pauses in speech signals that exploits the differences between the correlation (energy) properties of the speech signal and those of the noise. The algorithm measures the values of the autocorrelation function of the input signal in real time, analyzes these values, and decides whether a pause is present or absent. An expression is obtained for the threshold value at which a guaranteed correct detection decision is possible. The dependence between the minimum signal-to-noise ratio and the length of the averaging interval is determined. The proposed algorithm segments the speech signal into speech/pause sections at low signal-to-noise ratios, which is confirmed by experimental studies.
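The measure, analyze, decide loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the frame length, the use of the normalized lag-1 autocorrelation, the fixed threshold of 0.3, and the synthetic test signal (a 200 Hz tone standing in for speech, white Gaussian noise standing in for a pause) are all assumptions chosen for illustration. In particular, the threshold here is a hand-picked constant, not the expression derived in the paper.

```python
import math
import random


def lag_autocorr(frame, lag=1):
    """Normalized autocorrelation of one frame at the given lag."""
    n = len(frame)
    mean = sum(frame) / n
    centered = [x - mean for x in frame]
    energy = sum(x * x for x in centered)
    if energy == 0.0:
        return 0.0  # constant frame: no correlation structure to measure
    cross = sum(centered[i] * centered[i + lag] for i in range(n - lag))
    return cross / energy


def segment_pauses(signal, frame_len=400, lag=1, threshold=0.3):
    """Return one flag per non-overlapping frame: True = pause, False = speech.

    frame_len plays the role of the averaging interval; threshold is a
    hypothetical constant, not the value derived in the paper.
    """
    flags = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        # White noise is nearly uncorrelated between neighboring samples,
        # while a speech-like signal is strongly correlated.
        flags.append(lag_autocorr(frame, lag) < threshold)
    return flags


def demo():
    """Pause, speech-like tone in noise, pause (assumed 8 kHz sampling)."""
    rng = random.Random(0)
    fs = 8000
    pause_a = [rng.gauss(0.0, 1.0) for _ in range(1600)]
    speech = [2.0 * math.sin(2 * math.pi * 200 * t / fs) + rng.gauss(0.0, 1.0)
              for t in range(1600)]
    pause_b = [rng.gauss(0.0, 1.0) for _ in range(1600)]
    return segment_pauses(pause_a + speech + pause_b)
```

Calling `demo()` yields one flag per 400-sample frame, so the twelve flags cover four pause frames, four speech frames, and four pause frames. A longer `frame_len` reduces the variance of the autocorrelation estimate and so tolerates a lower signal-to-noise ratio, at the cost of coarser time resolution of the speech/pause boundaries.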
Korennoj A.V., Yudakov D.S., Chernyshov A.P., Alshavva S. Noise-resistant algorithm of speech signal segmentation in subscriber identification systems. Electromagnetic waves and electronic systems. 2023. V. 28. № 5. P. 15–23. DOI: https://doi.org/10.18127/j15604128-202305-02 (in Russian)