Noise-resistant algorithm of speech signal segmentation in subscriber identification systems

Journal: Electromagnetic Waves and Electronic Systems, № 5, 2023

Type of article: scientific article

DOI: https://doi.org/10.18127/j15604128-202305-02

UDC: 004.093:57.087.1

Keywords: segmentation of the speech signal, speech/pause sections, correlation function, spectral density, variance, detection threshold

Authors:

A.V. Korennoj1, D.S. Yudakov2, A.P. Chernyshov3, S. Alshavva4

1–4 Air Force Academy named after Professor N.E. Zhukovsky and Y.A. Gagarin (Voronezh, Russia)

1 korennoj@mail.ru, 2 yds12345@rambler.ru, 3 cherntol19@yandex.ru, 4 ashawasafwan7@gmail.com

Abstract:

One of the most important preprocessing stages in automatic subscriber identification over radio networks is segmenting the speech signal into sections that contain speech and sections that contain pauses. Most segmentation algorithms assume a clean signal or a very high signal-to-noise ratio. This paper proposes an algorithm for detecting pauses in speech signals that exploits the differences between the correlation (energy) properties of the speech signal and those of noise. The algorithm measures the values of the autocorrelation function of the input signal in current time, analyzes these values, and decides whether a pause is present or absent. An expression is obtained for the threshold value at which a guaranteed correct detection decision is possible, and the dependence between the minimum signal-to-noise ratio and the length of the averaging interval is determined. Experimental studies confirm that the proposed algorithm segments the speech signal into speech/pause sections at low signal-to-noise ratios.
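The general idea of separating speech from pauses by correlation properties can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the threshold expression, so the function names, the frame length (standing in for the averaging interval), the lag, and the threshold below are all hypothetical choices. The sketch also assumes the noise is approximately white, so that its normalized autocorrelation at a nonzero lag stays near zero, while a speech-like signal remains strongly correlated.

```python
import math
import random


def normalized_autocorr(frame, lag):
    """Normalized autocorrelation estimate of one frame at a single lag."""
    n = len(frame)
    mean = sum(frame) / n
    centered = [x - mean for x in frame]
    energy = sum(x * x for x in centered)
    if energy == 0.0:
        return 0.0  # a constant frame carries no correlation information
    cross = sum(centered[i] * centered[i + lag] for i in range(n - lag))
    return cross / energy


def segment_speech_pause(signal, frame_len=400, lag=4, threshold=0.2):
    """Label each non-overlapping frame True (speech) or False (pause).

    frame_len, lag and threshold are illustrative assumptions,
    not values taken from the paper.
    """
    labels = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        labels.append(abs(normalized_autocorr(frame, lag)) > threshold)
    return labels


if __name__ == "__main__":
    rng = random.Random(0)
    fs = 8000  # assumed sampling rate, Hz
    total = 3 * 400
    # White Gaussian noise with unit variance over three frames.
    noise = [rng.gauss(0.0, 1.0) for _ in range(total)]
    # Toy stand-in for speech: a 200 Hz tone of unit power in the middle
    # frame only, i.e. roughly 0 dB signal-to-noise ratio in that frame.
    tone = [
        math.sqrt(2.0) * math.sin(2 * math.pi * 200 * n / fs)
        if 400 <= n < 800 else 0.0
        for n in range(total)
    ]
    noisy = [a + b for a, b in zip(noise, tone)]
    print(segment_speech_pause(noisy))
```

In this sketch a longer frame reduces the scatter of the autocorrelation estimate and so allows a lower threshold at the cost of coarser time resolution. This is only a loose analogue of the relation between the averaging interval and the minimum signal-to-noise ratio that the abstract mentions.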

Pages: 15-23

Korennoj A.V., Yudakov D.S., Chernyshov A.P., Alshavva S. Noise-resistant algorithm of speech signal segmentation in subscriber identification systems. Electromagnetic waves and electronic systems. 2023. V. 28. № 5. P. 15−23. DOI: https://doi.org/10.18127/j15604128-202305-02 (in Russian)

Date of receipt: 09.08.2023

Approved after review: 31.08.2023

Accepted for publication: 26.09.2023