Journal Biomedical Radioelectronics № 2, 2025
Article in issue:
«Signal/pause» segmentation technology based on the analysis of the mixing level of speech signal fragments
Type of article: scientific article
DOI: https://doi.org/10.18127/j15604136-202502-06
UDC: 615.47:616-072.7
Authors:

A.K. Alimuradov1, A.Y. Tychkov2, O.S. Simakov3, A.A. Mamonova4, Z.M. Yuldashev5, J.A. Temirova6

1–4 Penza State University (Penza, Russia)
5 St. Petersburg State Electrotechnical University "LETI" (Saint Petersburg, Russia)
6 St. Petersburg State Pediatric Medical University (Saint Petersburg, Russia)
1 tychkov-a@mail.ru, 2 alansapfir@yandex.ru, 3 zcsio@mail.ru, 4 mamonova.02@yandex.ru, 5 yuld@mail.ru, 6 temirova.2013@list.ru

Abstract:

«Signal/pause» segmentation is a key task in speech signal processing: determining the exact boundaries between speech and pauses. Background noise significantly complicates this process, since it can distort the true boundaries of speech segments and pauses. A segmentation technology is therefore needed that detects speech segments reliably in the presence of background noise.

Work purpose – development and study of a «signal/pause» segmentation technology that effectively differentiates the mixing levels of speech signal fragments and reliably detects the boundaries of speech segments and pauses.
The study of the technology demonstrated a high level of reliability in determining the boundaries of speech and pauses. The best segmentation results, with errors of 1.8% and 0.9%, are achieved when the mixing level of each fragment is compared against the average mixing level of all fragments and against the median of the first 20 fragments, which correspond to the initial pause containing only background noise.
The proposed «signal/pause» segmentation technology has clear practical value, since its application significantly increases real-time reliability and reduces computational load. This is especially important for speech applications that support human-computer interaction via voice interfaces.
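A minimal sketch of the comparison scheme described above is given below, for illustration only. The abstract does not define how the mixing level of a fragment is computed, so short-time energy is used here as a hypothetical stand-in for that feature; the fragment length, the scale factor k, and all function names are assumptions of this sketch, not the authors' method.

```python
# Minimal sketch: threshold-based «signal/pause» segmentation.
# ASSUMPTIONS: the "mixing level" feature is not defined in the abstract,
# so short-time energy is used as a hypothetical stand-in; fragment_len,
# the scale factor k, and all names here are illustrative.
import numpy as np

def segment_signal_pause(signal: np.ndarray,
                         fragment_len: int = 256,
                         n_noise_fragments: int = 20,
                         k: float = 2.0) -> np.ndarray:
    """Return one boolean per fragment: True = speech, False = pause."""
    n = len(signal) // fragment_len
    fragments = signal[:n * fragment_len].reshape(n, fragment_len)

    # Per-fragment feature (stand-in for the fragment "mixing level").
    feature = np.mean(fragments ** 2, axis=1)

    # Threshold from the median of the first 20 fragments, which are
    # assumed to contain only the initial pause with background noise.
    threshold = k * np.median(feature[:n_noise_fragments])

    return feature > threshold

# Usage: 1 s of quiet noise followed by 1 s of louder "speech" at 8 kHz.
rng = np.random.default_rng(0)
x = np.concatenate([0.01 * rng.standard_normal(8000),
                    0.5 * rng.standard_normal(8000)])
mask = segment_signal_pause(x)
print(f"speech fragments: {mask.sum()} of {mask.size}")
```

Deriving the threshold from the initial pause ties it to the recording's own background noise, which mirrors the best-performing variant reported in the abstract (0.9% error against the median of the first 20 fragments).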

Pages: 38–43
For citation:

Alimuradov A.K., Tychkov A.Y., Simakova O.S., Mamonova A.A., Yuldashev Z.M., Temirova D.A. «Signal/pause» segmentation technology based on the analysis of the mixing level of speech signal fragments. Biomedicine Radioengineering. 2025. V. 28. № 2. P. 38–43. DOI: https://doi.org/10.18127/j15604136-202502-06 (In Russian)

Date of receipt: 17.01.2025
Approved after review: 24.02.2025
Accepted for publication: 06.03.2025