N.A. Mironov – Lecturer, Department of Mathematics, Informatics and Information Technologies, Nizhniy Novgorod Academy of the Ministry of Interior of Russia
The task of separating a voice message from interference is currently solved using microphone arrays. Models of this task often impose a large number of restrictions that do not correspond to real conditions: for example, sound reverberation in the room is not taken into account, and neither the detection of "quiet" signals nor the extraction of a moving speaker's voice is considered. This article investigates how robustly a spatial processing algorithm for speech signals, based on a microphone array, operates in real conditions. The algorithm introduces time delays that depend on spatial coordinates into each signal processing channel and then determines the optimal weight vector of the microphone array. The microphone array consists of 20 omnidirectional microphones placed at a height of 170 centimeters. The experiment is conducted in a closed room with an area of 24 square meters. Using the proposed algorithm, one voice message is extracted from a mixture of the voices of eight different people; audio recordings of the voices of different people serve as the voice messages. To study the effect of reverberation on the signal-to-noise ratio (SNR), a simplified model is proposed: the microphone signal is represented as a superposition of copies of a single microphone's signal, shifted in time and reduced in amplitude, with no additional effects. It is established that the spatial processing algorithm for voice messages is robust to sound reverberation. The minimum level of the useful signal relative to the level of the interfering sources that still maintains a satisfactory level of intelligibility (87%) has been determined: -20.5 dB. This indicates that a "quiet" voice can be singled out against the background of a loud conversation.
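The two ingredients described above, coordinate-dependent channel delays and the simplified reverberation model, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the speed of sound (343 m/s), the rounding of delays to whole samples, and the uniform weights are all assumptions made for illustration; in particular, the uniform weights are only a placeholder for the optimal weight vector, whose computation the abstract does not specify.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def steering_delays(mic_positions, source_position, fs):
    """Return integer sample delays that time-align every channel to the source."""
    distances = [math.dist(m, source_position) for m in mic_positions]
    farthest = max(distances)
    # Closer microphones hear the source earlier, so they get the larger delay.
    return [round((farthest - d) / SPEED_OF_SOUND * fs) for d in distances]


def delay_and_sum(channels, delays, weights=None):
    """Shift each channel by its delay and return the weighted sum."""
    n_mics = len(channels)
    n_samples = len(channels[0])
    if weights is None:
        # Uniform weights are a placeholder, not the optimal weight vector.
        weights = [1.0 / n_mics] * n_mics
    out = [0.0] * n_samples
    for channel, delay, weight in zip(channels, delays, weights):
        for n in range(delay, n_samples):
            out[n] += weight * channel[n - delay]
    return out


def add_reflections(signal, reflections):
    """Simplified reverberation: the signal plus delayed, attenuated copies of itself.

    reflections is a list of (delay_in_samples, gain) pairs; both values
    are hypothetical parameters chosen by the caller.
    """
    out = [float(x) for x in signal]
    for delay, gain in reflections:
        for n in range(delay, len(signal)):
            out[n] += gain * signal[n - delay]
    return out
```

With the delays applied, the contributions of the selected source add coherently across channels, while sources at other coordinates remain misaligned. For scale, a level difference of -20.5 dB corresponds to an amplitude ratio of 10^(-20.5/20), which is approximately 0.094.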
Using information about the speaker's trajectory obtained from independent positioning channels, it was possible to isolate the voice of the moving source from the interference. When one speaker moved in a straight line among seven stationary sources of interference, the verbal intelligibility of the extracted message was 93.23%.
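One way to picture the moving-speaker case is to recompute the channel delays frame by frame along the known trajectory. The sketch below assumes a straight-line path traversed at constant speed, a fixed frame length, and delays rounded to whole samples. The function names and parameters are hypothetical, and in the study the trajectory comes from independent positioning channels rather than from a formula.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def straight_line_position(start, end, t, duration):
    """Linearly interpolate the speaker position at time t (seconds)."""
    a = min(max(t / duration, 0.0), 1.0)  # fraction of the path covered
    return tuple(s + a * (e - s) for s, e in zip(start, end))


def frame_delays(mic_positions, start, end, duration, fs, frame_len):
    """Return one list of per-microphone sample delays for each frame."""
    n_frames = math.ceil(duration * fs / frame_len)
    table = []
    for k in range(n_frames):
        t = (k + 0.5) * frame_len / fs  # time at the frame centre, seconds
        pos = straight_line_position(start, end, t, duration)
        distances = [math.dist(m, pos) for m in mic_positions]
        farthest = max(distances)
        # Closer microphones receive the larger compensating delay.
        table.append(
            [round((farthest - d) / SPEED_OF_SOUND * fs) for d in distances]
        )
    return table
```

Each row of the returned table can then be used to align the corresponding frame of the multichannel recording before the weighted sum is formed.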