Journal Science Intensive Technologies, No. 11, 2009
Article in issue:
ROBUST METHODS OF PROCESSING OF A SPEECH SIGNAL IN FREQUENCY DOMAIN
Authors:
A.S. Kolokolov, V.M. Krol, A.Yu. Mestcherakov, I.A. Lubinski, V.P. Yachno
Abstract:
Methods for processing the speech signal spectrum are proposed, based on bandpass filtering of the logarithmic amplitude spectrum of the speech signal. The methods draw on current understanding of how acoustic stimuli are processed in the auditory analyzer, where the relevant mechanisms are implemented by lateral and delayed inhibition in the nervous system. The methods extract local cues from the logarithmic amplitude spectrum of speech: spectral peaks, sharp slopes of the spectrum along frequency, and abrupt changes of the spectrum in time. Applying them makes the time-frequency description of a speech signal more stable under changes of intensity and under frequency distortions, which typically arise from a change of microphone, reverberation due to reflections from the walls of a room, variations in the shape of the voice source pulses when the speaker's psycho-physiological state changes, and so on. In addition, bandpass filtering of the logarithmic spectrum along frequency with a filter that has an even impulse response improves robustness to background noise. This is possible because such processing selects the parts of the spectrum associated with vocal tract resonances, where the signal-to-noise ratio is usually highest. The effectiveness of the methods was demonstrated with digital implementations applied to natural speech signals. The results confirm that the methods are worth using in speech recognition systems to make them more robust to external acoustic factors and to the effect of changes in the speaker's psycho-physiological state on pronunciation.
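The filtering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' filter: the difference-of-Gaussians kernel, its parameters, the edge padding, and the function names `even_bandpass_kernel` and `filter_log_spectrum` are all assumptions chosen for illustration. The kernel is even and sums to zero, so filtering along the frequency axis cancels a constant offset in the logarithmic spectrum (a change of intensity) and, away from the edges, a linear spectral tilt (a smooth frequency distortion), while resonance-like peaks are kept.

```python
import numpy as np

def even_bandpass_kernel(half_width=12, sigma_center=1.5, sigma_surround=4.0):
    """Hypothetical lateral-inhibition kernel: narrow excitatory center
    minus a wider inhibitory surround (difference of Gaussians).
    The kernel is even (symmetric) and its taps sum to zero."""
    k = np.arange(-half_width, half_width + 1)
    center = np.exp(-0.5 * (k / sigma_center) ** 2)
    surround = np.exp(-0.5 * (k / sigma_surround) ** 2)
    # Normalize each lobe to unit sum so that the difference sums to zero.
    return center / center.sum() - surround / surround.sum()

def filter_log_spectrum(log_spectrum, kernel):
    """Convolve a log-amplitude spectrum with the kernel along frequency.
    Edge padding keeps the output the same length as the input."""
    half = len(kernel) // 2
    padded = np.pad(log_spectrum, half, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

# Synthetic log spectrum (in dB) with two resonance-like peaks (assumed data).
bins = np.arange(200)
log_spec = (30.0 * np.exp(-0.5 * ((bins - 40) / 3.0) ** 2)
            + 20.0 * np.exp(-0.5 * ((bins - 120) / 3.0) ** 2))

kernel = even_bandpass_kernel()
filtered = filter_log_spectrum(log_spec, kernel)

# Same spectrum after a 20 dB gain change and an added linear tilt.
distorted = filter_log_spectrum(log_spec + 20.0 + 0.1 * bins, kernel)

half = len(kernel) // 2
print("max deviation away from the edges:",
      np.max(np.abs(distorted[half:-half] - filtered[half:-half])))
print("strongest filtered peak at bin:", int(np.argmax(filtered)))
```

A zero-sum even kernel was chosen here because a constant and a linear term both vanish under such a convolution, which is the simplest way to reproduce the invariance described in the abstract. Any other even bandpass kernel with zero DC gain would behave similarly in this respect.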
Pages: 63-70