Real-time voice conversion using artificial neural network with rectified linear units


I. S. Azarov – Ph.D. (Eng.), Belarusian State University of Informatics and Radioelectronics.
M. I. Vashkevich – Assistant, Belarusian State University of Informatics and Radioelectronics.
A. A. Petrovsky – Dr.Sc. (Eng.), Professor, Belarusian State University of Informatics and Radioelectronics.

The paper presents a voice conversion technique that can be used in real-time applications. The technique is based on transforming short-time spectral envelopes of speech using an artificial neural network with rectified linear units. A special network configuration is used that takes into account temporary speaker states. Speech is represented by the instantaneous parameters of the harmonic + noise model. The proposed voice conversion technique is compared with the main alternative techniques using objective and subjective measures.

May 29, 2020

