A.P. Ryzhkov - Ph.D. (Eng.), Academy of the Federal Security Service of Russia, Orel. E-mail: PanzerT35@yandex.ru
D.A. Novikov - Academy of the Federal Security Service of Russia, Orel. E-mail: PanzerT35@yandex.ru
In low-rate speech coding (< 16 kbit/s), vector quantization (VQ) plays an important role in speech codecs based on linear prediction. In speech coding, VQ is generally applied to two parameters of the speech synthesis model: the excitation signal vector and the vector of linear prediction coefficients (or other parameters that shape the model, for example, line spectral frequencies). Full-search VQ gives the best quality under the mean-square-error criterion, but its high computational complexity and memory requirements force the use of fast search algorithms at the cost of some loss in performance. An analysis of ways to reduce computational complexity and device memory shows the potential of cascaded (multistage) VQ and its variants, combined with new techniques for building (training) codebooks.
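The full-search baseline mentioned above can be illustrated with a minimal sketch. The codebook values and the names `squared_error` and `full_search` are hypothetical, chosen only for illustration; the sketch shows an exhaustive nearest-codeword search under the squared-error criterion, whose cost grows linearly with the codebook size.

```python
def squared_error(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def full_search(codebook, x):
    """Compare x with every codeword and return (index, codeword) of the nearest."""
    best = min(range(len(codebook)), key=lambda i: squared_error(codebook[i], x))
    return best, codebook[best]


# Hypothetical 2-D codebook with four codewords (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
index, codeword = full_search(codebook, (0.9, 0.2))
```

Only `index` would need to be transmitted; the decoder looks up the same codebook to recover `codeword`.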
It is proposed to solve the VQ problem with artificial neural networks. Among neural networks, radial basis function (RBF) networks are applicable, since any set of patterns placed randomly in a multidimensional space is known to be separable provided the dimensionality of that space is high enough. The main reason for choosing RBF networks over networks with a similar function (multilayer perceptrons, recurrent networks, self-organizing networks) is their high training speed, that is, the speed with which codebooks are created for vector quantization.
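As a rough illustration of the network type discussed above, the following sketch computes the output of a three-layer RBF network. The Gaussian form of the basis function and all names and values (`gaussian_rbf`, `rbf_output`, `sigma`) are assumptions made for the example; the text does not state which basis function was used.

```python
import math


def gaussian_rbf(x, center, sigma):
    """Gaussian basis function (an assumed choice): 1.0 at the center, decaying with distance."""
    d2 = sum((a - c) ** 2 for a, c in zip(x, center))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def rbf_output(x, centers, sigmas, weights, bias=0.0):
    """Hidden layer of radial units followed by a linear output layer."""
    hidden = [gaussian_rbf(x, c, s) for c, s in zip(centers, sigmas)]
    return bias + sum(w * h for w, h in zip(weights, hidden))


# Hypothetical network with two hidden units in a 2-D input space.
y = rbf_output((0.0, 0.0), [(0.0, 0.0), (3.0, 4.0)], [1.0, 1.0], [2.0, 1.0])
```

Each hidden unit is defined by a center, so a trained set of centers can play the role of a codebook of reference vectors.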
Analysis of the training methods considered for RBF neural networks, together with the simulation results, shows that a hybrid method works best; it combines tuning by self-organization with supervised learning.
Upon completion of the first stage on the training sample of line spectral frequency vectors, the number of hidden neurons in the three-layer RBF network was 145-157 over a series of repeated experiments. For the training samples of the linear prediction error signal, the number of hidden neurons in the three-layer RBF network was 1660-1690. When the codebooks for the line spectral frequencies and the excitation signal vectors are formed, a value of 0.005 is selected at the first stage, as needed for good statistical accuracy in the convergence phase. The resulting cell space only approximates the placement of the centroid reference vectors in the N-dimensional coordinate system, so a vector quantization procedure is needed as a fine-tuning mechanism. For the line spectral frequency codebooks, the initial value of the learning constant is set to 0.09. After several passes of the quantization procedure over the input data, the coordinates of the Voronoi reference vectors stop changing, which completes the construction of the Voronoi regions. It was found experimentally that a learning constant of 0.07 is suitable for forming the excitation signal vector codebooks.
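The fine-tuning step can be sketched as a winner-take-all update of the reference vectors that is repeated until they stop moving. This is a simplified illustration and not the authors' procedure: the halving schedule for the learning constant (`decay`), the stopping tolerance (`tol`) and the toy data are assumptions; only the initial value 0.09 comes from the text.

```python
def nearest(refs, x):
    """Index of the reference vector closest to x (squared Euclidean distance)."""
    return min(range(len(refs)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(refs[i], x)))


def tune_reference_vectors(data, refs, eta0=0.09, decay=0.5, tol=1e-6, max_passes=100):
    """Move the winning reference vector toward each input; stop when updates vanish."""
    refs = [list(r) for r in refs]
    eta = eta0
    for n_pass in range(1, max_passes + 1):
        shift = 0.0
        for x in data:
            w = refs[nearest(refs, x)]
            for k in range(len(w)):
                step = eta * (x[k] - w[k])
                w[k] += step
                shift = max(shift, abs(step))
        if shift < tol:      # reference vectors have effectively stopped changing
            break
        eta *= decay         # assumed decay schedule for the learning constant
    return refs, n_pass


# Hypothetical data with two clusters and two initial reference vectors.
data = [(0.0, 0.0), (0.2, 0.0), (1.0, 1.0), (0.8, 1.0)]
tuned, passes = tune_reference_vectors(data, [(0.5, 0.0), (0.5, 1.0)])
```

Once the reference vectors are fixed, the nearest-neighbor rule in `nearest` partitions the input space into Voronoi regions.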
The algorithm has the following features:
1) The cost function is convex with respect to the linear parameters (the output weights) but not with respect to the centers and the norm-weighting matrix.
2) Different learning-rate parameters can be used in the second phase.
3) Unlike the back-propagation algorithm, the supervised training phase, which is a gradient descent procedure for the RBF network, does not involve back-propagation of the error signal.
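A single gradient descent step of the supervised phase might look like the sketch below. It assumes Gaussian basis functions with a fixed common width `sigma` and the cost E = 0.5 * error**2; the separate learning rates `eta_w` and `eta_c`, their values and the function name are hypothetical. The output weights and the centers are each updated directly from the output error, with no layer-by-layer propagation of the error signal.

```python
import math


def train_step(x, target, centers, weights, sigma, eta_w=0.05, eta_c=0.01):
    """One gradient descent step on E = 0.5 * error**2; updates lists in place."""
    phis = [math.exp(-sum((a - c) ** 2 for a, c in zip(x, ctr)) / (2 * sigma ** 2))
            for ctr in centers]
    error = target - sum(w * p for w, p in zip(weights, phis))
    for i, (ctr, p) in enumerate(zip(centers, phis)):
        w_old = weights[i]
        # Linear parameter: the cost is convex in the weights.
        weights[i] += eta_w * error * p
        # Center: non-convex direction, updated with its own learning rate.
        for k in range(len(ctr)):
            ctr[k] += eta_c * error * w_old * p * (x[k] - ctr[k]) / sigma ** 2
    return error


# Hypothetical one-dimensional example with two hidden units.
centers = [[0.0], [1.0]]
weights = [0.0, 0.0]
first = train_step((0.5,), 1.0, centers, weights, sigma=1.0)
for _ in range(200):
    last = train_step((0.5,), 1.0, centers, weights, sigma=1.0)
```

Here `first` is the error before any update and `last` is the error after repeated steps on the same sample.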
The search uses a multistage hierarchical procedure, which speeds up the search compared with a tree search for the centroid vector. Multistage hierarchical VQ divides the overall search into a number of sub-operations, each of which requires only a small amount of computation. Each sub-operation processes the residual vector produced at the previous stage. The input vector is quantized by the first (ki) stage vector quantizer, and the quantization residual (error) is fed to the input of the second (kj) stage vector quantizer. The process can be repeated for any number of stages. The final quantized vector is obtained as the sum of the output vectors of the intermediate and final quantizers.
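The multistage scheme can be sketched as follows, assuming a plain nearest-codeword search at every stage. The two small codebooks and the function names are hypothetical and serve only to show how the residual is passed from stage to stage and how the decoder sums the selected codewords.

```python
def nearest(codebook, x):
    """Index of the codeword closest to x (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], x)))


def multistage_encode(x, stages):
    """Quantize x stage by stage; each stage quantizes the previous residual."""
    residual = list(x)
    indices = []
    for codebook in stages:
        i = nearest(codebook, residual)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices


def multistage_decode(indices, stages):
    """Reconstruct the vector as the sum of the selected codewords."""
    out = [0.0] * len(stages[0][0])
    for i, codebook in zip(indices, stages):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out


# Hypothetical two-stage codebooks: a coarse stage and a refinement stage.
stages = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(-0.25, 0.0), (0.25, 0.0), (0.0, -0.25), (0.0, 0.25)],
]
indices = multistage_encode((1.2, 0.9), stages)
decoded = multistage_decode(indices, stages)
```

With these toy sizes, two searches over 2 and 4 codewords replace one search over the 8 combined reconstruction vectors, which illustrates where the savings in computation and memory come from.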
The developed algorithms were tested on real speech signals of at least 15 minutes in duration for 13 speakers. The use of neural networks reduces the algorithmic delay by 20-25% compared with known solutions in this area. Verification of the developed algorithms against the requirements placed on them confirmed their compliance and the feasibility of further implementation on a modern component base, both in upgrading existing low-rate transceiver devices and in developing new ones for advanced speech signal processing and transmission systems.
Makkhoul D., Rukos S., Gish G. Vektornoe kvantovanie pri kodirovanii rechi // TIIEHR. 1985. T. 73. № 11. S. 19-61.
Osovskijj S. Nejjronnye seti dlja obrabotki informacii. M.: Finansy i statistika, 2002.
Cover T.M. Nearest neighbor pattern classification // IEEE Transactions on Information Theory. 1967. V. IT-13. R.
Cover T.M. Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition // IEEE Transactions on Electronic Computers. 1965. V. EC-14.