Radiotekhnika
Publishing house Radiotekhnika

Application of polygaussian probabilistic models for statistical analysis of multidimensional data for solving the problem of classification of marine objects using low frequency hydroacoustic noise-detection signals

DOI 10.18127/j19998554-201809-02

O.A. Andreev – Senior Research Scientist, JSC «SRI «Atoll»; Assistant Lecturer, Institute of System Analysis and Management, State University Dubna
E-mail: wert_of_mor@mail.ru
A.T. Trofimov – Chief Research Scientist, JSC «SRI «Atoll»; Dr. Sc. (Eng.), Professor, Institute of System Analysis and Management, State University Dubna
E-mail: att44@mail.ru


Estimating the probability distribution law, or the probability density function (PDF), is one of the main tasks of data analysis. An important outcome of the analysis is the ability to represent the obtained estimates visually. Obtaining and representing such estimates is difficult for high-dimensional data, such as high-definition video streams, time series, or signal spectra. To address this problem, we propose using polygaussian probabilistic models, also known as Gaussian mixtures. We use a hierarchical polygaussian PDF (PPDF) model, which allows arbitrary multidimensional, heterogeneous, non-Gaussian data to be analyzed. At each level of the hierarchy, the data are divided or combined into parts that are homogeneous in a certain sense, each described by its own Gaussian mixture. Each higher level combines the levels below it by obtaining sufficient statistics for them. At the lowest level of the hierarchy, the data themselves play the role of sufficient statistics (trivial sufficient statistics). The values obtained while estimating the parameters of the proposed model can be used to estimate the PDF, visualize the data, and solve various other problems. The visual representation maps the Gaussian densities that make up the PPDF into the space of two selected principal components of the analyzed data. Each Gaussian density is displayed as an ellipse, whose position and orientation are determined by the mean and covariance matrix of the density, together with a column placed at the center of the ellipse that indicates the a priori probability. A Gaussian density can also be visualized as a data cluster that is homogeneous in the sense that the analyzed data are close to realizations of the random Gaussian process described by that density.
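As an illustration of the ellipse-and-column display described above, the following minimal sketch computes the drawing parameters for one Gaussian component whose mean and 2x2 covariance matrix are already expressed in the plane of the two selected principal components. The function name gaussian_ellipse, the n_std scaling factor, and the sample numbers are assumptions made for this example; the abstract does not specify how the authors scale or draw the ellipses.

```python
import math


def gaussian_ellipse(mean, cov, weight, n_std=1.0):
    """Return drawing parameters for one mixture component (hypothetical helper).

    mean   : (x, y) center in the principal-component plane
    cov    : symmetric 2x2 covariance [[a, b], [b, c]]
    weight : a priori probability of the component (height of the column)
    n_std  : assumed scale factor, in standard deviations along each axis
    """
    (a, b), (_, c) = cov
    half_trace = 0.5 * (a + c)
    spread = math.hypot(0.5 * (a - c), b)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    lam_major = half_trace + spread
    lam_minor = max(half_trace - spread, 0.0)  # guard against rounding below zero
    # Orientation of the major axis relative to the first principal component.
    angle = 0.5 * math.atan2(2.0 * b, a - c)
    return {
        "center": tuple(mean),
        "semi_major": n_std * math.sqrt(lam_major),
        "semi_minor": n_std * math.sqrt(lam_minor),
        "angle_deg": math.degrees(angle),
        "column_height": weight,
    }


# Illustrative component; the numbers are invented for the example.
ellipse = gaussian_ellipse((0.0, 0.0), [[2.0, 1.0], [1.0, 2.0]], weight=0.3)
print(ellipse)
```

The returned values can be passed to any plotting library; the closed-form eigen-decomposition is used here only to keep the sketch free of external dependencies.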
The proposed hierarchical PPDF model and the visualization approach were applied to the classification of marine objects from experimentally obtained low-frequency noise-detection signals, represented by the energy spectra of their realizations. Each spectrum contains more than 500 samples. The frequency range was divided into segments, and the parameters of the hierarchical PPDF were estimated on these segments for the marine objects of each class. To form the second level of the hierarchy, the a posteriori probabilities that the data belong to the original classes were computed on each segment, so that the input to the second level can be interpreted as a vector of intermediate classification decisions. One result of estimating and visualizing the parameters of the hierarchical PPDF is the conclusion that the probability distribution of the original data is strongly non-Gaussian. From the estimates of the hierarchical PPDF, a Bayesian neural network was synthesized; its probability of correctly classifying marine objects on experimental test data exceeds 0.95. The network was synthesized and modeled in MATLAB using the Neural Network Toolbox and exported to Simulink for implementation on the target computing device.
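The two-level scheme described above can be sketched as follows. This is not the authors' MATLAB implementation: it is a toy Python illustration that assumes diagonal covariances, already-estimated mixture parameters, and one feature per segment. All names (gaussian_pdf, mixture_pdf, class_posteriors, first_level, classify) and all numeric parameters are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a Gaussian with diagonal covariance at vector x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p -= 0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)


def mixture_pdf(x, mixture):
    """Gaussian mixture density; mixture is a list of (weight, mean, var)."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in mixture)


def class_posteriors(x, class_mixtures, priors):
    """Bayes rule: P(class | x) from class-conditional mixture densities."""
    joint = [p * mixture_pdf(x, mix) for p, mix in zip(priors, class_mixtures)]
    total = sum(joint)  # assumed non-zero for this toy example
    return [j / total for j in joint]


def first_level(segments, segment_models, priors):
    """Concatenate per-segment posteriors into one intermediate-decision vector."""
    z = []
    for x, models in zip(segments, segment_models):
        z.extend(class_posteriors(x, models, priors))
    return z


def classify(segments, segment_models, second_level_models, priors):
    """Second level: apply Bayes rule to the vector of first-level posteriors."""
    z = first_level(segments, segment_models, priors)
    post = class_posteriors(z, second_level_models, priors)
    label = max(range(len(post)), key=post.__getitem__)
    return label, post


# Toy setup: two classes, two frequency segments, one feature per segment.
priors = [0.5, 0.5]
one_segment = [[(1.0, [0.0], [1.0])],   # class 0 mixture (single component)
               [(1.0, [3.0], [1.0])]]   # class 1 mixture (single component)
segment_models = [one_segment, one_segment]
second_level_models = [
    [(1.0, [0.9, 0.1, 0.9, 0.1], [0.05] * 4)],  # class 0
    [(1.0, [0.1, 0.9, 0.1, 0.9], [0.05] * 4)],  # class 1
]

label, post = classify([[0.2], [-0.1]], segment_models,
                       second_level_models, priors)
print(label, post)
```

For spectra with several hundred samples, the densities would normally be evaluated in the log domain to avoid underflow; the direct form is kept here for readability.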

