Journal Neurocomputers, No. 4, 2016
Article in issue:
Data clustering based on self-organizing incremental neural networks and the Markov clustering algorithm
Keywords:
self-organizing incremental neural network (SOINN)
kernel density estimation
online unsupervised learning
topology learning
cluster analysis
data mining
Markov clustering algorithm (MCL)
normalized mutual information
F-score
Authors:
Yu.S. Fedorenko - Post-Graduate Student, Bauman Moscow State Technical University. E-mail: Fedyura1992@yandex.ru
Yu.E. Gapanyuk - Ph.D. (Eng.), Associate Professor, Bauman Moscow State Technical University. E-mail: gapyu@bmstu.ru
Abstract:
A new clustering technique based on self-organizing incremental neural networks (SOINN) and the Markov clustering algorithm (MCL) is presented. Existing clustering methods are analyzed, and their advantages and disadvantages are identified. The mathematical background of SOINN is considered, and the use of this type of neural network for density estimation is discussed. The basic SOINN architecture and its algorithm are described. The problems that arise when SOINN is used for data clustering are demonstrated. The choice of normalized mutual information and F-score for evaluating clustering quality is explained. Six clustering algorithms, including the proposed method, are compared experimentally on six test datasets using these two clustering quality measures. The results are analyzed, and they demonstrate the effectiveness of the proposed clustering technique.
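To make one of the two quality measures named in the abstract concrete, here is a minimal Python sketch of normalized mutual information between a reference labeling and a clustering result. The abstract does not say which normalization the article uses; this sketch assumes the arithmetic mean of the two entropies as the denominator, and the function names `entropy`, `mutual_information`, and `nmi` are illustrative, not taken from the article.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a list of cluster labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two labelings of the same objects."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (n_ij / n) * math.log(n * n_ij / (count_a[i] * count_b[j]))
        for (i, j), n_ij in joint.items()
    )


def nmi(a, b):
    """NMI with arithmetic-mean normalization (an assumption of this sketch)."""
    if len(a) != len(b):
        raise ValueError("labelings must have the same length")
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0 and h_b == 0:
        # Both labelings put everything in one cluster: treat as identical.
        return 1.0
    return mutual_information(a, b) / ((h_a + h_b) / 2)


truth = [0, 0, 1, 1]
same_up_to_renaming = [1, 1, 0, 0]
unrelated = [0, 1, 0, 1]
print(nmi(truth, same_up_to_renaming))
print(nmi(truth, unrelated))
```

The measure does not depend on how clusters are named, so a clustering that matches the reference up to a relabeling scores the same as an exact match, while a clustering that is statistically independent of the reference scores zero.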
Pages: 3-13