Journal Information-measuring and Control Systems, №2, 2017
Article in issue:
Weakly-sparse filters in image recognition
Keywords:
convolution
weakly-sparse filters
Gabor
neural networks
support vector machine
machine learning
digit recognition
Authors:
B.A. Knjazev - Research Engineer, Bauman Moscow State Technical University
E-mail: bknyazev@bmstu.ru
V.M. Chernenkiy - Dr.Sc. (Eng.), Head of Department of Information processing system and Management, Bauman Moscow State Technical University
E-mail: chernen@bmstu.ru
Abstract:
In this article, we consider the problem of classifying images of handwritten digits. We use the MNIST dataset of 70 thousand images. The objective of this study is to create a mathematical model that reaches the accuracy of convolutional neural networks at a low computational cost. Computational cost is measured as the number of model parameters, which depends on the number of filters and their size. For this purpose, we introduce a new type of filter, which we call weakly-sparse filters. They differ from other filters in that they satisfy certain constraints that we impose on their spatial and frequency characteristics. We propose an algorithm that forms weakly-sparse filters automatically from training images. The algorithm builds on the previously introduced recursive autoconvolution operator, clustering methods such as k-means, and a heuristic selection procedure based on trial and error.
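As a rough illustration of the clustering step only, the sketch below learns a small filter bank by running k-means on normalized image patches and then counts the resulting parameters as number of filters times filter area. It is not the authors' algorithm: the recursive autoconvolution operator and the heuristic selection procedure are not specified in this abstract and are left out. The 7x7 filter size, the patch count, the random stand-in images, and all function names are assumptions made for the example; only the figure of 25 filters comes from the abstract.

```python
import numpy as np


def extract_patches(images, size, n_patches, rng):
    """Sample random size x size patches from a stack of images of shape (N, H, W)."""
    n, h, w = images.shape
    idx = rng.integers(0, n, n_patches)
    ys = rng.integers(0, h - size + 1, n_patches)
    xs = rng.integers(0, w - size + 1, n_patches)
    patches = np.stack(
        [images[i, y:y + size, x:x + size] for i, y, x in zip(idx, ys, xs)]
    )
    return patches.reshape(n_patches, -1)


def normalize(patches, eps=1e-8):
    """Subtract each patch's mean and scale it to unit Euclidean norm."""
    patches = patches - patches.mean(axis=1, keepdims=True)
    return patches / (np.linalg.norm(patches, axis=1, keepdims=True) + eps)


def kmeans_filters(patches, n_filters, n_iter=10, seed=0):
    """Plain Lloyd's k-means; the centroids serve as the learned filters."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(patches), n_filters, replace=False)
    centroids = patches[start].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every patch to every centroid.
        dist = (
            (patches ** 2).sum(axis=1)[:, None]
            - 2.0 * patches @ centroids.T
            + (centroids ** 2).sum(axis=1)[None, :]
        )
        labels = dist.argmin(axis=1)
        for k in range(n_filters):
            members = patches[labels == k]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[k] = members.mean(axis=0)
    return centroids


rng = np.random.default_rng(0)
images = rng.random((100, 28, 28))  # random stand-in for 28x28 MNIST digits
filter_size = 7                     # assumed; the abstract gives no filter size
n_filters = 25                      # one of the filter counts reported in the abstract

patches = normalize(extract_patches(images, filter_size, 2000, rng))
filters = kmeans_filters(patches, n_filters).reshape(n_filters, filter_size, filter_size)

# Cost as defined in the abstract: it grows with filter count and filter size.
n_params = n_filters * filter_size * filter_size
```

With these assumed settings the filter bank has 25 x 7 x 7 = 1225 parameters; the count scales linearly with the number of filters and quadratically with the filter side length.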
Our image processing and classification model is based on convolutional networks trained in an unsupervised way. Features obtained with our single-layer architecture are projected using principal component analysis. To solve the supervised task, a support vector machine with a nonlinear kernel is used.
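The first two stages of this pipeline can be sketched as follows: a single layer of filter responses, followed by a principal component projection. This is a minimal illustration under stated assumptions, not the published architecture. The rectification, the 4x4 max pooling, the filter size, the number of retained components, and the function names are all hypothetical choices; random arrays stand in for MNIST images and for learned filters. The final classification step with a nonlinear-kernel support vector machine is left out so that the example depends only on NumPy.

```python
import numpy as np


def conv_features(images, filters, pool=4):
    """Single-layer features: filter responses, rectification, max pooling.

    images: (N, H, W), filters: (K, fh, fw). The responses are computed as a
    'valid' cross-correlation (filters are not flipped).
    """
    n = images.shape[0]
    k, fh, fw = filters.shape
    # All fh x fw windows of every image: shape (N, H-fh+1, W-fw+1, fh, fw).
    windows = np.lib.stride_tricks.sliding_window_view(images, (fh, fw), axis=(1, 2))
    maps = np.einsum("nijuv,kuv->nkij", windows, filters)
    maps = np.maximum(maps, 0.0)  # rectification (assumed)
    oh, ow = maps.shape[2] // pool, maps.shape[3] // pool
    maps = maps[:, :, :oh * pool, :ow * pool]  # crop so pooling tiles evenly
    pooled = maps.reshape(n, k, oh, pool, ow, pool).max(axis=(3, 5))
    return pooled.reshape(n, -1)


def pca_fit(features, n_components):
    """Return the training mean and the top principal directions (via SVD)."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(features, mean, components):
    """Project centered features onto the principal directions."""
    return (features - mean) @ components.T


rng = np.random.default_rng(0)
images = rng.random((50, 28, 28))          # random stand-in for MNIST digits
filters = rng.standard_normal((8, 7, 7))   # random stand-in for learned filters

features = conv_features(images, filters)  # (50, 8 * 5 * 5)
mean, components = pca_fit(features, n_components=10)
projected = pca_transform(features, mean, components)  # (50, 10)
```

The projected features would then be the input to the supervised classifier. The PCA mean and components are fitted on training features only and reused unchanged for test features.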
In the experimental part, the proposed model is evaluated on the MNIST dataset. The model reaches high classification accuracy with very few filters. This is possible because the weakly-sparse filters introduced in this work have complex shapes and rich spatial and frequency properties, which allow them to detect complex features in images. An error of 0.44% is achieved with 25 filters, 0.36% with 200 filters, and 0.32% with a combination of two models with 200 filters. According to our analysis, most of the filters in these models are weakly-sparse. This confirms the effectiveness of the model compared to convolutional neural networks, and the objective of this work is reached.
Pages: 49-56