A.I. Vlasov – Ph.D. (Tech.), Associate Professor, Department "Design and Production Technology of Electronic Equipment" (IU4), Bauman Moscow State Technical University,
E-mail: vlasovai@bmstu.ru
S.Yu. Papulin – Ph.D. (Tech.), Associate Professor, Department "Computer Systems and Networks" (IU6), Bauman Moscow State Technical University,
E-mail: papulin@rambler.ru
The article discusses image analysis by means of the logical multiple histogram representation for a combination of features. Two ways of applying this representation are analyzed: a separate histogram for each feature, and a single histogram for all features combined. The first approach requires additional mathematical definitions that may complicate the implementation of the method; the data dimension then equals the sum of the sizes of the universal sets of all features, and if a set of regions is also taken into account, the full set of feature histograms is constructed for each region. The second approach uses the standard mathematical apparatus of the logical multiple histogram representation, but the data dimension grows, since it equals the product of the sizes of the universal sets of all features.
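To make the dimensionality difference concrete, the following minimal sketch (in Python with NumPy; the universal-set sizes, function names, and toy data are assumptions introduced here for illustration, not the authors' implementation) contrasts the two approaches for pixels that have already been quantized into discrete color and texture labels.

import numpy as np

N_COLOR = 8    # assumed size of the universal set of color labels
N_TEXTURE = 4  # assumed size of the universal set of texture labels

def separate_histograms(color_labels, texture_labels):
    # First approach: one normalized histogram per feature; the combined
    # descriptor is their concatenation, so its length is the SUM of the
    # universal-set sizes (8 + 4 = 12).
    h_color = np.bincount(color_labels, minlength=N_COLOR).astype(float)
    h_texture = np.bincount(texture_labels, minlength=N_TEXTURE).astype(float)
    return np.concatenate([h_color / h_color.sum(), h_texture / h_texture.sum()])

def joint_histogram(color_labels, texture_labels):
    # Second approach: a single normalized histogram over (color, texture)
    # pairs; its length is the PRODUCT of the universal-set sizes (8 * 4 = 32).
    pair_index = color_labels * N_TEXTURE + texture_labels
    h = np.bincount(pair_index, minlength=N_COLOR * N_TEXTURE).astype(float)
    return h / h.sum()

rng = np.random.default_rng(0)                  # toy data: 1000 labeled "pixels"
color = rng.integers(0, N_COLOR, 1000)
texture = rng.integers(0, N_TEXTURE, 1000)
print(separate_histograms(color, texture).shape)   # (12,)
print(joint_histogram(color, texture).shape)       # (32,)

If a set of regions is considered, either descriptor would simply be computed once per region, as noted above.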
As a further development of the logical multiple histogram representation, special attention should be paid to the possibility of using a spatial attribute for data consisting of multiple regions.
The research showed that the choice of approach to representing a combination of features is of paramount importance in the logical multiple histogram representation, since both the accuracy of the result and the technique for obtaining quantitative indicators of the presence of an elementary query depend on it. For simplicity of presentation, the analysis is given for images with two features (color and texture). However, the approach can be generalized to data determined by a combination of features, a case that has previously received little attention. For example, in textual documents one can distinguish thematic features in the form of term frequency vectors; for images these are primarily features of color, texture, shape, and position, both of individual regions and of the entire image, which can likewise be represented as a frequency vector, i.e. a histogram.
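As a hedged illustration of this common frequency-vector view (the vocabulary, palette size, and function names below are assumptions introduced here, not taken from the article), the sketch shows that a text document over a fixed vocabulary and an image region over a fixed color palette reduce to normalized histograms of the same mathematical form.

from collections import Counter
import numpy as np

def term_frequency_vector(tokens, vocabulary):
    # Normalized histogram of term occurrences over a fixed vocabulary.
    counts = Counter(tokens)
    vec = np.array([counts[t] for t in vocabulary], dtype=float)
    return vec / vec.sum() if vec.sum() else vec

def color_histogram(color_labels, n_colors):
    # Normalized histogram of quantized color labels for one image region.
    h = np.bincount(np.asarray(color_labels), minlength=n_colors).astype(float)
    return h / h.sum() if h.sum() else h

vocab = ["image", "histogram", "feature", "query"]
print(term_frequency_vector("histogram of feature histogram".split(), vocab))
print(color_histogram([0, 1, 1, 2, 2, 2], n_colors=4))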