O.I. Garanin – Post-graduate Student, National Research University “MPEI” (Moscow)
This article analyzes existing methods for selecting the receptive field of a convolutional neural network (CNN) in tasks such as semantic segmentation and object detection.
These methods have two limitations: the threshold is difficult to select, and training the CNN requires a large number of noisy images.
We propose a method for selecting the receptive field of a CNN that differs from existing ones in two ways: it reduces the number of noisy images required when training the CNN, and its step of including pixels of the original image in the receptive field selects the threshold for inclusion adaptively.
The proposed method consists of the following steps:
Step 1. For each CNN layer, select the 20 images from the test dataset that produce the highest activation of the neuron in that layer.
Step 2. Build a discrepancy map for each of the 20 images.
Step 3. Include image pixels in the empirical receptive field of the CNN neuron and calculate the size of this field for each of the 20 images.
Step 4. Calculate the size of the empirical receptive field of the layer for each CNN layer.
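The four steps above can be illustrated with a minimal sketch. It is not the implementation from the article. Several details are assumptions made only for illustration: the neuron is a hypothetical toy function rather than a real CNN unit, the discrepancy map is built by overwriting small patches with uniform noise, the adaptive threshold is taken as the mean plus one standard deviation of the map, the field size is the longer side of the bounding box of the included pixels, and the layer-level size is the average over the selected images. All function names are hypothetical.

```python
import numpy as np

def top_k_images(images, neuron, k=20):
    """Step 1: keep the k images with the highest activation of the neuron."""
    activations = np.array([neuron(img) for img in images])
    order = np.argsort(activations)[::-1][:k]
    return [images[i] for i in order]

def discrepancy_map(image, neuron, patch=3, stride=1, rng=None):
    """Step 2: average change in activation when patches are replaced by noise."""
    rng = np.random.default_rng(0) if rng is None else rng
    h, w = image.shape
    base = neuron(image)
    total = np.zeros((h, w))
    count = np.zeros((h, w))
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            noisy = image.copy()
            noisy[y:y + patch, x:x + patch] = rng.random((patch, patch))
            diff = abs(neuron(noisy) - base)
            total[y:y + patch, x:x + patch] += diff
            count[y:y + patch, x:x + patch] += 1
    return total / np.maximum(count, 1)

def receptive_field_size(dmap):
    """Step 3: include pixels above an adaptive threshold and measure the field."""
    threshold = dmap.mean() + dmap.std()  # assumed rule, not taken from the article
    mask = dmap > threshold
    if not mask.any():
        return 0
    ys, xs = np.nonzero(mask)
    return int(max(ys.max() - ys.min() + 1, xs.max() - xs.min() + 1))

def layer_receptive_field(images, neuron, k=20):
    """Step 4: average the per-image sizes (assumed aggregation)."""
    selected = top_k_images(images, neuron, k)
    sizes = [receptive_field_size(discrepancy_map(img, neuron)) for img in selected]
    return float(np.mean(sizes))

# Toy stand-in for a CNN neuron: it responds only to a fixed 5x5 window.
def window_neuron(img):
    return float(img[4:9, 4:9].sum())

rng = np.random.default_rng(1)
images = [rng.random((16, 16)) for _ in range(30)]
print(layer_receptive_field(images, window_neuron, k=5))
```

In this toy setting the discrepancy map is exactly zero for pixels whose patches never overlap the 5x5 window, so the estimated size cannot exceed 9 pixels (the window plus a two-pixel margin introduced by the 3x3 occluding patch). With a real network, `neuron` would be replaced by a forward pass that returns the activation of the unit under study.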
Experiments with the ImageNet-CNN on the training and test sets of VOC-2007 showed that the proposed method reduces the required number of noisy images by a factor of 10 on average.
Long J., Shelhamer E., Darrell T. Fully convolutional networks for semantic segmentation // CVPR. 2015.
Krizhevsky A., Sutskever I., Hinton G. Imagenet classification with deep convolutional neural networks // Proceedings of the Neural Information Processing Systems conference, NIPS. 2012.
Liu W., Anguelov D., Erhan D., Szegedy C., Reed S. E. SSD: single shot multibox detector // CoRR. 2015.
Jia Y., Shelhamer E., Donahue J., Karayev S., Long J., Girshick R., Guadarrama S., Darrell T. Caffe: Convolutional architecture for fast feature embedding // MM. 2014.
Avedjan EH.D., Galushkin A.I., Selivanov S.A. Sravnitelnyjj analiz struktur polnosvjaznykh i svertochnykh nejjronnykh setejj i ikh algoritmov obuchenija // Informatizacija i svjaz. 2017. № 1.
Long J., Zhang N., Darrell T. Do Convnets Learn Correspondence? // NIPS. 2014.
Luo W., Li Y., Urtasun R., Zemel R. Understanding the effective receptive field in deep convolutional neural networks // Proceedings of the Neural Information Processing Systems conference, NIPS. 2016.
Zhou B., Khosla A., Lapedriza A., Oliva A., Torralba A. Object detectors emerge in deep scene cnns //
Everingham M., Van Gool L., Williams C. K., Winn J., Zisserman A. The pascal visual object classes (voc) challenge // International journal of computer vision. 2010. № 88(2). P. 303–338.