Journal Neurocomputers. No. 5. 2022.
Article in issue:
Machine vision system for determining the spatial position of apples based on YOLOv3 and stereo camera
Type of article: scientific article
DOI: https://doi.org/10.18127/j19998554-202205-08
UDC: 634.1-13+004.896
Authors:

N.A. Andriyanov1, I.Y. Khasanshin2, D.S. Utkin3, Ahmad Aws4, N.N. Kovylov5, A.A. Kochkarov6

1−6 Financial University under the Government of the Russian Federation (Moscow, Russia)

Abstract:

The relevance of robotics in agriculture stems from the declining availability of labor, the high labor intensity of agricultural work, and the need to create jobs in related industries producing agricultural equipment, to make work more meaningful and attract young people to the industry, to improve the quality of agricultural products, to ensure labor safety, to raise productivity, and to reduce personnel risks. At the same time, some areas of agriculture are especially difficult to automate, for example the picking and sorting of apples. Tasks that a person handles easily require complex mechatronic, robotic solutions implementing machine vision and deep learning technologies. Despite the great capabilities of modern neural network architectures in object detection and recognition, the output of such models consists of the local (pixel) coordinates of objects in the image and their predicted classes. In many practical tasks, however, more complete information about the object must be extracted from the image. For robotic apple picking, it is necessary to know exactly where, and by how much, to move the gripping device. To determine the real position of an apple relative to the image sensor, it is proposed to use an Intel RealSense stereo camera and to combine information from its range and brightness channels. Apples are detected with the YOLOv3 architecture; then, from the distance to the object and its localization in the image, relative distances are computed along all coordinate axes. The proposed approach yields location estimates with high accuracy: the approximate error is 7–12 mm, depending on the range.
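The abstract's core step, converting a detection's pixel coordinates plus the stereo camera's range reading into real-world offsets, can be sketched with the standard pinhole camera model. This is an illustrative sketch, not the authors' implementation: the function name `deproject` and the intrinsic values below are assumptions, not the calibrated parameters of the Intel RealSense D415.

```python
def deproject(u, v, depth_m, fx, fy, cx, cy):
    """Convert a pixel (u, v) with depth Z (metres) to camera-frame XYZ.

    fx, fy are the focal lengths in pixels; (cx, cy) is the principal
    point. Under the pinhole model, the lateral offsets scale linearly
    with depth.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return x, y, depth_m

# Example: an apple detected at bounding-box centre (400, 260), with the
# depth channel reporting 0.8 m at that pixel, and assumed intrinsics
# fx = fy = 600 px, principal point (320, 240).
x, y, z = deproject(400, 260, 0.8, 600.0, 600.0, 320.0, 240.0)
```

In practice these intrinsics come from the camera's factory calibration (the RealSense SDK exposes a similar deprojection routine), and the depth value would be averaged over the bounding box rather than read from a single pixel.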

Pages: 74-84
For citation

Andriyanov N.A., Khasanshin I.Y., Utkin D.S., Ahmad Aws, Kovylov N.N., Kochkarov A.A. Machine vision system for determining
the spatial position of apples based on YOLOv3 and stereo camera. Neurocomputers. 2022. V. 24. No. 5. P. 74-84.
DOI: https://doi.org/10.18127/j19998554-202205-08 (in Russian)

References
  1. Cho W., Kim S., Na M., Na I. Forecasting of Tomato Yields Using Attention-Based LSTM Network and ARMA Model. Electronics. 2021. 10. 1576. https://doi.org/10.3390/electronics10131576.
  2. López-Morales J.A., Martínez J.A., Skarmeta A.F. Digital Transformation of Agriculture through the Use of an Interoperable Platform. Sensors. 2020. 20. 1153. https://doi.org/10.3390/s20041153.
  3. Rolandi S., Brunori G., Bacco M., Scotti I. The Digitalization of Agriculture and Rural Areas: Towards a Taxonomy of the Impacts. Sustainability. 2021. 13. 5172. https://doi.org/10.3390/su13095172.
  4. https://www.un.org/en/global-issues/population.
  5. Bahn R.A., Yehya A.A.K., Zurayk R. Digitalization for Sustainable Agri-Food Systems: Potential, Status, and Risks for the MENA Region. Sustainability. 2021. 13. 3223. https://doi.org/10.3390/su13063223.
  6. Redmon J., Farhadi A. YOLOv3: An Incremental Improvement. https://arxiv.org/abs/1804.02767.
  7. Titov V.S., Spevakov A.G., Primenko D.V. Multispectral optoelectronic device for controlling an autonomous mobile platform. Computer Optics. 2021; 45(3): 399−404. DOI: 10.18287/2412-6179-CO-848.
  8. https://www.intelrealsense.com/depth-camera-d415/.
  9. Girshick R., Donahue J., Darrell T., Malik J. Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2014. 1: 580−587.
  10. Krizhevsky A., Sutskever I., Hinton G.E. ImageNet classification with deep convolutional neural networks. Proceedings of 26th Conference on Neural Information Processing Systems (NeurIPS). 2012. 1: 1106−1114.
  11. Girshick R. Fast R‑CNN. https://arxiv.org/abs/1504.08083.
  12. Ren S., He K., Girshick R., Sun J. Faster R‑CNN: Towards Real-Time Object Detection with Region Proposal Networks. https://arxiv.org/abs/1506.01497.
  13. Redmon J., Divvala S., Girshick R., Farhadi A. You Only Look Once: Unified, Real-Time Object Detection. https://arxiv.org/abs/1506.02640.
  14. Liu W., Anguelov D., Erhan D., Szegedy C., Reed S., Fu C.-Y., Berg A.C. SSD: Single Shot MultiBox Detector. https://arxiv.org/abs/1512.02325.
  15. Lin T.-Y., Goyal P., Girshick R., He K., Dollár P. Focal Loss for Dense Object Detection. https://arxiv.org/abs/1708.02002.
  16. Redmon J., Farhadi A. YOLOv3: An Incremental Improvement. https://arxiv.org/abs/1804.02767.
  17. DarkNet-53. https://github.com/pjreddie/darknet.
  18. Andriyanov N., Dementiev V., Kondratiev D. Tracking of Objects in Video Sequences. Smart Innovation, Systems and Technologies. 2021. 238. P. 253−262.
  19. Kuznetsova A., Maleva T., Soloviev V. Using YOLOv3 Algorithm with Pre- and Post-Processing for Apple Detection in Fruit-Harvesting Robot. Agronomy. 2020. 10. 1016. https://doi.org/10.3390/agronomy10071016.
  20. Yan B., Fan P., Lei X., Liu Z., Yang F. A Real-Time Apple Targets Detection Method for Picking Robot Based on Improved YOLOv5. Remote Sens. 2021. 13. 1619. https://doi.org/10.3390/rs13091619.
  21. Huang Z., Zhang P., Liu R., Li D. Immature Apple Detection Method Based on Improved Yolov3. ASP Transactions on Internet of Things. 2021. 1(1). 9−13. https://doi.org/10.52810/TIOT.2021.100028.
  22. Andriyanov N.A., Andriyanov D.A. The using of data augmentation in machine learning in image processing tasks in the face of data scarcity. Journal of Physics: Conference Series. 2020. 1661(1). 012018.
  23. Xuan G. et al. Apple Detection in Natural Environment Using Deep Learning Algorithms. IEEE Access. 2020. V. 8. P. 216772−216780. doi: 10.1109/ACCESS.2020.3040423.
  24. Itakura K., Narita Y., Noaki S., Hoso F. Automatic pear and apple detection by videos using deep learning and a Kalman filter. OSA Continuum. 2021. 4. 1688−1695.
  25. Gómez-Espinosa A., Rodríguez-Suárez J.B., Cuan-Urquizo E., Cabello J.A.E., Swenson R.L. Colored 3D Path Extraction Based on Depth-RGB Sensor for Welding Robot Trajectory Generation. Automation. 2021. 2. 252−265. https://doi.org/10.3390/automation2040016.
  26. Servi M., Mussi E., Profili A., Furferi R., Volpe Y., Governi L., Buonamici F. Metrological Characterization and Comparison of D415, D455, L515 RealSense Devices in the Close Range. Sensors. 2021. 21. 7770. https://doi.org/10.3390/s21227770.
  27. Maru M.B., Lee D., Tola K.D., Park S. Comparison of Depth Camera and Terrestrial Laser Scanner in Monitoring Structural Deflections. Sensors. 2021. 21. 201. https://doi.org/10.3390/s21010201.
  28. Laganiere R., Gilbert S., Roth G. Robust object pose estimation from feature-based stereo. IEEE Trans. Instrum. Meas. 2006. 55. 1270−1280.
  29. Kulikov A.A. A video monitoring algorithm for the object reprint model in an image. Information-Measuring and Control Systems. 2022. V. 20. No. 4. P. 54−59 (in Russian).
Date of receipt: 18.08.2022
Approved after review: 01.09.2022
Accepted for publication: 22.09.2022