Improvement of Yolov8 Object Detection Based on Lightweight Neck Model for Complex Images

Tien-Wen Sung; Jie Li; Chao-Yang Lee; Qingjun Fang

doi:10.5566/ias.3514

Authors

Tien-Wen Sung Fujian University of Technology
Jie Li Fujian University of Technology
Chao-Yang Lee National Yunlin University of Science and Technology
Qingjun Fang Fujian University of Technology

Keywords:

Lightweight, Attention Mechanism, Target Detection, Neck Networks

Abstract

As target detection technology advances, accurate detection in complex scenes is becoming increasingly important across industries, since it can both improve productivity and help ensure public safety. However, current mainstream target detection algorithms still struggle with complex scenes: some detection models cannot run in real time, and accuracy degrades under interfering factors such as target occlusion and low contrast. To mitigate these problems, this paper proposes a lightweight convolution, LDGConv (Lightweight-DepthGhost Convolution), and uses it to improve the YOLOv8 network model by replacing part of the traditional convolutions in the Neck network and by making the bottleneck module more lightweight. In addition, we add the Coordinate Attention mechanism to the Neck. Compared with the original model, the proposed model improves mAP50 on the VOC dataset by 1.3% while reducing computation and parameters by 9.8% and 15.3%, respectively. In experiments on the steel surface defect dataset NEU-DET, our model outperforms current mainstream detection models overall. The model achieves high-precision target detection at low computational cost, which can save labor costs and improve productivity and public health and safety.
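To give a feel for why replacing standard convolutions with a ghost-style convolution cuts parameters, the following is a minimal sketch that counts weights for a single layer. It is not the paper's LDGConv, whose internal structure the abstract does not specify. It assumes a generic GhostNet-style layout (a primary convolution producing a fraction of the output channels, plus cheap depthwise operations generating the rest), and the function names, channel sizes, `ratio`, and `dw_k` are all hypothetical choices made for illustration.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def ghost_conv_params(c_in, c_out, k, ratio=2, dw_k=3):
    """Weight count of a generic ghost-style convolution (assumed layout).

    A primary k x k convolution produces c_out / ratio "intrinsic" channels;
    the remaining channels are generated from them by depthwise dw_k x dw_k
    filters, one filter per generated channel.
    """
    if c_out % ratio != 0:
        raise ValueError("c_out must be divisible by ratio")
    intrinsic = c_out // ratio
    primary = conv_params(c_in, intrinsic, k)
    cheap = intrinsic * (ratio - 1) * dw_k * dw_k
    return primary + cheap


def relative_reduction(baseline, new):
    """Fractional saving of `new` relative to `baseline`."""
    return (baseline - new) / baseline


# Hypothetical layer: 256 -> 256 channels with a 3 x 3 kernel.
standard = conv_params(256, 256, 3)
ghost = ghost_conv_params(256, 256, 3)
print(standard, ghost, f"{relative_reduction(standard, ghost):.1%}")
```

Under these assumptions a single replaced layer saves roughly half of its weights. The whole-model reductions reported in the abstract are smaller, presumably because only part of the Neck is replaced and the rest of the network is left unchanged.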
