Abstract: An enhanced YOLOv8n-based object detection algorithm, SFE-YOLO, is developed to address the low detection precision for small targets in UAV aerial imagery. First, a shallow feature enhancement module is embedded to fuse the shallow spatial details of the input features with the deep semantic information from the neck, strengthening the representation of small-target features; a global context block (GC-Block) then recalibrates the fused information to suppress background noise. Second, the network's adaptability to geometric variation is improved by replacing some of the standard convolutions in the C2F layer with deformable convolutions. Third, an ASPPF module incorporating average pooling is integrated to enrich the model's multi-scale feature representation and reduce missed detections. Finally, a novel weighted feature fusion method is designed that blends additional intermediate features from the backbone, enabling smoother transitions between features at different scales and increasing feature reuse through skip connections. The model is validated on the VisDrone2019 and VOC2012 datasets, achieving mAP@0.5 values of 30.5% and 67.3%, improvements of 3.6 and 0.8 percentage points over the baseline YOLOv8n, demonstrating stronger performance on UAV image target detection and good generalization capability.
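The abstract does not give the exact form of the weighted feature fusion; the sketch below is a minimal, hypothetical illustration assuming a BiFPN-style normalized learnable weighting of same-scale features (one neck feature plus one intermediate backbone feature brought in via a skip connection). The class and parameter names are illustrative, not taken from the paper.

```python
# Minimal sketch (not the authors' code): normalized learnable weighted fusion
# of feature maps that share the same shape. Names (WeightedFusion, eps) are
# assumptions made for illustration.
import torch
import torch.nn as nn


class WeightedFusion(nn.Module):
    """Fuse N same-shaped feature maps with non-negative, normalized weights."""

    def __init__(self, num_inputs: int, eps: float = 1e-4):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_inputs))  # one weight per input
        self.eps = eps

    def forward(self, features):
        # Clamp weights to be non-negative, then normalize so they sum to ~1.
        w = torch.relu(self.weights)
        w = w / (w.sum() + self.eps)
        return sum(wi * f for wi, f in zip(w, features))


# Usage: fuse a neck feature with a same-scale backbone feature (skip connection),
# assuming both tensors have already been resized to identical shapes.
if __name__ == "__main__":
    fuse = WeightedFusion(num_inputs=2)
    p3_neck = torch.randn(1, 64, 80, 80)
    p3_backbone = torch.randn(1, 64, 80, 80)
    out = fuse([p3_neck, p3_backbone])
    print(out.shape)  # torch.Size([1, 64, 80, 80])
```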