Abstract: Rapidly growing demand for security inspection is driving the development of intelligent security inspection technology. Because of the unique characteristics of X-ray images, detecting small contraband items remains challenging. To address this issue, this study proposes an improved YOLOv8s network for contraband recognition. First, the Focal L1 Loss function is introduced to enhance CIoU, optimizing the position and aspect ratio of prediction boxes and improving the network’s ability to identify contraband items. Second, an improved deformable convolution is added to the shallow layers of the backbone network to capture features of contraband items in different directions. Third, LSKA is incorporated into the SPPF module to expand the network’s receptive field, while the Swin-CS module captures global information and supplements dimensional interaction. Finally, three stacked attention blocks are used to enhance the network’s sensitivity to small targets. The improved network achieves a mean average precision (mAP) of 96.1% on the SIXray dataset, a 5.4% improvement over YOLOv8s, and an mAP50-95 of 0.682, a 4.5% increase. Experimental results indicate that the proposed model generates accurate prediction boxes and handles contraband detection effectively in complex scenarios, validating the effectiveness of the algorithm.
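For readers unfamiliar with CIoU, the baseline box-regression term that the abstract says is enhanced, the following minimal sketch may help. It shows only the standard CIoU loss for a single pair of boxes: IoU, a penalty for the distance between box centers, and a penalty for aspect-ratio mismatch. It does not reproduce the Focal L1 enhancement proposed here, whose exact form is not given in the abstract. The function name `ciou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are assumptions made for illustration.

```python
import math


def ciou_loss(box_pred, box_gt, eps=1e-9):
    """Standard CIoU loss for two boxes in (x1, y1, x2, y2) form.

    Illustrative only: this is the baseline CIoU, not the paper's
    Focal L1 variant.
    """
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter + eps
    iou = inter / union

    # Squared center distance, normalized by the squared diagonal
    # of the smallest box enclosing both boxes
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps
    rho2 = ((px1 + px2 - gx1 - gx2) ** 2 + (py1 + py2 - gy1 - gy2) ** 2) / 4.0

    # Aspect-ratio consistency term and its trade-off weight
    v = (4.0 / math.pi ** 2) * (
        math.atan(gw / (gh + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return 1.0 - (iou - rho2 / c2 - alpha * v)


if __name__ == "__main__":
    gt = (0.0, 0.0, 10.0, 10.0)
    print(ciou_loss(gt, gt))                        # perfectly aligned boxes
    print(ciou_loss((2.0, 2.0, 12.0, 12.0), gt))    # shifted prediction
    print(ciou_loss((0.0, 2.5, 10.0, 7.5), gt))     # same center, wrong aspect ratio
```

In this sketch the loss is close to zero for identical boxes and grows as the prediction drifts away from the ground truth or its aspect ratio diverges, which is the behavior that makes CIoU sensitive to both the position and the shape of a prediction box.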