Abstract: To address the low accuracy and high missed-detection rates in pedestrian detection caused by complex background interference, this study proposes DACD-YOLO, an adaptive dual-branch dense pedestrian detection algorithm that incorporates improved attention mechanisms. First, the backbone network employs an adaptive dual-branch structure that fuses the features of the two branches through dynamic weighting and introduces depthwise separable convolution to reduce computational cost, mitigating the information loss of traditional single-branch networks. Second, an adaptive vision center is proposed to enhance intra-layer feature extraction through dynamic optimization, with channel numbers reconfigured to balance accuracy and computational load. Third, a coordinate dual-channel attention mechanism is introduced and combined with a heterogeneous convolution kernel design within a lightweight fusion module, reducing computational complexity and improving the capture of key features. Finally, a dilated-convolution detection head fuses multi-scale features through convolutions with different dilation rates, enhancing feature extraction for small and occluded objects. Experimental results show that, compared with the original YOLOv8n, the proposed algorithm improves mAP@0.5 and mAP@0.5:0.95 by 2.3% and 2.2%, respectively, on the WiderPerson dataset, and by 3.5% and 4.6%, respectively, on the CrowdHuman dataset. These results indicate that the proposed algorithm improves accuracy in dense pedestrian detection over the baseline.
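The intuition behind three of the building blocks named above (depthwise separable convolution, dynamically weighted branch fusion, and convolutions with different dilation rates) can be illustrated with a little arithmetic. The following is a minimal sketch in plain Python, not the DACD-YOLO implementation: the function names, the channel sizes, and the use of softmax to normalize the fusion weights are all assumptions chosen for illustration.

```python
import math


def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a k x k depthwise conv (one filter per input
    channel) followed by a 1 x 1 pointwise conv (bias omitted)."""
    return c_in * k * k + c_in * c_out


def effective_kernel(k, dilation):
    """Span, in pixels, covered by a k-tap kernel at a given dilation rate."""
    return dilation * (k - 1) + 1


def fuse(branch_a, branch_b, logit_a, logit_b):
    """Blend two equal-length feature vectors with weights that sum to 1.

    Assumption: the weights come from a softmax over two scalar logits;
    the paper's actual weighting scheme may differ.
    """
    m = max(logit_a, logit_b)  # subtract the max for numerical stability
    ea, eb = math.exp(logit_a - m), math.exp(logit_b - m)
    wa, wb = ea / (ea + eb), eb / (ea + eb)
    return [wa * a + wb * b for a, b in zip(branch_a, branch_b)]


if __name__ == "__main__":
    # Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
    print(conv_params(64, 128, 3))                 # 73728
    print(depthwise_separable_params(64, 128, 3))  # 8768
    # A 3-tap kernel at dilation rates 1, 2, 3 spans 3, 5, 7 pixels.
    print([effective_kernel(3, d) for d in (1, 2, 3)])
    # Equal logits give an even blend of the two branches.
    print(fuse([1.0, 2.0], [3.0, 4.0], 0.0, 0.0))
```

For this hypothetical layer, the depthwise separable form needs roughly 12% of the weights of the standard convolution, which is the kind of saving that motivates its use in a lightweight backbone. Likewise, stacking dilation rates widens the receptive field without adding weights, which is why such heads are commonly used for multi-scale feature fusion.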