Abstract: In clinical practice, accurate pain assessment is crucial for pain management and diagnosis. However, traditional assessment methods are highly subjective and rely on the expertise of medical professionals, which creates an urgent need for more reliable and objective alternatives. Deep learning-based pain detection from facial expressions has made remarkable progress in recent years, but the complex structures and high computational cost of existing models restrict their practical application. Therefore, this study proposes an improved 3D convolutional neural network (CNN) that uses a lightweight 3D CNN named L3D as the backbone network. It also incorporates an enhanced squeeze-and-excitation (SE) attention mechanism to fuse features at multiple scales, capturing highly discriminative spatiotemporal characteristics of pain sequences. The proposed method is evaluated on the UNBC-McMaster and BioVid datasets. Compared with state-of-the-art methods, it achieves better pain detection performance at lower computational complexity.
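
For readers unfamiliar with SE attention, the following is a minimal sketch of a standard squeeze-and-excitation block applied to a 3D (channel, time, height, width) feature map. It is not the enhanced variant or the L3D backbone proposed in this study; the function names, the nested-list layout, and the weight arguments are all hypothetical and chosen only for illustration. Plain Python is used so that the sketch stays dependency-free; a real model would use a tensor library.

```python
import math


def squeeze(features):
    """Global average pooling: features[c][t][h][w] -> one scalar per channel."""
    pooled = []
    for channel in features:
        values = [v for frame in channel for row in frame for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def dense(x, weights, bias):
    """Fully connected layer; weights has one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def se_gates(features, w1, b1, w2, b2):
    """Excitation: reduce -> ReLU -> expand -> sigmoid, giving one gate per channel."""
    z = squeeze(features)
    hidden = [max(0.0, h) for h in dense(z, w1, b1)]
    return [1.0 / (1.0 + math.exp(-s)) for s in dense(hidden, w2, b2)]


def se_block(features, w1, b1, w2, b2):
    """Rescale every channel of the feature map by its learned gate."""
    gates = se_gates(features, w1, b1, w2, b2)
    return [[[[v * g for v in row] for row in frame] for frame in channel]
            for channel, g in zip(features, gates)]
```

In this sketch, `w1` maps the pooled channel vector to a smaller hidden vector and `w2` maps it back to one logit per channel, so channels with larger gates contribute more to later layers. How the study's enhanced SE module differs, and how it fuses features of different scales, is not specified in the abstract and is not modeled here.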