Abstract: Deformable 3D medical image registration remains challenging because human organs deform irregularly. This study proposes a Transformer-based multi-scale method for deformable 3D medical image registration. First, the method adopts a multi-scale strategy with multi-level connections to capture information at different levels. A self-attention mechanism extracts global features, while dilated convolution captures broader contextual information and finer local features, strengthening the registration network’s ability to fuse global and local features. Second, motivated by the sparse prior on image gradients, the normalized total gradient is introduced as a loss function, which reduces the interference of noise and artifacts in the registration process and adapts better to different medical imaging modalities. The proposed method is evaluated on two publicly available brain MRI datasets (OASIS and LPBA). The results show that it retains the run-time advantage of learning-based methods while also performing well in terms of mean squared error and structural similarity. In addition, ablation experiments further confirm the effectiveness of both the proposed method and the normalized total gradient loss design.
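To make the loss term concrete, the following is a minimal sketch of a normalized total gradient (NTG) measure for two 3D volumes. It assumes the formulation commonly used in the multispectral registration literature, namely the L1 norm of the gradient of the difference image divided by the sum of the L1 norms of the two image gradients. The abstract does not state the exact form used in this study, so the function names `total_gradient` and `ntg_loss`, the forward-difference gradient, and the stabilizer `eps` are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def total_gradient(vol):
    """L1 norm of the forward-difference gradient, summed over all axes."""
    return sum(np.abs(np.diff(vol, axis=ax)).sum() for ax in range(vol.ndim))


def ntg_loss(fixed, warped, eps=1e-8):
    """Normalized total gradient between a fixed and a warped volume.

    Assumed form: ||grad(fixed - warped)||_1 / (||grad fixed||_1 + ||grad warped||_1).
    The value is 0 when the two volumes differ only by a constant offset and
    is bounded above by 1 (triangle inequality). `eps` is a hypothetical
    stabilizer that avoids division by zero for constant volumes.
    """
    fixed = np.asarray(fixed, dtype=np.float64)
    warped = np.asarray(warped, dtype=np.float64)
    numerator = total_gradient(fixed - warped)
    denominator = total_gradient(fixed) + total_gradient(warped)
    return numerator / (denominator + eps)
```

Because the measure depends only on image gradients, a constant intensity shift between the two volumes leaves it essentially unchanged. This is consistent with the abstract's claim that the loss adapts to different imaging modalities. In a training setting the same expression would be written with a differentiable framework. NumPy is used here only to keep the sketch self-contained.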