Abstract: Image segmentation has gradually developed from traditional threshold-based methods to methods based on convolutional neural networks (CNNs). Traditional CNNs perform well in segmentation, but their limitations of slow training and low segmentation accuracy are becoming apparent. To overcome these limitations, this study proposes an image segmentation and recognition method based on BM-TransUNet, an improved version of the TransUNet network. A depthwise separable convolution module is added to the first layer of the TransUNet network, and an attention mechanism module is introduced into the convolution layers of the encoder's downsampling path so that the algorithm can better explore the features of the segmented objects. In addition, a multi-scale feature fusion module, the feature pyramid network (FPN), is introduced between the encoder and decoder. A self-made posterior pharyngeal wall dataset is used for segmentation training, and the trained BM-TransUNet network is compared with various traditional segmentation networks. Experimental results show that, compared with other traditional deep learning models, the BM-TransUNet method exhibits higher classification accuracy and generalization ability, with a Precision of 93.61% and a Dice coefficient of 90.76%, as well as better computational efficiency and effectiveness in segmentation tasks.
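For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how Precision and the Dice coefficient are commonly computed for a binary segmentation mask. It assumes the standard pixel-wise definitions, Precision = TP / (TP + FP) and Dice = 2TP / (2TP + FP + FN); the abstract does not state the exact evaluation protocol. The function name `precision_and_dice`, the toy masks, and the conventions for empty masks are illustrative assumptions, not part of the original study.

```python
def precision_and_dice(pred, target):
    """Return (precision, dice) for two equally sized binary masks.

    pred and target are nested lists of 0/1 values (rows of pixels).
    This is an illustrative helper, not the evaluation code of the study.
    """
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1  # predicted foreground, truly foreground
            elif p and not t:
                fp += 1  # predicted foreground, truly background
            elif t and not p:
                fn += 1  # missed foreground pixel

    # Assumed conventions for degenerate cases: precision is 0.0 when
    # nothing is predicted, and Dice is 1.0 when both masks are empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = 2 * tp + fp + fn
    dice = 2 * tp / denom if denom else 1.0
    return precision, dice


# Toy 2x3 masks (hypothetical data, for illustration only).
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]

precision, dice = precision_and_dice(pred, target)
print(f"Precision: {precision:.4f}, Dice: {dice:.4f}")
```

On a real dataset these values would typically be averaged over all test images, and whether the averaging is per image or over pooled pixels affects the final numbers.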