Abstract: To improve the accuracy of glaucoma prediction and diagnosis and to avoid the accumulation of errors caused by manual screening, this study proposes an automatic glaucoma screening method guided by position attention. The method has two parts: attention prediction for fundus images and glaucoma classification. First, a U-shaped network that combines deep understanding convolution kernels with channel excitation connection spatial pyramids is proposed to predict the attention of fundus images; the feature maps produced during decoding serve as spatial information to guide glaucoma classification. Second, a position attention mechanism is proposed for the glaucoma classification model. It combines channel information and spatial information from different sources to dynamically adjust the feature maps from external encoders. The main branch of the classification model stacks multiple position attention modules and residual modules to perform classification, while an auxiliary segmentation branch assists model training and optimization to improve classification accuracy. On the test set of the LAG glaucoma dataset, the proposed method achieves a precision, recall, and AUC of 97.84%, 97.75%, and 98.57%, respectively, outperforming all compared models. Visualizing the attention activation heat maps shows that the model's decision attention regions are located more accurately, which helps locate lesions and provides an effective reference for clinical diagnosis.
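As a rough illustration of how a feature map might be adjusted using channel information from one source and spatial information from another, the NumPy sketch below gates a feature tensor with a channel weight and an externally supplied spatial map. This is not the paper's module. The function name `position_attention`, the squeeze-and-excitation style channel branch, the sigmoid spatial gate, the residual sum, and the weight matrices `w1` and `w2` are all assumptions made for this example.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def position_attention(features, spatial_guide, w1, w2):
    """Hypothetical position-attention gate (illustrative assumption only).

    features:      (C, H, W) feature map from the classification branch
    spatial_guide: (H, W) map taken from the attention-prediction decoder
    w1:            (C_reduced, C) bottleneck weights (assumed)
    w2:            (C, C_reduced) expansion weights (assumed)
    """
    # Channel branch: global average pooling, then a two-layer bottleneck.
    squeezed = features.mean(axis=(1, 2))      # shape (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)    # ReLU, shape (C_reduced,)
    channel_gate = sigmoid(w2 @ hidden)        # shape (C,)

    # Spatial branch: squash the external guide map into (0, 1).
    spatial_gate = sigmoid(spatial_guide)      # shape (H, W)

    # Combine both gates by broadcasting, then add a residual connection.
    gate = channel_gate[:, None, None] * spatial_gate[None, :, :]
    return features + features * gate


# Example call with random inputs (shapes chosen arbitrarily).
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16, 16))
guide = rng.normal(size=(16, 16))
out = position_attention(feats, guide,
                         rng.normal(size=(2, 8)), rng.normal(size=(8, 2)))
```

In this sketch the residual sum keeps the original features intact wherever the gate is close to zero, so the external spatial map can only emphasize regions, never erase them. Whether the method described here makes the same choice is not stated in the abstract.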