Abstract: To address the blurring of key image information and the poor adaptability of gastrointestinal endoscopy diagnosis and treatment systems, this study proposes a cycle-consistent generative adversarial network (CycleGAN) with an improved attention mechanism for accurately estimating depth in the digestive tract. Building on CycleGAN, the network combines a dual attention mechanism with a residual gate mechanism and a non-local module, so that it captures both the feature structure and the global correlations of the input data, thereby improving the quality and adaptability of the generated depth images. In addition, a dual-scale feature fusion network serves as the discriminator, which strengthens its discrimination ability and balances the performance of the generator and the discriminator. Experimental results show that the proposed method achieves good prediction performance in gastrointestinal endoscopy scenes. Its average accuracy on the stomach, small intestine, and colon datasets improves by 7.39%, 10.17%, and 10.27%, respectively, compared with other unsupervised methods. It also accurately estimates relative depth and provides accurate boundary information on a laboratory human gastric organ model.
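
The abstract names a non-local module and a residual gate but does not give their formulation. As a rough illustration only, the sketch below assumes a dot-product affinity between spatial positions and a single scalar sigmoid gate on the residual branch; the function name `non_local_gated` and the parameter `gate_logit` are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def non_local_gated(features, gate_logit=0.0):
    """Apply a toy non-local operation with a gated residual connection.

    features:   list of N feature vectors, one per spatial position.
    gate_logit: hypothetical scalar; sigmoid(gate_logit) scales the
                global context before it is added back to the input.
    """
    gate = 1.0 / (1.0 + math.exp(-gate_logit))  # assumed gate form, in (0, 1)
    out = []
    for xi in features:
        # Affinity of position i with every position j (assumed dot product).
        scores = [sum(a * b for a, b in zip(xi, xj)) for xj in features]
        weights = softmax(scores)
        # Global context: affinity-weighted sum over all positions.
        context = [
            sum(w * xj[c] for w, xj in zip(weights, features))
            for c in range(len(xi))
        ]
        # Gated residual: input plus a scaled share of the global context.
        out.append([a + gate * c for a, c in zip(xi, context)])
    return out
```

With two identical positions `[[1.0, 0.0], [1.0, 0.0]]` and the default `gate_logit=0.0` (gate of 0.5), every position receives half of the shared context and the result is `[[1.5, 0.0], [1.5, 0.0]]`. A strongly negative `gate_logit` closes the gate, so the output approaches the input unchanged.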