Abstract: AI diagnostic models based on deep learning rely heavily on high-quality, finely annotated data for training, and their performance degrades when the labels are noisy. To make the model more robust and prevent it from memorizing noisy labels, a noise label sample selection (NLSS) model is proposed that fully mines the hidden information in noisy samples and alleviates overfitting. First, distributed feature representations of the image are extracted with hybrid-enhanced images as input. Second, a contrastive loss function is introduced to compare the similarity between each sample's predicted label distribution and its real label distribution, and this similarity is used to evaluate and select samples. Finally, based on the sample selection, the supervisory information of the noisy labels is corrected by the pseudo-label promotion strategy of the label redistribution module. On a PET/CT dataset of non-small cell lung cancer (NSCLC) patients, the proposed model outperforms the comparison models and reduces the interference of label noise in the diagnosis of lymph node metastasis.
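
The selection and relabeling steps described above can be illustrated with a minimal Python sketch. The abstract does not specify the similarity measure, the thresholds, or the exact relabeling rule, so everything here is an assumption made for illustration: Jensen-Shannon divergence stands in for the comparison of predicted and given label distributions, and the names `js_divergence`, `select_and_relabel`, `clean_thresh`, and `conf_thresh` are hypothetical. It is not the NLSS implementation.

```python
import numpy as np


def js_divergence(p, q, eps=1e-12):
    """Row-wise Jensen-Shannon divergence (natural log) between two distributions."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    m = 0.5 * (p + q)
    kl_pm = np.sum(p * np.log(p / m), axis=1)
    kl_qm = np.sum(q * np.log(q / m), axis=1)
    return 0.5 * (kl_pm + kl_qm)


def select_and_relabel(probs, labels, num_classes,
                       clean_thresh=0.2, conf_thresh=0.9):
    """Split samples into 'clean' and 'suspect', then relabel confident suspects.

    probs  : (N, C) array of predicted class probabilities
    labels : (N,) integer array of given (possibly noisy) labels
    Thresholds are illustrative assumptions, not values from the paper.
    """
    one_hot = np.eye(num_classes)[labels]

    # Small divergence: prediction agrees with the given label, treat as clean.
    divergence = js_divergence(probs, one_hot)
    clean = divergence < clean_thresh

    # Suspect samples with a confident prediction receive a pseudo-label.
    confident = probs.max(axis=1) >= conf_thresh
    relabel = ~clean & confident

    new_labels = labels.copy()
    new_labels[relabel] = probs.argmax(axis=1)[relabel]
    return clean, relabel, new_labels
```

In this sketch, a sample whose prediction disagrees with its label but is not confident is neither selected as clean nor relabeled; how such samples are treated in the actual model is not stated in the abstract.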