Abstract: Traditional terminology standardization schemes based on template matching, manually constructed features, or semantic matching often suffer from low term-mapping accuracy and difficult alignment. Because terminology in medical texts is expressed colloquially and in many different ways, modules for multi-strategy recall and implication semantic score ranking are used to improve medical terminology standardization. The multi-strategy recall module combines recall based on the Jaccard similarity coefficient, term frequency-inverse document frequency (TF-IDF), and historical recall. The implication semantic scoring module adopts RoBERTa-wwm-ext as its semantic scoring model. The usability of the proposed method is validated for the first time on a Chinese dataset that is based on the Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) standard and annotated by medical professionals. Experiments show that, in processing medical knowledge features, the proposed method achieves favorable results in practical medical terminology standardization and offers good generalization and practical value.
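To make the multi-strategy recall idea concrete, the following is a minimal Python sketch, not the implementation evaluated in this work. It assumes character-set Jaccard similarity, a hand-rolled TF-IDF over character bigrams with cosine similarity, and a plain dictionary standing in for historical recall (interpreted here as a lookup of previously mapped mentions). All function names, the `history` structure, the cutoff `k`, and the order-preserving union used to merge candidates are hypothetical choices made for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the character sets of two strings."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def char_bigrams(text: str) -> List[str]:
    """Character bigrams; strings shorter than two characters yield themselves."""
    return [text[i:i + 2] for i in range(len(text) - 1)] or [text]


def build_idf(terms: List[str]) -> Dict[str, float]:
    """Smoothed inverse document frequency over the standard-term vocabulary."""
    n = len(terms)
    df = Counter()
    for term in terms:
        df.update(set(char_bigrams(term)))
    return {g: math.log((1 + n) / (1 + c)) + 1.0 for g, c in df.items()}


def tfidf_vector(text: str, idf: Dict[str, float]) -> Dict[str, float]:
    """Sparse TF-IDF vector; bigrams unseen in the vocabulary are dropped."""
    tf = Counter(char_bigrams(text))
    return {g: c * idf[g] for g, c in tf.items() if g in idf}


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(g, 0.0) for g, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def recall_candidates(mention: str, standard_terms: List[str],
                      history: Dict[str, List[str]], k: int = 3) -> List[str]:
    """Merge candidates from historical, Jaccard, and TF-IDF recall (assumed union)."""
    idf = build_idf(standard_terms)
    query = tfidf_vector(mention, idf)
    by_history = history.get(mention, [])
    by_jaccard = sorted(standard_terms,
                        key=lambda t: jaccard(mention, t), reverse=True)[:k]
    by_tfidf = sorted(standard_terms,
                      key=lambda t: cosine(query, tfidf_vector(t, idf)),
                      reverse=True)[:k]
    merged: List[str] = []
    for term in by_history + by_jaccard + by_tfidf:
        if term not in merged:  # keep first occurrence, preserve order
            merged.append(term)
    return merged
```

In a pipeline of the kind described above, the merged candidate list would then be passed to the semantic scoring model for ranking; that stage is not sketched here because its input format and training setup are not specified in this abstract.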