Computer Systems & Applications, 2024, 33(11): 237-246
Multi-granularity Feature Fusion for Biomedical Named Entity Recognition and Normalization
(School of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China)
Received: April 1, 2024    Revised: April 29, 2024
Abstract: To extract rich entity information and its normalized expressions from biomedical literature, this study proposes a multi-granularity feature fusion approach for biomedical named entity recognition and normalization (MGFFA). The model integrates character-level, word-level, and concept-level textual information to strengthen its learning capability. It also includes a memory bank that stores and synthesizes information from these levels, which helps the model capture the complex relationships between entities and their normalized labels. Combined with pre-trained models, MGFFA captures coarse-grained semantic representations of the text and also analyzes features at the word-formation level, improving recognition accuracy for long-span entities. Experimental results on the NCBI and BC5CDR datasets show that the model outperforms the baseline models overall.
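Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2022ZD0119500); Natural Science Foundation of Shandong Province (ZR2022MF319); Shandong University of Science and Technology Young Teachers' Teaching Top Talent Training Fund (BJ20211110)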
Citation:
LIU Tong, SHI Chang-Ling, NI Wei-Jian. Multi-granularity Feature Fusion for Biomedical Named Entity Recognition and Normalization. Computer Systems & Applications, 2024, 33(11): 237-246