Multi-granularity Feature Fusion for Biomedical Named Entity Recognition and Normalization

doi:10.15888/j.cnki.csa.009640


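Funding: National Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2022ZD0119500); Natural Science Foundation of Shandong Province (ZR2022MF319); Shandong University of Science and Technology Young Teachers' Top Teaching Talent Training Fund (BJ20211110)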



Abstract:

To extract rich entity information and its normalized expressions from biomedical literature, this study proposes a multi-granularity feature fusion approach for biomedical named entity recognition and normalization (MGFFA). The approach integrates character-level, word-level, and concept-level textual information to strengthen the model's learning capability. It also includes a memory bank that stores and combines information from the different levels, which helps the model capture the complex relationships between entities and their normalized labels. Used together with pre-trained models, MGFFA captures coarse-grained semantic representations of the text and also analyzes word-formation (morphological) features, which improves recognition accuracy for long-span entities. Experiments on the NCBI and BC5CDR datasets show that the model outperforms the baseline models overall.
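The abstract does not say how the character-level, word-level, and concept-level features are combined. As an illustration only, the sketch below shows one common way to fuse three per-token feature views: a softmax-weighted sum. Everything in it is an assumption made for the example and is not taken from the paper: the function names (`softmax`, `fuse_token`, `fuse_sentence`), the fixed `gate_scores` (which a real model would learn), and the requirement that all three views share one dimension. The memory bank and the pre-trained encoder are not modeled.

```python
import math
from typing import List, Sequence

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_token(char_vec: Vector, word_vec: Vector, concept_vec: Vector,
               gate_scores: Sequence[float]) -> Vector:
    """Fuse three same-sized feature vectors for one token.

    Hypothetical fusion rule: a weighted sum whose weights come from
    softmax(gate_scores). The paper's actual operator may differ.
    """
    views = [char_vec, word_vec, concept_vec]
    dim = len(char_vec)
    if any(len(v) != dim for v in views):
        raise ValueError("all three views must have the same dimension")
    if len(gate_scores) != len(views):
        raise ValueError("expected one gate score per view")
    weights = softmax(gate_scores)
    return [sum(w * v[i] for w, v in zip(weights, views)) for i in range(dim)]


def fuse_sentence(char_feats: List[Vector], word_feats: List[Vector],
                  concept_feats: List[Vector],
                  gate_scores: Sequence[float]) -> List[Vector]:
    """Apply fuse_token to every token position of a sentence."""
    if not (len(char_feats) == len(word_feats) == len(concept_feats)):
        raise ValueError("all three views must cover the same tokens")
    return [fuse_token(c, w, k, gate_scores)
            for c, w, k in zip(char_feats, word_feats, concept_feats)]
```

With equal gate scores the three views contribute equally; raising one score shifts weight toward that granularity. A tagger for recognition and a classifier for normalization would then consume the fused vectors.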


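Cite this article

Liu Tong (刘彤), Shi Changling (石昌岭), Ni Weijian (倪维健). Multi-granularity Feature Fusion for Biomedical Named Entity Recognition and Normalization. Computer Systems & Applications (计算机系统应用), 2024, 33(11): 237-246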


History

Received: 2024-04-01
Last revised: 2024-04-29
Published online: 2024-09-27
