Knowledge-enhanced Named Entity Recognition for Chinese Electronic Medical Records
Author: Li Wanze, Song Bo, Qi Yueshan

    Abstract:

    To address the difficulty of handling nested medical entities in Chinese electronic medical records, this study proposes ERBEGP, a knowledge-enhanced named entity recognition model for Chinese electronic medical records built on the RoBERTa-wwm-ext-large pre-trained model. The whole word masking strategy employed by RoBERTa-wwm-ext-large yields word-level semantic representations, which are better suited to Chinese text. First, by integrating a knowledge graph, the model learns a large number of medical entity nouns, further improving entity recognition accuracy on electronic medical records. Next, a BiLSTM encodes the input sequence, better capturing the contextual semantic information of the records. Finally, the efficient GlobalPointer (EGP) model considers the feature information of both the head and the tail of an entity to predict nested entities, effectively addressing the difficulty of nested entities in named entity recognition for Chinese electronic medical records. The proposed method achieves better recognition results on four datasets from the CBLUE benchmark, demonstrating the effectiveness of the ERBEGP model.
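    The span-based decoding described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: it omits the rotary position embeddings, multiple entity types, and training of the full efficient GlobalPointer, and all shapes and projection matrices below are hypothetical. The key idea it does show is that every (start, end) pair receives an independent score from head and tail representations, so one entity span nested inside another can still be kept at decoding time.

    ```python
    import numpy as np

    def span_scores(h, Wq, Wk):
        # Project encoder states (e.g. BiLSTM outputs) into head (start)
        # and tail (end) spaces, then score every span as a dot product:
        # s[i, j] = q_i . k_j, a GlobalPointer-style score matrix.
        q = h @ Wq
        k = h @ Wk
        return q @ k.T

    def decode_spans(scores, threshold=0.0):
        # Keep every upper-triangular span above the threshold. Because
        # each (i, j) pair is scored independently, nested entities
        # (one span contained in another) can both survive decoding.
        n = scores.shape[0]
        return [(i, j) for i in range(n) for j in range(i, n)
                if scores[i, j] > threshold]

    # Toy example with random stand-ins for encoder states and projections.
    rng = np.random.default_rng(0)
    h = rng.normal(size=(6, 8))    # 6 tokens, hidden size 8 (hypothetical)
    Wq = rng.normal(size=(8, 4))   # head projection
    Wk = rng.normal(size=(8, 4))   # tail projection
    scores = span_scores(h, Wq, Wk)
    spans = decode_spans(scores)
    ```

    Contrast this with sequence-labeling decoders such as BiLSTM-CRF, which assign one tag per token and therefore cannot emit two overlapping spans; scoring the full (start, end) matrix is what lets the model handle nesting.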

    References
    [1] Devlin J, Chang MW, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Minneapolis: ACL, 2019. 4171–4186.
    [2] Li ZM, Yun HY, Wang YZ. Medical named entity recognition based on multi-feature fusion with BERT. Journal of Qingdao University (Natural Science Edition), 2021, 34(4): 23–29. [doi: 10.3969/j.issn.1006-1037.2021.11.05]
    [3] Zhao K, Du XP, Gao YJ, et al. Named entity recognition of electronic medical records fusing characters and labels. Computer Systems & Applications, 2022, 31(10): 375–381. [doi: 10.15888/j.cnki.csa.008723]
    [4] Zhang FC, Qin QL, Jiang Y, et al. Named entity recognition of Chinese electronic medical records based on RoBERTa-WWM-BiLSTM-CRF. Data Analysis and Knowledge Discovery, 2022, 6(2–3): 251–262.
    [5] Lee J, Yoon W, Kim S, et al. BioBERT: A pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, 2020, 36(4): 1234–1240. [doi: 10.1093/bioinformatics/btz682]
    [6] Su JL, Murtadha A, Pan SF, et al. Global pointer: Novel efficient span-based approach for named entity recognition. arXiv:2208.03054, 2022.
    [7] Zhang NY, Jia QH, Yin KP, et al. Conceptualized representation learning for Chinese biomedical text mining. arXiv:2008.10813, 2020.
    [8] Rasmy L, Xiang Y, Xie ZQ, et al. Med-BERT: Pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction. npj Digital Medicine, 2021, 4(1): 86. [doi: 10.1038/s41746-021-00455-y]
    [9] Yang FH. Research on BERT models for Chinese clinical natural language processing [Master's thesis]. Beijing: Peking Union Medical College, 2021.
    [10] Cui YM, Che WX, Liu T, et al. Pre-training with whole word masking for Chinese BERT. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021, 29: 3504–3514. [doi: 10.1109/TASLP.2021.3124365]
    [11] Hochreiter S, Schmidhuber J. Long short-term memory. Neural Computation, 1997, 9(8): 1735–1780.
    [12] Che WX, Li ZH, Liu T. LTP: A Chinese language technology platform. Proceedings of the 23rd International Conference on Computational Linguistics: Demonstrations. Beijing: ACL, 2010. 13–16.
    [13] Zhang W, Wong CM, Ye GQ, et al. Billion-scale pre-trained e-commerce product knowledge graph model. Proceedings of the 37th IEEE International Conference on Data Engineering. Chania: IEEE, 2021. 2476–2487.
    [14] Liu WJ, Zhou P, Zhao Z, et al. K-BERT: Enabling language representation with knowledge graph. Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York: AAAI Press, 2020. 2901–2908.
    [15] Lafferty JD, McCallum A, Pereira FCN. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. Proceedings of the 18th International Conference on Machine Learning. Williamstown: Morgan Kaufmann, 2001. 282–289.
    [16] Zhang NY, Chen MS, Bi Z, et al. CBLUE: A Chinese biomedical language understanding evaluation benchmark. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. Dublin: ACL, 2022. 7888–7915.
    [17] Lan ZZ, Chen MD, Goodman S, et al. ALBERT: A lite BERT for self-supervised learning of language representations. Proceedings of the 8th International Conference on Learning Representations. Addis Ababa: OpenReview.net, 2020.
Cite this article

Li WZ, Song B, Qi YS. Knowledge-enhanced named entity recognition for Chinese electronic medical records. Computer Systems & Applications, 2023, 32(12): 112–119.

History
  • Received: 2023-05-22
  • Revised: 2023-06-28
  • Published online: 2023-09-22