Named Entity Recognition of Online Medical Question Answering Text

doi:10.15888/j.cnki.csa.006760

Authors:
YANG Wen-Ming, School of Software & Microelectronics, Peking University, Beijing 102600, China
CHU Wei-Jie, School of Software & Microelectronics, Peking University, Beijing 102600, China

Abstract:

This paper studies named entity recognition on medical text produced during online consultations. Using data from an online medical question-answering website, we build a dataset annotated with the {B, I, O} tagging scheme and extract four types of medical entities: disease, treatment, examination, and symptom. With BiLSTM-CRF as the baseline model, we propose two deep learning models, IndRNN-CRF and IDCNN-BiLSTM-CRF, and validate them on the self-built dataset. In the comparison experiments, IDCNN-BiLSTM-CRF reaches an F1 score of 0.8116, exceeding the 0.8009 of BiLSTM-CRF, and performs better overall than the baseline. IndRNN-CRF achieves a precision of 0.8427, but its recall is lower than that of the baseline BiLSTM-CRF.

Key words: medical question answering; deep learning; independently recurrent neural network (IndRNN); dilated convolution; bidirectional recurrent neural network
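
To make the {B, I, O} scheme and the reported precision, recall, and F1 figures concrete, the following is a minimal Python sketch. It is not the authors' code. The tag names (`B-DIS`, `I-SYM`, `B-TRE`), the helper names, and the use of exact-match, entity-level scoring are all assumptions made for illustration; the abstract does not specify how the scores were computed.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive); hypothetical representation
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from BIO tags such as B-DIS, I-DIS, O."""
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing entity
        # An I- tag continues the open entity only if the types match.
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            spans.add((etype, start, i))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def precision_recall_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level scores; a prediction counts only on exact type and boundary match."""
    g, p = bio_to_spans(gold), bio_to_spans(pred)
    tp = len(g & p)
    precision = tp / len(p) if p else 0.0
    recall = tp / len(g) if g else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical tags: DIS = disease, SYM = symptom, TRE = treatment.
gold = ["B-DIS", "I-DIS", "O", "B-SYM", "I-SYM", "O", "B-TRE"]
pred = ["B-DIS", "I-DIS", "O", "B-SYM", "O", "O", "O"]

p, r, f1 = precision_recall_f1(gold, pred)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

In this toy case the prediction recovers the disease span but truncates the symptom span and misses the treatment span, so precision is higher than recall. This is the same kind of trade-off the abstract reports for IndRNN-CRF against the baseline.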

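Cite this article:

YANG Wen-Ming, CHU Wei-Jie. Named Entity Recognition of Online Medical Question Answering Text. Computer Systems & Applications, 2019, 28(2): 8-14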

History

Received: 2018-07-31
Last revised: 2018-08-30
Published online: 2019-01-28
Publication date: 2019-02-15
