Chinese Entity Recognition Based on the BERT-BiLSTM-CRF Model

doi:10.15888/j.cnki.csa.007525


2020, Vol. 29, No. 7, pp. 48-55

Funding: Natural Science Foundation of Anhui Province (1908085MF202); University Fund of the National University of Defense Technology (ZK18-03-14)




Abstract:

Named entity recognition (NER) is a key technology in natural language processing, and deep-learning methods have been widely applied to Chinese entity recognition. Most deep-learning models focus their preprocessing on extracting word- and character-level features but ignore the semantic information in a word's context, so they cannot represent polysemy, and recognition performance leaves room for improvement. To address this problem, this paper proposes a method based on the BERT-BiLSTM-CRF model. First, a BERT model generates context-dependent word vectors; these vectors are then fed into a BiLSTM-CRF model for further training. Experiments show that the model achieves F1 scores of 94.65% on the MSRA corpus and 95.67% on the People's Daily corpus.
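The abstract does not show how the final CRF stage turns per-character scores into a tag sequence, so the following is only a minimal sketch of standard Viterbi decoding, not the authors' implementation. It assumes the BiLSTM has already produced one emission score per tag for each character. The tag set, the function name `viterbi_decode`, and all score values are made up for illustration.

```python
from typing import Dict, List

# Hypothetical tag set (BIO scheme with a single entity type).
TAGS = ["O", "B-PER", "I-PER"]


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence.

    emissions[t][tag]  : score of `tag` at position t (assumed BiLSTM output).
    transitions[a][b]  : score of moving from tag a to tag b (CRF parameters).
    """
    if not emissions:
        return []
    # best[t][tag] = best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    backptr = []  # backptr[t-1][tag] = best previous tag for `tag` at position t
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        backptr.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(backptr):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Made-up scores: all transitions are neutral except O -> I-PER,
# which is heavily penalised because an entity cannot start with an I- tag.
transitions = {a: {b: 0.0 for b in TAGS} for a in TAGS}
transitions["O"]["I-PER"] = -10.0

emissions = [
    {"O": 2.0, "B-PER": 0.5, "I-PER": 0.1},
    {"O": 0.3, "B-PER": 1.0, "I-PER": 1.2},  # a per-position argmax would pick I-PER here
    {"O": 0.2, "B-PER": 0.1, "I-PER": 1.5},
    {"O": 2.0, "B-PER": 0.1, "I-PER": 0.3},
]

print(viterbi_decode(emissions, transitions, TAGS))
# ['O', 'B-PER', 'I-PER', 'O']
```

Picking the best tag independently at each position would give `O, I-PER, I-PER, O`, which is not a valid BIO sequence. Scoring whole sequences with transition terms is what lets a CRF layer rule this out.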


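Cite this article

Xie Teng (谢腾), Yang Jun'an (杨俊安), Liu Hui (刘辉). Chinese Entity Recognition Based on BERT-BiLSTM-CRF Model. Computer Systems & Applications (计算机系统应用), 2020, 29(7): 48-55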


History

Received: 2019-12-22
Last revised: 2020-01-19
Published online: 2020-07-04
