Computer Systems Applications, 2023, 32(2): 364-370
Entity Relation Extraction Simulation of Entity Information Based on Self-attention Mechanism
(School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China)
Received: June 30, 2022    Revised: July 29, 2022
Abstract: Extracting entity relations from unstructured text is a fundamental and important task in information extraction, and it faces challenges such as entity overlap and error accumulation across model stages. This study takes a relation-oriented approach and proposes an improved joint extraction method for entities and relations. The method divides the task into two subtasks: relation extraction and entity extraction. For relation extraction, a self-attention mechanism weighs the importance of each word relative to the others in order to simulate entity information, and average pooling represents the sentence as a whole. For entity extraction, a conditional random field uses the relation information to identify the entity pairs under each relation. Because an entity pair must exist wherever a relation exists, the model handles overlapping entity pairs. In addition, the entity extraction module is trained on the relations already annotated in the dataset rather than on the output of the relation extraction module, which avoids error accumulation during training. Finally, the effectiveness of the model is verified on the public WebNLG and NYT datasets.
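The relation extraction step described in the abstract (self-attention over the words, then average pooling into a sentence vector) can be illustrated with a minimal sketch. This is not the authors' implementation: the single-head scaled dot-product form, the random weights, the independent sigmoid score per relation, the 0.5 threshold, and all function and variable names are assumptions made for illustration. The conditional random field used for entity extraction is not shown.

```python
import math
import random


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (assumed form)."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(q[0]))
    scores = [[s / scale for s in row] for row in matmul(q, transpose(k))]
    weights = [softmax(row) for row in scores]  # word-to-word importance
    return matmul(weights, v), weights


def mean_pool(h):
    """Average the token vectors into one sentence vector."""
    return [sum(col) / len(h) for col in zip(*h)]


def relation_probs(sentence_vec, w_rel, b_rel):
    """One sigmoid score per relation (multi-label head; an assumption)."""
    logits = [
        sum(s * w for s, w in zip(sentence_vec, col)) + b
        for col, b in zip(transpose(w_rel), b_rel)
    ]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Toy input: 5 tokens, 8-dimensional embeddings, 3 candidate relations.
rng = random.Random(0)
n_tokens, d_model, n_relations = 5, 8, 3
x = random_matrix(n_tokens, d_model, rng)
w_q, w_k, w_v = (random_matrix(d_model, d_model, rng) for _ in range(3))
w_rel = random_matrix(d_model, n_relations, rng)
b_rel = [0.0] * n_relations

context, weights = self_attention(x, w_q, w_k, w_v)
sentence_vec = mean_pool(context)
probs = relation_probs(sentence_vec, w_rel, b_rel)
predicted = [i for i, p in enumerate(probs) if p > 0.5]
```

In this sketch, each row of `weights` sums to 1 and records how strongly one word attends to every other word. With trained rather than random weights, the relations in `predicted` would then condition the entity extraction step.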
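Funding: Key R&D Program of the Science and Technology Department of Sichuan Province (2021YFG0031, 2022YFG0375); Sichuan Science and Technology Service Industry Demonstration Project (2021GFW130); 2022 College Students' Entrepreneurship Training Program (202210621196, 202210621073k)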
Citation:
HE Song-Ze, WANG Ting, LIANG Jia-Ying, CHEN Yong-Xiong, DAI Qing-Jiang. Entity Relation Extraction Simulation of Entity Information Based on Self-attention Mechanism. Computer Systems Applications, 2023, 32(2): 364-370