Electric Power Named Entity Recognition Based on Topic Prompt

Fund project: Science and Technology Project of State Grid Jiangsu Electric Power Co., Ltd. (J2021151)

Abstract:

Traditional named entity recognition methods perform well when sufficient supervised data is available. For electric power texts, however, annotation depends on domain expertise, so sufficient supervised data is often hard to obtain, which results in a few-shot scenario. In addition, because the electric power industry demands precision, the domain has more entity types than general open-domain tasks, which makes recognition harder. To address these challenges, this study proposes a named entity recognition method based on topic prompts. The method treats each entity type as a topic and uses a topic model to obtain type-related topic words from the training corpus. It then enumerates entity spans, entity types, and topic words to fill a template and construct prompt sentences. Finally, a generative pre-trained language model ranks the prompt sentences to identify entities and their type labels. Experimental results show that, on a Chinese electric power named entity recognition dataset, the topic-prompt-based method outperforms several traditional named entity recognition methods.
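
The enumerate-fill-rank pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The template wording, the function names (`enumerate_spans`, `build_prompts`, `recognize`), the "not an entity" prompt, and the English toy data are all assumptions made for illustration. The topic words are assumed to have been produced by a topic model beforehand, and `toy_score` is only a stand-in for the score that a generative pre-trained language model would assign to each prompt sentence.

```python
from typing import Callable, Dict, Iterator, List, Tuple

# Hypothetical templates: the abstract does not give the actual wording.
TYPED_TEMPLATE = "{span} is an entity of type {etype}, with topic words such as {topics}."
NEGATIVE_TEMPLATE = "{span} is not an entity."


def enumerate_spans(tokens: List[str], max_len: int) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, text) for every contiguous span of at most max_len tokens."""
    for start in range(len(tokens)):
        for end in range(start + 1, min(start + max_len, len(tokens)) + 1):
            yield start, end, " ".join(tokens[start:end])


def build_prompts(span: str, topic_words: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Fill the templates for one span: one prompt per entity type, plus a negative prompt."""
    prompts = [
        (etype, TYPED_TEMPLATE.format(span=span, etype=etype, topics=", ".join(words)))
        for etype, words in topic_words.items()
    ]
    prompts.append(("", NEGATIVE_TEMPLATE.format(span=span)))  # "" marks "no entity"
    return prompts


def recognize(
    tokens: List[str],
    topic_words: Dict[str, List[str]],
    score: Callable[[str], float],
    max_len: int = 2,
) -> List[Tuple[int, int, str, str]]:
    """Rank the prompts of each span and keep spans whose best prompt names a type."""
    entities = []
    for start, end, span in enumerate_spans(tokens, max_len):
        best_type, _ = max(build_prompts(span, topic_words), key=lambda p: score(p[1]))
        if best_type:
            entities.append((start, end, span, best_type))
    return entities


def toy_score(prompt: str) -> float:
    """Toy stand-in for a language-model score; NOT a real model."""
    if prompt.endswith(" is not an entity."):
        return 0.5
    span = prompt.split(" is an entity of type ")[0]
    topics = prompt.rsplit(" such as ", 1)[1].rstrip(".").split(", ")
    return 1.0 if span in topics else 0.0


# Toy data (assumed): topic words would come from a topic model in practice.
tokens = ["the", "main", "transformer", "tripped"]
topic_words = {"equipment": ["transformer", "breaker"], "fault": ["tripped", "overload"]}
entities = recognize(tokens, topic_words, toy_score)
print(entities)  # [(2, 3, 'transformer', 'equipment'), (3, 4, 'tripped', 'fault')]
```

In a real system, `score` would presumably return the likelihood of the prompt sentence under the generative model, and overlapping spans would need a resolution rule that the abstract does not specify.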

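Cite this article

Kang Yumeng, He Wei, Zhai Qianhui, Cheng Yameng, Yu Yang. Electric Power Named Entity Recognition Based on Topic Prompt. Computer Systems & Applications, 2022, 31(9): 272-279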

History
  • Received: 2021-12-16
  • Last revised: 2022-01-13
  • Published online: 2022-06-28