Key Sentence Extraction Based on Encoder-Decoder Architecture Large Language Model

    Abstract:

    Key sentence extraction uses artificial intelligence to automatically identify the core sentences of a long text. It can serve as a preprocessing step for information retrieval and is important for downstream tasks such as text classification and extractive summarization. Traditional unsupervised key sentence extraction techniques are mostly based on statistical or graph-based methods, which suffer from low accuracy and require a large-scale corpus to be built in advance. This study proposes T5KSEChinese, an unsupervised key sentence extraction method for Chinese text. The method uses an encoder-decoder architecture and supplies prompts on both the input and output sides, which sidesteps the length mismatch between the target sentence and the source text and yields more accurate results. The study also proposes a way of constructing positive samples for contrastive learning and uses it to train the encoder of the model in a semi-supervised manner, which improves performance on the downstream task. Using a lightweight model, the method outperforms large language models with tens of times as many parameters on the unsupervised downstream task, and the experimental results demonstrate its accuracy and reliability.
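
The abstract does not say which contrastive objective is used to train the encoder, nor how the positive samples are built. As a rough illustration only, the sketch below assumes a generic InfoNCE loss with in-batch negatives, a common choice for sentence-embedding training. The function names, the temperature value of 0.05, and the toy two-dimensional "embeddings" are all assumptions made for this example and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchors, positives, temperature=0.05):
    """Mean InfoNCE loss with in-batch negatives (generic, not the paper's exact loss).

    anchors[i] and positives[i] form a positive pair; positives[j] with j != i
    act as negatives for anchors[i].
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Cross-entropy with the matching positive as the correct class.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy 2-D "sentence embeddings" (made-up numbers, for illustration only).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]  # each positive lies close to its own anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]  # each positive lies close to the other anchor

loss_aligned = info_nce_loss(anchors, aligned)
loss_swapped = info_nce_loss(anchors, swapped)
print(loss_aligned < loss_swapped)
```

In this toy setup the loss is small when each anchor is most similar to its own positive and large when the pairs are mismatched, which is the signal a contrastive objective uses to pull positive pairs together in embedding space. How good that signal is depends on how the positive pairs are constructed, which is the part the study contributes and which this sketch does not attempt to reproduce.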

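Cite this article

Peng Junfeng (彭俊峰), Yu Kai (俞凯), Li Guojing (李国靖). Key Sentence Extraction Based on Encoder-Decoder Architecture Large Language Model. Computer Systems & Applications (计算机系统应用),,():1-10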

History
  • Received: 2024-07-16
  • Last revised: 2024-08-13
  • Published online: 2024-12-19