Key Sentence Extraction Based on Encoder-decoder Architecture Large Language Model

doi:10.15888/j.cnki.csa.009764



Abstract:

Key sentence extraction uses artificial intelligence to automatically identify the key sentences in a long text. It can serve as a preprocessing step for information retrieval and is important for downstream tasks such as text classification and extractive summarization. Traditional unsupervised key sentence extraction methods are mostly based on statistics or graph models, and they suffer from low accuracy and the need to build a large-scale corpus in advance. This study proposes T5KSEChinese, a method for unsupervised key sentence extraction from Chinese text. The method uses an encoder-decoder architecture and places prompt words in both the input and the output, which sidesteps the length mismatch between the target sentence and the original text and yields more accurate results. A method for constructing positive samples for contrastive learning is also proposed; it is used to train the encoder part of the model in a semi-supervised manner, which improves performance on downstream tasks. With lightweight models, the method outperforms a large language model with tens of times as many parameters on the unsupervised downstream task. Experimental results support the accuracy and reliability of the proposed method.
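The abstract does not state which contrastive objective is used to train the encoder. As a rough illustration only, the sketch below assumes the widely used InfoNCE loss with in-batch negatives and cosine similarity. The function names, the temperature value, and the toy two-dimensional embeddings are all hypothetical and are not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce_loss(anchors, positives, temperature=0.05):
    """Mean InfoNCE loss over a batch (assumed objective, not the paper's).

    anchors[i] and positives[i] form a positive pair; every other
    positives[j] (j != i) in the batch acts as a negative for anchors[i].
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)

# Toy sentence embeddings (hypothetical values, not model output).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]      # each positive is close to its anchor
misaligned = [[0.1, 0.9], [0.9, 0.1]]   # positives swapped between anchors

loss_aligned = info_nce_loss(anchors, aligned)
loss_misaligned = info_nce_loss(anchors, misaligned)
print(loss_aligned, loss_misaligned)
```

In this toy setup the loss is small when each positive sample sits close to its own anchor and large when the pairs are swapped, which is the property a positive sample construction method would need to exploit.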


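Citation: Peng Junfeng (彭俊峰), Yu Kai (俞凯), Li Guojing (李国靖). Key Sentence Extraction Based on Encoder-decoder Architecture Large Language Model. Computer Systems & Applications (计算机系统应用): 1-10.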


History

Received: July 16, 2024
Revised: August 13, 2024
Online: December 19, 2024
