Text Matching Model Based on Incremental Pre-training and Adversarial Training

doi:10.15888/j.cnki.csa.008778

Volume 31, Issue 11, 2022, pp. 349-357


Abstract:

Text matching is one of the key techniques in natural language understanding; its task is to determine the similarity of two texts. In recent years, with the development of pre-trained language models, text matching techniques based on them have been widely used. However, these text matching models still generalize poorly to specific domains and are not robust in semantic matching. Therefore, this study proposes an incremental pre-training and adversarial training method for low-frequency words to improve the performance of the text matching model. Incremental pre-training on low-frequency words in the source domain helps the model transfer to the target domain and enhances its generalization ability. Additionally, various adversarial training methods for low-frequency words are tried to improve the model’s adaptability to word-level perturbations and thus its robustness. The experimental results on the LCQMC dataset and a text matching dataset in the real estate domain indicate that incremental pre-training, adversarial training, and the combination of the two approaches can significantly improve text matching results.
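
The abstract does not spell out how low-frequency words are selected or used during incremental pre-training. As a rough illustration only, the sketch below assumes one plausible reading: count token frequencies over a corpus, treat tokens at or below a threshold as low-frequency, and mask exactly those tokens to build masked-language-model training pairs. The function names, the `max_count` threshold, the `mask_prob` parameter, and the toy English corpus are all hypothetical and not taken from the paper.

```python
from collections import Counter
import random


def low_frequency_words(corpus, max_count=1):
    """Return the set of tokens occurring at most `max_count` times in `corpus`.

    `corpus` is a list of token lists. The threshold is a hypothetical
    parameter chosen for illustration, not a value from the paper.
    """
    counts = Counter(token for sentence in corpus for token in sentence)
    return {token for token, count in counts.items() if count <= max_count}


def mask_low_frequency(tokens, low_freq, mask_prob=1.0, mask_token="[MASK]", rng=None):
    """Replace low-frequency tokens with `mask_token` with probability `mask_prob`.

    Returns the masked sequence and the labels: the original token at each
    masked position, None everywhere else.
    """
    rng = rng or random.Random(0)
    masked, labels = [], []
    for token in tokens:
        if token in low_freq and rng.random() < mask_prob:
            masked.append(mask_token)
            labels.append(token)
        else:
            masked.append(token)
            labels.append(None)
    return masked, labels


# Toy corpus (assumed data, for illustration only).
corpus = [
    ["the", "flat", "has", "a", "mezzanine"],
    ["the", "flat", "has", "a", "balcony"],
    ["the", "house", "has", "a", "balcony"],
]
rare = low_frequency_words(corpus, max_count=1)
masked, labels = mask_low_frequency(corpus[0], rare)
```

In a real pipeline the masked sequences and labels would feed a masked-language-model objective on an existing pre-trained checkpoint; that step is omitted here because the abstract gives no details about it.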

Citation:

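司志博文, 李少博, 单丽莉, 孙承杰, 刘秉权. Text Matching Model Based on Incremental Pre-training and Adversarial Training. 计算机系统应用 (Computer Systems & Applications), 2022, 31(11): 349-357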

History

Received: January 29, 2022
Revised: February 24, 2022
Online: July 29, 2022
