Computer Systems Applications, 2024, 33(7): 103-111

Text Matching Based on SimCSE Framework Fused with Pre-trained Model Internal Hierarchical Features

SHENG Cheng-Cheng^1, CHEN Jin-Dong^2,3, ZHANG Jian^2,3

(1. Computer School, Beijing Information Science and Technology University, Beijing 100192, China; 2. School of Economics and Management, Beijing Information Science and Technology University, Beijing 100192, China; 3. Beijing International Science and Technology Cooperation Base for Intelligent Decision and Big Data Application, Beijing 100192, China)

Received: January 10, 2024; Revised: February 07, 2024

Abstract: The simple contrastive learning of sentence embeddings (SimCSE) framework uses only the classification token [CLS] as the text vector and ignores the hierarchical information inside the base model, so the semantic features output by the base model are not fully extracted. Based on the SimCSE framework, this study proposes a method that fuses the hierarchical features of pre-trained models, SimCSE with hierarchical feature fusion (SimCSE-HFF). SimCSE-HFF is built on a dual-path parallel network in which a short path and a long path strengthen feature learning. The short path uses a convolutional neural network to learn local text features and reduce dimensionality, while the long path uses a bidirectional gated recurrent neural network to learn deep semantic information. The long path also uses an autoencoder to fuse features from other layers inside the base model, which addresses the insufficient extraction of output features. On the Chinese and English datasets of the Semantic Textual Similarity Benchmark (STS-B), SimCSE-HFF outperforms traditional methods on the Spearman and Pearson correlation metrics for semantic similarity, and it improves results across different pre-trained models. It also outperforms the SimCSE framework on the downstream task of retrieval-based question answering, which indicates better generality.

Keywords: text matching; SimCSE; feature fusion; autoencoder; parallel network
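
The abstract reports results as Spearman and Pearson correlations on STS-B. As a minimal sketch of how such scores are typically computed, the example below compares the cosine similarity of sentence-pair embeddings with gold similarity labels. The embeddings and gold scores are invented toy values for illustration only, the helper names (`cosine`, `ranks`, `pearson`, `spearman`) are hypothetical, and nothing here reproduces the authors' code or results.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(x, y):
    """Pearson correlation: covariance divided by the product of std devs."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


# Toy sentence-pair embeddings (assumed values, not model outputs).
pairs = [
    ([1.0, 0.0, 0.2], [0.9, 0.1, 0.3]),
    ([0.5, 0.5, 0.0], [0.1, 0.9, 0.2]),
    ([0.0, 1.0, 0.0], [1.0, 0.0, 0.1]),
    ([0.3, 0.3, 0.9], [0.2, 0.4, 0.8]),
]
gold = [4.8, 2.5, 0.4, 4.5]  # assumed gold scores on a 0-5 scale

predicted = [cosine(u, v) for u, v in pairs]
print("Spearman:", spearman(predicted, gold))
print("Pearson:", pearson(predicted, gold))
```

Spearman depends only on the ordering of the predicted similarities, whereas Pearson also depends on how linearly they track the gold scores, so the two values can differ for the same predictions.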

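Funding: National Key Research and Development Program of China (2019YFB1405303); Beijing Municipal Universities Outstanding Young Talent Cultivation Program (BPHR202203233); General Program of the National Natural Science Foundation of China (72174018)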

Citation:
SHENG Cheng-Cheng, CHEN Jin-Dong, ZHANG Jian. Text Matching Based on SimCSE Framework Fused with Pre-trained Model Internal Hierarchical Features. Computer Systems Applications, 2024, 33(7): 103-111

Author Name	Affiliation	E-mail
SHENG Cheng-Cheng	Computer School, Beijing Information Science and Technology University, Beijing 100192, China
CHEN Jin-Dong	School of Economics and Management, Beijing Information Science and Technology University, Beijing 100192, China; Beijing International Science and Technology Cooperation Base for Intelligent Decision and Big Data Application, Beijing 100192, China	j.chen@bistu.edu.cn
ZHANG Jian	School of Economics and Management, Beijing Information Science and Technology University, Beijing 100192, China; Beijing International Science and Technology Cooperation Base for Intelligent Decision and Big Data Application, Beijing 100192, China
