Adaptation Fine-tuning Based on Singular Value Decomposition

doi:10.15888/j.cnki.csa.009731


2025, Vol. 34, No. 1, pp. 276-284

CSTR: 32024.14.csa.009731

Authors: LIN Zhi-Peng, GUO Zheng-Rong, ZHANG Wei-Zhi, GUO Gong-De

Affiliation: College of Computer and Cyber Security, Fujian Normal University, Fuzhou 350109, China

Fund project: National Natural Science Foundation of China (61976053, 62171131)


Abstract:

The rise of large language models has profoundly influenced natural language processing. As computational resources grow and model sizes expand, the application potential of large language models in natural language processing is increasingly evident. However, the widely used low-rank adaptation (LoRA) fine-tuning method faces challenges in fine-tuning efficiency and storage cost as model size increases. To address this problem, this study proposes an adaptation fine-tuning method based on singular value decomposition. The method treats only the diagonal matrix and the scaling vector obtained from singular value decomposition as trainable parameters, which reduces training cost while improving performance on multiple natural language processing tasks. Experimental results show that the proposed method outperforms methods of a comparable parameter scale on the GLUE and E2E benchmarks. Compared with commonly used parameter-efficient fine-tuning methods, it shows significant advantages in reducing the number of trainable parameters and improving fine-tuning efficiency, and it achieves the highest performance gain in the experiments on trainable-parameter fine-tuning efficiency. Future work will focus on further optimizing the method to achieve more efficient fine-tuning on a wider range of tasks and on larger models.

Key words: parameter-efficient fine-tuning (PEFT); large generative model; deep learning; domain adaptation; limited computational resources
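
The abstract states only that a diagonal matrix and a scaling vector are trainable; it does not give the exact formulation. The following NumPy sketch is therefore an illustration under assumptions, not the authors' implementation: it assumes the pretrained weight is factored by SVD, the top singular vectors are frozen, a trainable diagonal forms the update between them, and a trainable per-output vector rescales the result. The names `svd_adapter_init`, `adapted_forward`, and `trainable_counts` are hypothetical.

```python
import numpy as np

def svd_adapter_init(W, rank):
    """Factor a frozen weight W (d_out x d_in) and create the small trainable parts.

    Assumption: the update has the form U_r @ diag(d) @ Vt_r, with U_r and Vt_r
    frozen. This layout is illustrative and is not taken from the paper.
    """
    U, _, Vt = np.linalg.svd(W, full_matrices=False)
    U_r, Vt_r = U[:, :rank], Vt[:rank, :]   # frozen top singular vectors
    diag = np.zeros(rank)                   # trainable diagonal, zero so the start is a no-op
    scale = np.ones(W.shape[0])             # trainable per-output scaling vector
    return U_r, Vt_r, diag, scale

def adapted_forward(x, W, U_r, Vt_r, diag, scale):
    """Compute scale * ((W + U_r diag(d) Vt_r) x) for a batch x of shape (n, d_in)."""
    base = x @ W.T                          # frozen pretrained path
    delta = ((x @ Vt_r.T) * diag) @ U_r.T   # low-rank path, never forms the full update matrix
    return (base + delta) * scale

def trainable_counts(d_out, d_in, rank):
    """Trainable parameters per layer: LoRA (two rank-r factors) vs. this sketch."""
    lora = rank * (d_in + d_out)
    svd_style = rank + d_out
    return lora, svd_style
```

Under these assumptions, a 768 x 768 layer at rank 8 would train 12288 parameters with LoRA and 776 with the sketch above, which illustrates the kind of reduction the abstract refers to; the paper's actual counts may differ.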

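Cite this article

LIN Zhi-Peng, GUO Zheng-Rong, ZHANG Wei-Zhi, GUO Gong-De. Adaptation fine-tuning based on singular value decomposition. Computer Systems & Applications (计算机系统应用), 2025, 34(1): 276-284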


History

Received: 2024-06-06
Last revised: 2024-07-10
Published online: 2024-11-25
