Received: February 14, 2023; Revised: March 14, 2023
Abstract: Source code migration converts source code from one programming language to another, reducing the burden on developers who migrate software projects. Existing studies typically use neural machine translation (NMT) models to convert source code into target code, but they ignore the structural features of code, which leads to poor migration performance. To address this, this paper proposes CSMAT (code-statement masked attention Transformer), a source code migration model based on a code-statement masked attention mechanism. The model uses the Transformer's masked attention mechanism in two ways: during encoding, it guides the model to understand the syntax and semantics of each source code statement and the contextual features between statements; during decoding, it guides the model to focus on and align with the source code statements. Both are intended to improve migration performance. An empirical study is conducted on CodeTrans, a dataset drawn from real projects, and model performance is evaluated with four metrics. The experimental results validate the effectiveness of CSMAT and show that the code-statement masked attention mechanism is also applicable to pre-trained models.
Funding: National Natural Science Foundation of China (61902015, 61872026)
Citation:
XU Ming-Rui, LI Zheng, LIU Yong, WU Yong-Hao. Source Code Migration Model Based on Code-statement Masked Attention Mechanism. Computer Systems Applications, 2023, 32(9): 77-88