Computer Systems & Applications, 2021, 30(8): 256-265
Automatic Function Naming Based on Graph Convolutional Network
(College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China)
Received: November 23, 2020    Revised: December 22, 2020
Abstract: Automatic method naming, an important task in software engineering, aims to generate a function name for a given piece of source code, improving code readability and accelerating software development. Existing machine-learning approaches mainly encode the source code with sequence models and then generate the function name, but they struggle with long-range dependencies and with encoding code structure. To better extract structural and semantic information from programs, we propose TrGCN (a Transformer and GCN based automatic method naming), a neural model built on the Transformer and the Graph Convolutional Network (GCN). TrGCN uses the Transformer's self-attention mechanism to alleviate the long-range dependency problem and a Character-word attention mechanism to extract the semantic information of code. It also introduces a GCN-based AST Encoder that enriches the feature vectors of AST nodes and thereby models the structural information of the source code. In an empirical study on three Java datasets of different sizes, TrGCN generates function names more accurately than the widely used models code2seq and Sequence-GNNs, improving the F1-score by 5.2% and 2.1% on average, respectively.
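The abstract does not spell out the propagation rule used inside the AST Encoder. As a rough illustration only, the sketch below assumes the standard graph convolution step, ReLU(D^-1/2 (A + I) D^-1/2 H W), applied to a tiny hand-made AST. The function name `gcn_layer`, the three-node tree, the one-hot node features, and the identity weight matrix are all hypothetical and are not taken from the paper.

```python
import math

def gcn_layer(adj, feats, weight):
    """One assumed GCN step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalization by node degree.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    in_dim, out_dim = len(weight), len(weight[0])
    # Aggregate neighbor features: norm @ feats.
    agg = [[sum(norm[i][k] * feats[k][d] for k in range(n))
            for d in range(in_dim)] for i in range(n)]
    # Linear transform followed by ReLU: agg @ weight.
    return [[max(0.0, sum(agg[i][d] * weight[d][o] for d in range(in_dim)))
             for o in range(out_dim)] for i in range(n)]

# Hypothetical AST: node 0 (MethodDeclaration) has children
# node 1 (Name) and node 2 (ReturnStatement); edges are undirected.
adj = [[0.0, 1.0, 1.0],
       [1.0, 0.0, 0.0],
       [1.0, 0.0, 0.0]]
# One-hot node-type features and an identity weight matrix (placeholders).
feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weight = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

out = gcn_layer(adj, feats, weight)
for row in out:
    print([round(v, 3) for v in row])
```

After one step, each node's vector mixes its own type with those of its direct AST neighbors, while the two sibling leaves, which share no edge, do not yet see each other. Stacking more layers would spread information further across the tree.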
Funding: National Natural Science Foundation of China (61902015)
Citation:
WANG Kun, LI Zheng, LIU Yong. Automatic Function Naming Based on Graph Convolutional Network. Computer Systems & Applications, 2021, 30(8): 256-265