Source Code Summarization Based on Neural Network and Information Retrieval

    Abstract:

    Source code summarization aims to automatically generate precise natural-language summaries of source code, helping developers better understand and maintain it. Traditional methods generate summaries using information retrieval techniques, which either select relevant words from the source code itself or adapt the summaries of similar code snippets. More recent research adopts machine translation methods and generates summaries of code snippets with encoder-decoder neural network models. However, existing summary generation methods have two main problems. First, neural network-based methods handle high-frequency words in code snippets well but tend to handle low-frequency words poorly. Second, programming languages are highly structured, so source code cannot simply be treated as serialized text; doing so loses contextual structural information. To address the low-frequency word problem, this study proposes a retrieval-based neural machine translation approach, in which similar code snippets retrieved from the training set are used to enhance the neural network model. In addition, to learn the structured semantic information of code snippets, this study proposes a structure-guided Transformer, which encodes the structural information of code through an attention mechanism. The experimental results show that the model has significant advantages over current state-of-the-art deep learning models for code summarization in processing low-frequency words and structured semantics.
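
    The retrieval step described above can be illustrated with a minimal sketch. The abstract does not say how similarity between code snippets is measured, so token-level Jaccard similarity is used here purely as an assumption, and the names `tokenize`, `jaccard`, `retrieve_similar`, and `TRAINING_SET` are hypothetical. How the retrieved snippet and its summary are fed into the neural model is not shown.

```python
import re
from typing import List, Set, Tuple

# Hypothetical toy "training set" of (code, summary) pairs, for illustration only.
TRAINING_SET: List[Tuple[str, str]] = [
    ("def add(a, b): return a + b", "Add two numbers."),
    ("def read_file(path): return open(path).read()", "Read a file into a string."),
    ("def subtract(a, b): return a - b", "Subtract b from a."),
]


def tokenize(code: str) -> Set[str]:
    """Return the set of lowercase identifier and number tokens in a snippet."""
    return {t.lower() for t in re.findall(r"[A-Za-z_]\w*|\d+", code)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|; defined as 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve_similar(
    query: str, training_set: List[Tuple[str, str]], k: int = 1
) -> List[Tuple[float, str, str]]:
    """Return the k training pairs most similar to the query as (score, code, summary)."""
    query_tokens = tokenize(query)
    scored = [
        (jaccard(query_tokens, tokenize(code)), code, summary)
        for code, summary in training_set
    ]
    # Sort by similarity score only, highest first; ties keep training-set order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    for score, code, summary in retrieve_similar("def add(x, y): return x + y", TRAINING_SET):
        print(f"{score:.3f}  {summary}  <-  {code}")
```

    In a retrieval-enhanced setup of the kind the abstract describes, the summary of the retrieved snippet could supply rare words that a purely neural decoder would be unlikely to generate on its own; the similarity measure is the part most likely to differ from the authors' actual method.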

Citation

Shen Xin, Zhou Yu. Source Code Summarization Based on Neural Network and Information Retrieval. Computer Systems & Applications, 2023, 32(7): 1-10

History
  • Received: November 5, 2022
  • Revised: December 10, 2022
  • Online: May 12, 2023
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-3