Binary Basic Block Similarity Detection Based on Word Order Embeddings

    Abstract:

    Neural machine translation can automatically carry semantic information across multiple languages, and it has therefore been applied successfully to binary code similarity detection across instruction set architectures (ISAs). When sequences of assembly instructions are treated as sequences of textual tokens, the order of the instructions matters. For basic block-level similarity detection, existing neural networks model instruction positions with position embeddings, but these embeddings fail to reflect the ordered relationships (e.g., adjacency or precedence) between instructions. To address this problem, this study uses a continuous function of instruction position to model both the global absolute positions and the ordered relationships of assembly instructions, thereby generalizing word order embeddings. First, the source ISA encoder is built with a Transformer. Second, the target ISA encoder is trained with triplet loss while the source ISA encoder is fine-tuned. Finally, the Euclidean distances between embedding vectors are mapped to [0,1] and used as the similarity metric between basic blocks. Experimental results on the public dataset MISA show that the method achieves a P@1 of 69.5%, which is 4.6% higher than the baseline method MIRROR.
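
The three ingredients named in the abstract (a continuous function of position, triplet loss, and a distance-to-similarity mapping) can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: `word_order_embedding` uses a generic cosine/sine form with hypothetical frequency and phase parameters, `triplet_loss` is the standard margin formulation, and `similarity` uses 1 / (1 + d) as one possible mapping of a Euclidean distance into [0,1]. None of these should be read as the authors' implementation.

```python
import math


def word_order_embedding(pos, freqs, phases):
    """Illustrative continuous function of position (assumed form).

    Each (frequency, phase) pair contributes a (cos, sin) pair, so the
    embedding is defined for any real-valued position, not only for the
    integer indices seen during training.
    """
    out = []
    for w, p in zip(freqs, phases):
        out.append(math.cos(w * pos + p))
        out.append(math.sin(w * pos + p))
    return out


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss on Euclidean distances.

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def similarity(u, v):
    """Map a Euclidean distance to a score in (0, 1] (assumed mapping).

    Identical vectors score 1.0; the score decays toward 0 as the
    distance grows.
    """
    return 1.0 / (1.0 + euclidean(u, v))


if __name__ == "__main__":
    # Hypothetical parameters; in a trained model these would be learned.
    freqs, phases = [1.0, 0.1], [0.0, 0.5]

    # With this form, the distance between two positions depends only on
    # their offset, which is one way adjacency can be reflected.
    d_early = euclidean(word_order_embedding(2, freqs, phases),
                        word_order_embedding(3, freqs, phases))
    d_late = euclidean(word_order_embedding(7, freqs, phases),
                       word_order_embedding(8, freqs, phases))
    print(abs(d_early - d_late) < 1e-9)

    print(similarity([0.0, 0.0], [3.0, 4.0]))
```

In this sketch, two basic blocks from different ISAs would each be encoded to a vector, and `similarity` would be applied to the pair; how the encoders produce those vectors is outside what the abstract specifies.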

Get Citation

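Li Tao, Wang Jinshuang, Zhou Zhenji. Binary Basic Block Similarity Detection Based on Word Order Embeddings. Computer Systems & Applications, 2023, 32(12): 253-260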

History
  • Received: May 5, 2023
  • Revised: June 6, 2023
  • Online: September 21, 2023