计算机系统应用 (Computer Systems & Applications), 2020, Vol. 29, Issue 6: 39-46

Network Packet Intrusion Detection Method Based on CNN and SVM
XU Xue-Li, DUAN Juan, XIAO Chuang-Bai, ZHANG Bin
Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
Abstract: To further improve the accuracy of network anomaly detection, and building on an analysis of existing intrusion detection methods, this study proposes a network packet intrusion detection method based on Convolutional Neural Networks (CNN) and Support Vector Machine (SVM). The method first preprocesses the data into a two-dimensional matrix. To prevent the model from over-fitting, a permutation function is used to randomly shuffle the data, and a CNN then learns effective features from the preprocessed data. Finally, an SVM classifier classifies the resulting feature vectors. For the dataset, we use the Kyoto University honeypot system dataset, an authoritative dataset commonly used in network intrusion detection. The proposed method is compared with existing models that have high detection rates, such as GRU-Softmax and GRU-SVM. It improves the highest accuracy by 19.39% and 12.83%, respectively, which further improves the accuracy of network anomaly detection. The method also greatly improves training and testing speed.
Key words: intrusion detection; Convolutional Neural Networks (CNN); Support Vector Machine (SVM); text classification; deep learning

1 Introduction

2 Related Work

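3 Packet Intrusion Detection Method Based on CNN and SVM

3.1 Data Preprocessing

3.2 CNN-SVM Model Architecture

3.3 CNN-SVM Algorithm Principle

Figure 1 Structure of the CNN-SVM model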

$\mathrm{ReLU}(x) = \begin{cases} 0, & \text{if } x \le 0 \\ x, & \text{if } x > 0 \end{cases}$ (1)
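
As a small illustration of Eq. (1), the following sketch applies ReLU element-wise with NumPy. The function name `relu` and the sample input are assumptions made for this example and are not taken from the paper's implementation.

```python
import numpy as np

def relu(x):
    """Element-wise ReLU as in Eq. (1): 0 where x <= 0, x where x > 0."""
    return np.maximum(0.0, x)

# Hypothetical input values, chosen only to show both branches of Eq. (1).
sample = np.array([-2.0, 0.0, 3.5])
activated = relu(sample)
```

Negative and zero inputs map to 0, while positive inputs pass through unchanged.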

$\min \dfrac{1}{n}\|w\|_2^2 + C\displaystyle\sum\limits_{i=1}^{n} \max\left(0,\ 1 - y'_i\,(w^{\mathrm T} x_i + b)\right)^2$ (2)
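
To make the terms of Eq. (2) concrete, the sketch below evaluates the objective for a binary problem with NumPy. It assumes labels $y'_i \in \{-1, +1\}$, a single weight vector `w`, and a scalar bias `b`. The function name `l2_svm_loss` and the toy data are hypothetical, and the paper's actual model may differ, for example by using one weight vector per class.

```python
import numpy as np

def l2_svm_loss(w, b, X, y, C=1.0):
    """Evaluate the objective of Eq. (2).

    X: (n, d) feature matrix, y: (n,) labels in {-1, +1} (assumed),
    w: (d,) weight vector, b: scalar bias, C: penalty parameter.
    """
    n = X.shape[0]
    margins = y * (X @ w + b)                  # y'_i (w^T x_i + b)
    hinge = np.maximum(0.0, 1.0 - margins)     # max(0, 1 - margin)
    return (1.0 / n) * np.dot(w, w) + C * np.sum(hinge ** 2)

# Toy data (hypothetical): only the second sample violates the margin.
X = np.array([[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0]])
y = np.array([1.0, 1.0, -1.0])
loss = l2_svm_loss(np.array([1.0, 0.0]), 0.0, X, y, C=1.0)
```

Squaring the hinge term makes the loss differentiable at the margin, which suits gradient-based training of the network that produces the features.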

4 Experimental Results and Analysis

4.1 Evaluation Metrics

$\mathrm{TPR} = \dfrac{TP}{TP + FN}$ (3)
$\mathrm{TNR} = \dfrac{TN}{TN + FP}$ (4)
$\mathrm{FPR} = \dfrac{FP}{FP + TN}$ (5)
$\mathrm{FNR} = \dfrac{FN}{FN + TP}$ (6)

$\mathit{accr} = \dfrac{TP + TN}{TP + FN + FP + TN}$ (7)
$\mathit{recall} = \dfrac{TP}{TP + FN}$ (8)
$\mathit{precision} = \dfrac{TP}{TP + FP}$ (9)
$\mathit{error} = \dfrac{FP + FN}{TP + FN + FP + TN}$ (10)
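
The metrics in Eqs. (3) to (10) can all be derived from the four confusion-matrix counts. The sketch below shows one way to compute them. The function name `confusion_metrics` and the example counts are hypothetical, not results from the paper, and the code assumes every denominator is non-zero.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Compute Eqs. (3)-(10) from confusion-matrix counts.

    Assumes each denominator is non-zero (no empty class or prediction set).
    """
    total = tp + fn + fp + tn
    return {
        "TPR": tp / (tp + fn),          # Eq. (3)
        "TNR": tn / (tn + fp),          # Eq. (4)
        "FPR": fp / (fp + tn),          # Eq. (5)
        "FNR": fn / (fn + tp),          # Eq. (6)
        "accr": (tp + tn) / total,      # Eq. (7)
        "recall": tp / (tp + fn),       # Eq. (8), identical to TPR
        "precision": tp / (tp + fp),    # Eq. (9)
        "error": (fp + fn) / total,     # Eq. (10), equals 1 - accr
    }

# Hypothetical counts, for illustration only.
metrics = confusion_metrics(tp=40, fn=10, fp=20, tn=30)
```

Note that recall is the same quantity as TPR, and that TPR + FNR, TNR + FPR, and accr + error each sum to 1.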

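4.2 Experimental Results

The training times of the three models are shown in Table 10; the proposed CNN-SVM model has shorter training and testing times than the other two models. The accuracies of the three models on the training and test datasets are shown in Figures 2 and 3. The figures show that the proposed model achieves higher accuracy than the other two models on both the training data and the test data.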

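Figure 2 Accuracy comparison of the three models on the training dataset

Figure 3 Accuracy comparison of the three models on the test dataset

The loss curves of the three models during training are shown in Figures 4, 5, and 6.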

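Figure 4 Training loss curve of the CNN-SVM model

Figure 5 Training loss curve of the GRU-SVM model

Figure 6 Training loss curve of the GRU-Softmax model

5 Conclusion and Future Work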

