Received: June 07, 2021; Revised: July 07, 2021
Abstract: In passive network device identification based on network traffic analysis, the traffic data is often high-dimensional, and some of its features contribute little to device identification or even seriously degrade classification results and performance. To address this problem, this paper proposes FSSA, a network traffic feature selection algorithm that combines the Filter and Wrapper approaches and is based on symmetric uncertainty (SU) and the approximate Markov blanket (AMB). The method first uses SU to select the features that contribute to classifying each category and to remove irrelevant features. It then applies the AMB criterion to delete redundant features from the candidate feature subset. Finally, a Wrapper step based on the C4.5 classification algorithm performs the final feature selection. Experiments show that, compared with classical feature selection methods, the features selected by this method give somewhat higher precision when identifying the operating system type of network devices, and that recall on small classes also improves.
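The two Filter stages described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's FSSA implementation. It assumes discrete (already discretized) features and the common formulation of the approximate Markov blanket, in which a kept feature F_i makes F_j redundant when SU(F_i, C) >= SU(F_j, C) and SU(F_i, F_j) >= SU(F_j, C). The function names, the `threshold` parameter, and the feature names in the usage note are hypothetical, and the final C4.5-based Wrapper step is omitted.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant, so no information is shared
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def su_amb_filter(features, labels, threshold=0.0):
    """Return the feature names kept after the SU and AMB filter stages.

    features:  dict mapping a feature name to its list of discrete values.
    labels:    list of class labels, aligned with the feature columns.
    threshold: hypothetical relevance cutoff (an assumption of this sketch).
    """
    # Stage 1: relevance. Keep features whose SU with the class exceeds the cutoff.
    relevance = {
        name: symmetric_uncertainty(column, labels)
        for name, column in features.items()
    }
    ranked = sorted(
        (name for name in relevance if relevance[name] > threshold),
        key=lambda name: relevance[name],
        reverse=True,
    )

    # Stage 2: redundancy. Walking in descending relevance, drop a feature if an
    # already kept (at least as relevant) feature forms an AMB for it.
    selected = []
    for name in ranked:
        redundant = any(
            symmetric_uncertainty(features[kept], features[name]) >= relevance[name]
            for kept in selected
        )
        if not redundant:
            selected.append(name)
    return selected
```

As a usage illustration, calling `su_amb_filter` on a toy table with a feature `ttl` that determines the class, a feature `ttl_copy` that carries the same information, and a feature `noise` that is independent of the class keeps only `ttl`. The second feature is removed as redundant and the third as irrelevant.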
Keywords: feature selection; network traffic; symmetric uncertainty; approximate Markov blanket; network device identification; machine learning
Funding: Liaoning Revitalization Talents Program (兴辽英才计划, XLYC2019019)
Citation:
PANG Yu-Lin, LI Xi-Wang. Feature Selection Algorithm of Network Traffic Based on SU and AMB. COMPUTER SYSTEMS APPLICATIONS (计算机系统应用), 2022, 31(4): 281-287