Computer Systems Applications, 2019, 28(2): 240-245
WebShell Detection Method Based on Random Forest
(1. Wuhan Research Institute of Posts and Telecommunications, Wuhan 430074, China; 2. FiberHome Communications Science & Technology Development Co. Ltd., Nanjing 210019, China)
Received: June 29, 2018    Revised: July 20, 2018
Abstract: WebShells can be divided into several types according to function and size, and each type has its own distinctive features in addition to the basic features they share. Most existing WebShell detection methods, however, extract features from a single level and therefore cannot comprehensively cover the features of all WebShell types. As a result, they are biased toward particular types, perform poorly when applied uniformly across types, and generalize weakly. To address these problems, a WebShell detection method based on random forest is proposed. In the data preprocessing stage, the method extracts statistical features from the text layer, together with sequence features from both the text-layer source code and the compiled bytecode (opcode), to form a comprehensive set of combined features. Fisher feature selection is then applied to keep an appropriate proportion of the most important features, which reduces the feature dimension and yields the feature set for each sample. Finally, a random forest classifier is trained on the samples to obtain the detection model. Experiments show that the method detects WebShells effectively and outperforms single-level WebShell detection models in accuracy, recall, and false alarm rate.
Keywords: WebShell; random forest; combined features; feature selection
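The feature-selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact Fisher criterion used, so the code below assumes the common Fisher score (between-class scatter divided by within-class scatter, computed per feature). The function names `fisher_scores` and `select_top_features`, the `fraction` parameter, and the toy data are hypothetical and chosen only for illustration. The reduced feature matrix would then be passed to a random forest classifier, which is not shown here.

```python
from statistics import fmean, pvariance


def fisher_scores(X, y):
    """Return one Fisher score per feature column (assumed definition).

    score_j = sum_c n_c * (mean_cj - mean_j)^2 / sum_c n_c * var_cj
    where c ranges over class labels and n_c is the class size.
    """
    n_features = len(X[0])
    classes = sorted(set(y))
    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        overall_mean = fmean(column)
        between = 0.0  # between-class scatter for feature j
        within = 0.0   # within-class scatter for feature j
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            between += len(values) * (fmean(values) - overall_mean) ** 2
            within += len(values) * pvariance(values)
        if within > 0:
            scores.append(between / within)
        else:
            # No within-class spread: perfectly separating if the class
            # means differ, otherwise the feature is constant and useless.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def select_top_features(X, y, fraction):
    """Keep the given fraction of features with the highest Fisher score.

    Returns the kept column indices (ascending) and the reduced matrix.
    """
    scores = fisher_scores(X, y)
    k = max(1, round(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    reduced = [[row[j] for j in keep] for row in X]
    return keep, reduced


# Toy data (hypothetical): 4 samples, 3 features; label 1 = WebShell, 0 = benign.
X = [
    [0.9, 5.0, 1.0],
    [1.1, 3.0, 1.0],
    [0.1, 4.0, 1.0],
    [-0.1, 4.0, 1.0],
]
y = [1, 1, 0, 0]

keep, X_reduced = select_top_features(X, y, fraction=0.34)
print(keep, X_reduced)
```

In this toy example only the first column separates the two classes, so it is the one retained. In practice the rows of `X` would be the combined statistical and sequence features of each sample, and the proportion kept would be tuned experimentally.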
Citation:
QIN Ying. WebShell Detection Method Based on Random Forest. Computer Systems Applications, 2019, 28(2): 240-245