Fast Detection of Malicious Webpages Based on Hierarchical Document Representation

doi:10.15888/j.cnki.csa.007198


Fund project: National Natural Science Foundation of China (91430214)

Hierarchical Representation Approach to Fast Detection of Malicious Webpages


Abstract:

In recent years, malicious webpage detection has relied mainly on semantic analysis or emulated code execution to extract features. Such methods are complex to implement, incur high computational overhead, and enlarge the attack surface of the detector itself. To address this, a deep-learning-based method for detecting malicious webpages is proposed. First, simple regular expressions extract semantics-agnostic tokens directly from the static HTML document. A neural network model then captures locality representations of the document at multiple hierarchical spatial scales, which lets the model quickly find tiny fragments of malicious code in webpages of arbitrary length. In comparative experiments against several baseline and simplified models, the method achieves a detection rate of 96.4% at a false positive rate of 0.1%, a better classification accuracy than the compared models. The speed and accuracy of the method make it suitable for deployment on endpoints, firewalls, and web proxies.
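The two steps named in the abstract, regex token extraction and representation at several hierarchical scales, can be illustrated with a minimal Python sketch. The abstract does not give the token pattern, the number of scales, the hashing scheme, or the network architecture, so everything concrete here is an assumption made for illustration: the pattern in `TOKEN_RE`, the CRC32 hashing into `n_bins` buckets, the default scales `(1, 2, 4)`, and the helper names `tokenize`, `hashed_bag`, `split_chunks`, and `hierarchical_features`. The sketch only builds the input features; the neural network that would consume them is omitted.

```python
import re
import zlib

# Assumed token pattern (not from the paper): runs of non-ASCII characters
# or runs of ASCII word characters. No HTML or JavaScript parsing is done.
TOKEN_RE = re.compile(r"[^\x00-\x7F]+|\w+", re.ASCII)


def tokenize(html):
    """Extract semantics-agnostic tokens from a raw HTML string."""
    return TOKEN_RE.findall(html)


def hashed_bag(tokens, n_bins):
    """Count tokens into a fixed-length vector using the hashing trick."""
    vec = [0] * n_bins
    for tok in tokens:
        vec[zlib.crc32(tok.encode("utf-8")) % n_bins] += 1
    return vec


def split_chunks(tokens, n_chunks):
    """Split tokens into n_chunks contiguous, nearly equal-sized chunks."""
    size, rem = divmod(len(tokens), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < rem else 0)
        chunks.append(tokens[start:end])
        start = end
    return chunks


def hierarchical_features(html, n_bins=64, scales=(1, 2, 4)):
    """Map each scale s to s hashed bag-of-token vectors.

    Scale 1 summarizes the whole document; larger scales summarize
    progressively smaller contiguous regions of it.
    """
    tokens = tokenize(html)
    return {
        s: [hashed_bag(chunk, n_bins) for chunk in split_chunks(tokens, s)]
        for s in scales
    }


if __name__ == "__main__":
    page = '<script>eval(unescape("%61"))</script>'
    feats = hierarchical_features(page)
    for scale, vectors in feats.items():
        print(scale, [sum(v) for v in vectors])
```

Because every vector has the same length regardless of document size, a model fed with these features can handle pages of any length, and the finer scales keep a short suspicious fragment from being diluted by the rest of the page. This is one plausible reading of "multiple hierarchical spatial scales", not a reproduction of the authors' model.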


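Cite this article

Yuan Liang, Lin Jinfang. Hierarchical Representation Approach to Fast Detection of Malicious Webpages. Computer Systems & Applications (计算机系统应用), 2019, 28(12): 226-231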


History

Received: 2019-04-25
Last revised: 2019-05-21
Published online: 2019-12-13
Publication date: 2019-12-15
