Computer Systems Applications, 2018, 27(6): 27-33
Malicious URL Detection Based on Deep Learning
(Guangdong Research Institute, China Telecom Co. Ltd., Guangzhou 510630, China)
Received: September 28, 2017    Revised: October 10, 2017
Abstract: Cyber-attacks are becoming an increasingly serious problem. Malicious URLs often play an important role in these attacks and are widely used to mount various kinds of attack, including phishing, spamming, and malware. Detecting malicious URLs is therefore critical to thwarting these attacks. Numerous techniques have been developed for this purpose, and machine learning approaches have received increasing attention in recent years. However, traditional machine learning methods require extensive feature preprocessing, which is time-consuming and labor-intensive. In this study, we propose a detection method based solely on the lexical features of URLs. First, we train a 2-layer neural network (NN) to obtain a distributed representation of the characters in URLs. We then train a convolutional neural network (CNN) to classify feature images generated by mapping each URL to its distributed representation. In our experiments on a real-world data set, the method achieved an accuracy of 97.3% and an F1 score of 91.8%.
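The mapping from a URL to a feature image can be illustrated with a minimal sketch. It is not the authors' implementation: the alphabet, the embedding size `EMBED_DIM`, the fixed length `MAX_LEN`, and the function names are all hypothetical, and the embedding table is filled with random values as a stand-in for the vectors that the 2-layer NN would learn. Each character becomes one row of the image; short URLs are zero-padded and long ones truncated.

```python
import random

# Hypothetical settings: the abstract does not state the alphabet,
# the embedding size, or the URL length used by the authors.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=%"
EMBED_DIM = 8
MAX_LEN = 32
PAD = [0.0] * EMBED_DIM  # row used for padding and unknown characters


def make_embedding_table(seed=0):
    """Stand-in for trained character embeddings: one random vector per character."""
    rng = random.Random(seed)
    return {ch: [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)] for ch in ALPHABET}


def url_to_feature_image(url, table):
    """Map a URL to a MAX_LEN x EMBED_DIM matrix, one row per character."""
    # Lowercase, truncate to MAX_LEN, and look up each character's vector.
    rows = [list(table.get(ch, PAD)) for ch in url.lower()[:MAX_LEN]]
    # Zero-pad short URLs so every image has the same shape.
    rows += [list(PAD) for _ in range(MAX_LEN - len(rows))]
    return rows


table = make_embedding_table()
image = url_to_feature_image("http://example.com/login?id=1", table)
print(len(image), len(image[0]))  # 32 8
```

A matrix of this fixed shape is the kind of input a CNN classifier could consume; in practice the random table would be replaced by the learned character vectors.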
Funding: Guangdong Province Major Special Project (2015B010109005)
Cite as:
CHEN Kang, FU Hua-Zheng, XIANG Yong. Malicious URL Detection Based on Deep Learning. Computer Systems Applications, 2018, 27(6): 27-33