Computer Systems Applications, 2018, 27(6): 27-33

Malicious URL Detection Based on Deep Learning

CHEN Kang, FU Hua-Zheng, XIANG Yong

(Guangdong Research Institute, China Telecom Co. Ltd., Guangzhou 510630, China)

Received: September 28, 2017; Revised: October 10, 2017

Abstract: Cyber-attacks are becoming an increasingly serious problem. Malicious URLs often play an important role in these attacks and are widely used to mount phishing, spam, and malware campaigns, so detecting them is critical to thwarting such attacks. Numerous techniques have been developed to detect malicious URLs, and machine learning methods have received increasing attention in recent years. However, traditional machine learning methods require extensive feature preprocessing, which is time-consuming and labor-intensive. In this study, we propose a detection method based solely on the lexical features of URLs. First, we train a 2-layer neural network (NN) to obtain a distributed representation of the characters in URLs. We then train a convolutional neural network (CNN) to classify the feature images generated by mapping each URL to its distributed representation. In our experiments on a real-world data set, the method achieved an accuracy of 97.3% and an F1 score of 91.8%.

Keywords: malicious URLs; machine learning; lexical features; distributed representation of characters; convolutional neural network (CNN)
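
As a rough illustration of the mapping step described in the abstract, the following Python sketch turns a URL into a fixed-size character-embedding matrix (a "feature image"). The alphabet, `MAX_LEN`, `EMBED_DIM`, and the function names are hypothetical, and the random embedding table is only a placeholder for the representation that the 2-layer network would learn; the abstract does not specify any of these details.

```python
import random

# Hypothetical settings: the abstract does not state the alphabet,
# the URL length limit, or the embedding size.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=%"
MAX_LEN = 64      # assumed fixed URL length (rows of the image)
EMBED_DIM = 8     # assumed embedding size (columns of the image)
PAD, UNK = 0, 1   # reserved indices for padding and unknown characters

CHAR_TO_INDEX = {ch: i + 2 for i, ch in enumerate(ALPHABET)}


def make_embedding_table(seed=0):
    """Random stand-in for embeddings that a trained 2-layer network would supply."""
    rng = random.Random(seed)
    table = [[rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
             for _ in range(len(ALPHABET) + 2)]
    table[PAD] = [0.0] * EMBED_DIM  # padding rows are all zeros
    return table


def url_to_feature_image(url, table):
    """Map a URL to a MAX_LEN x EMBED_DIM matrix, one embedding row per character."""
    # Lowercase, truncate to MAX_LEN, and look up each character's index.
    indices = [CHAR_TO_INDEX.get(ch, UNK) for ch in url.lower()[:MAX_LEN]]
    # Pad short URLs so every image has the same number of rows.
    indices += [PAD] * (MAX_LEN - len(indices))
    return [list(table[i]) for i in indices]


table = make_embedding_table()
image = url_to_feature_image("http://example.com/login?id=1", table)
print(len(image), len(image[0]))  # prints: 64 8
```

A CNN classifier would then take such matrices as input. Zero padding keeps every image the same size regardless of URL length.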

Funding: Major Special Project of Guangdong Province (2015B010109005)

Author Name	Affiliation	E-mail
CHEN Kang	Guangdong Research Institute, China Telecom Co. Ltd., Guangzhou 510630, China
FU Hua-Zheng	Guangdong Research Institute, China Telecom Co. Ltd., Guangzhou 510630, China	18026262485@163.com
XIANG Yong	Guangdong Research Institute, China Telecom Co. Ltd., Guangzhou 510630, China

Cite this article:
CHEN Kang, FU Hua-Zheng, XIANG Yong. Malicious URL Detection Based on Deep Learning. Computer Systems Applications, 2018, 27(6): 27-33