Received: December 17, 2009; Revised: January 18, 2010
Abstract: Rule-based classification methods currently perform well in e-mail filtering. However, the training set of an e-mail filter usually contains some messages whose category is ambiguous (hazy), and extracting these boundary-blurred messages from the training set can noticeably improve classification. This paper proposes a clustering-based filtering method that clusters the training set according to the features shared by hazy-category e-mails. Experiments show that the method outperforms a purely rule-based classification algorithm on each of the evaluation metrics.
Keywords: clustering; text categorization; spam
Funding: Anhui Provincial Fund Project (090412044)
Author Name | Affiliation |
LANG Jia-Yun | School of Computer and Information, Hefei University of Technology, Hefei, Anhui 230009 |
HU Xue-Gang |
Citation:
郎加云,胡学钢.基于聚类的类别模糊邮件过滤方法.计算机系统应用,2010,19(9):147-150
LANG Jia-Yun, HU Xue-Gang. Clustering-Based Email Filtering Method with Hazy Category. Computer Systems Applications, 2010, 19(9): 147-150