Received: September 9, 2010    Revised: December 30, 2010
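Abstract (Chinese version, translated): To reduce the scale of Web log data and to discover more valuable access patterns from the preprocessed data, this paper introduces the information content of knowledge, defines a quantified value for the importance of a single property relative to a property set, and, borrowing the idea of the LRU page replacement algorithm from operating systems, proposes a WUM data preprocessing method based on property importance. Experiments show that the method can delete Web log records that have no mining value and were accessed only because of short-term user behavior, removing noise data and thereby effectively reducing the complexity of log mining.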
Keywords: access pattern; LRU page replacement algorithm; short-term user behavior; noise data
Abstract: To reduce the scale of Web log data and discover more valuable access patterns from the preprocessed data, this paper builds on the information content of knowledge to define a quantified importance value for each property relative to a property set. Drawing on the idea of the LRU page replacement algorithm used in operating systems, it then proposes a new data preprocessing method for WUM based on property importance. Experiments show that the method deletes Web log records that were caused by short-term user behavior and have no mining value, and filters out noise data. As a result, it effectively reduces the complexity of log mining.
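Funding: Youth Fund of Anhui Science and Technology University (ZIC2011117); Teaching Research Project of Anhui Science and Technology University (X201014)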
Citation:
WANG Ya-Jun, WANG Chuan-An. Data Preprocessing Method Based on Importance of Property for WUM. COMPUTER SYSTEMS APPLICATIONS, 2011, 20(5): 219-222, 247