High Utility Sequential Pattern Mining Algorithm Based on MapReduce

Authors: CHENG Si-Yuan, MA Chao, LI Cong-Cong
Affiliation (all authors): School of Computer Science, Fudan University, Shanghai 201203, China; Shanghai Key Laboratory of Data Science, Fudan University, Shanghai 201203, China

Abstract:

As data volumes grow rapidly, the efficiency of high utility sequential pattern mining algorithms degrades severely. To address this, we propose HusMaR, a high utility sequential pattern mining algorithm based on the MapReduce framework. HusMaR uses a utility matrix to generate candidates efficiently, a random mapping strategy to balance computing resources, and a field-based pruning strategy to prevent combinatorial explosion. Experimental results show that the algorithm achieves high parallel efficiency on large-scale datasets.

Key words: sequential pattern; MapReduce; pruning strategy; high utility sequential pattern mining; random strategy
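
The abstract only names the ingredients of HusMaR, so the following is a minimal single-process sketch of the general idea rather than the authors' algorithm. It assumes the usual formulation in which the utility of a pattern in one sequence is the maximum over all of its occurrences, and it imitates random mapping by assigning each sequence to a randomly chosen partition before summing partial results. The utility matrix and the pruning strategy are not reproduced. All names (`pattern_utility`, `random_map`, `total_utility`, `DB`, `PROFIT`) and the sample data are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical sample data: each sequence is a list of itemsets,
# and each itemset maps an item to its purchased quantity.
PROFIT = {"a": 2, "b": 5, "c": 1}  # assumed unit profit per item
DB = [
    [{"a": 1, "b": 2}, {"a": 3}, {"b": 1, "c": 4}],
    [{"b": 1}, {"a": 2, "c": 1}, {"b": 3}],
    [{"c": 2}, {"a": 1}],
]


def pattern_utility(pattern, sequence, profit):
    """Maximum utility of `pattern` (a list of single items, one per
    itemset, in order) over all of its occurrences in one sequence."""
    # best[j] = best utility found so far for the first j pattern items
    best = [0] + [None] * len(pattern)
    for itemset in sequence:
        # Go backwards so one itemset matches at most one pattern position.
        for j in range(len(pattern), 0, -1):
            item = pattern[j - 1]
            if item in itemset and best[j - 1] is not None:
                candidate = best[j - 1] + itemset[item] * profit[item]
                if best[j] is None or candidate > best[j]:
                    best[j] = candidate
    return best[-1] or 0  # 0 when the pattern does not occur


def random_map(database, n_partitions, seed=0):
    """Assign every sequence to a randomly chosen partition."""
    rng = random.Random(seed)
    partitions = defaultdict(list)
    for sequence in database:
        partitions[rng.randrange(n_partitions)].append(sequence)
    return partitions


def total_utility(pattern, database, profit, n_partitions=2, seed=0):
    """'Map': sum pattern utilities inside each partition.
    'Reduce': add the partial sums into the global utility."""
    partitions = random_map(database, n_partitions, seed)
    partial_sums = [
        sum(pattern_utility(pattern, seq, profit) for seq in part)
        for part in partitions.values()
    ]
    return sum(partial_sums)


if __name__ == "__main__":
    print(total_utility(["a", "b"], DB, PROFIT, n_partitions=2))
```

Because the global utility is a plain sum over sequences, the result does not depend on how the sequences are partitioned; the random assignment only affects how evenly the work is spread. A pattern would then be reported as high utility when this total reaches a user-chosen threshold.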

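Cite this article:

CHENG Si-Yuan, MA Chao, LI Cong-Cong. High Utility Sequential Pattern Mining Algorithm Based on MapReduce. 计算机系统应用 (Computer Systems & Applications), 2015, 24(12): 228-232.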

History

Received: 2015-04-07
Last revised: 2015-05-12
Published online: 2015-12-04
