Mining Frequent Closed Patterns for Very High Dimensional Data: A Review


Author: YANG Feng-Zhao

Affiliation: E-Business Department, Nanjing University of Finance & Economics, Nanjing 210003, China; Jiangsu Key Laboratory of E-Business, Nanjing 210003, China

Fund Project: National Natural Science Foundation of China (71072172); Merit-Based Funding Program for Science and Technology Activities of Returned Overseas Scholars (YFZ302002); Priority Academic Program Development of Jiangsu Higher Education Institutions


Abstract:

Mining frequent patterns is an important and fundamental problem in data mining. Mining frequent closed itemsets yields a complete and non-redundant set of frequent patterns. The rise of bioinformatics has produced a special class of datasets with a very large number of columns, and such high-dimensional datasets pose new challenges for earlier frequent closed pattern mining algorithms. This paper surveys algorithms for mining frequent closed patterns in very high dimensional data and classifies them by their characteristics. It compares two classes of row enumeration-based algorithms, discusses a hybrid algorithm that switches automatically between column (feature) enumeration and row enumeration during mining according to the characteristics of the data subset under consideration, and finally points out research directions in this field.

Key words: frequent closed pattern; high dimensional data; data mining; survey
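To make the idea of row enumeration concrete, the following is a minimal sketch and not any of the surveyed algorithms: it intersects every combination of at least `min_support` rows, and each non-empty intersection is a closed pattern. The function name, the toy data, and the brute-force search without pruning are assumptions made purely for illustration; `min_support` is assumed to be at least 1.

```python
from itertools import combinations


def closed_patterns_by_row_enumeration(rows, min_support):
    """Naive row enumeration (illustrative only, no pruning).

    rows: list of item collections, one per row (sample).
    min_support: minimum number of rows a pattern must occur in (>= 1).
    Returns a dict mapping each frequent closed pattern (frozenset)
    to its support, i.e. the number of rows that contain it.
    """
    closed = {}
    n = len(rows)
    for size in range(min_support, n + 1):
        for row_ids in combinations(range(n), size):
            # Items shared by all chosen rows; an intersection of rows
            # is always a closed itemset.
            pattern = frozenset(
                set.intersection(*(set(rows[i]) for i in row_ids))
            )
            if not pattern or pattern in closed:
                continue
            # True support: count every row that contains the pattern.
            closed[pattern] = sum(1 for r in rows if pattern <= set(r))
    return closed


# Toy dataset: 3 rows, 4 distinct items.
toy_rows = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "d"}]
result = closed_patterns_by_row_enumeration(toy_rows, min_support=2)
```

On this toy data, seven itemsets are frequent at a support of 2 (a, b, c, ab, ac, bc, abc), but only two of them are closed: {a, b, c} with support 2 and {a} with support 3. The search space here is the set of row combinations rather than item combinations, which is why row enumeration suits datasets with few rows and very many columns.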


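Cite this article:

YANG Feng-Zhao. Mining Frequent Closed Patterns for Very High Dimensional Data: A Review. Computer Systems & Applications, 2011, 20(11): 231-235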


History:

Received: 2011-03-10
Revised: 2011-04-19
