FSSD Algorithm Based on Ensemble Feature Selection

doi:10.15888/j.cnki.csa.008373

2022, Vol. 31, No. 3, pp. 275-281

Fund Project: Natural Science Foundation of Fujian Province (2018J01794)

Abstract:

Fast and Efficient Subgroup Set Discovery (FSSD) is a subgroup discovery algorithm that aims to return a diverse pattern set in a short time. To reduce running time, however, it selects a feature subset with a small number of domains, and when that subset is irrelevant or only weakly related to the target class, the quality of the pattern set drops. To address this problem, this paper proposes an FSSD algorithm based on ensemble feature selection. In the preprocessing stage, it applies ensemble feature selection based on ReliefF (Relief-F) and analysis of variance (ANOVA) to obtain a diverse feature subset that is strongly relevant to the target class, and then runs FSSD on that subset to return a high-quality pattern set. Experiments on UCI datasets and the National Health and Nutrition Examination Survey (NHANES) dataset show that the improved algorithm raises the quality of the pattern set and yields more interesting knowledge. On the NHANES dataset, the feature validity and positive predictive value of the pattern set are analyzed further.
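The preprocessing idea in the abstract, scoring features with both a Relief-style method and ANOVA and then combining the two rankings, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The abstract does not say how the two rankings are merged, so the rank-sum aggregation here is an assumption, and `relief_scores` is a simplified Relief-style weight (all other classes pooled as "misses") rather than full ReliefF. The function names, the parameters `k` and `n_keep`, and the toy data are all hypothetical.

```python
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F-statistic of one numeric feature against class labels."""
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def relief_scores(X, y, k=3):
    """Simplified Relief-style weights (assumption: not the full ReliefF).

    A feature gains weight when it differs on the k nearest samples of
    another class (misses) and loses weight when it differs on the k
    nearest samples of the same class (hits).
    """
    n, d = len(X), len(X[0])
    # Feature ranges, used to scale every difference into [0, 1].
    spans = [(max(r[j] for r in X) - min(r[j] for r in X)) or 1.0 for j in range(d)]

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        others = sorted((r for r in range(n) if r != i), key=lambda r: dist(X[i], X[r]))
        hits = [r for r in others if y[r] == y[i]][:k]
        misses = [r for r in others if y[r] != y[i]][:k]
        for j in range(d):
            if hits:
                weights[j] -= mean([diff(X[i], X[r], j) for r in hits]) / n
            if misses:
                weights[j] += mean([diff(X[i], X[r], j) for r in misses]) / n
    return weights


def ranks(scores):
    """Map scores to ranks, where rank 0 is the highest score."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    result = [0] * len(scores)
    for position, j in enumerate(order):
        result[j] = position
    return result


def ensemble_select(X, y, n_keep, k=3):
    """Return indices of the n_keep features with the best summed rank."""
    d = len(X[0])
    f_rank = ranks([anova_f([row[j] for row in X], y) for j in range(d)])
    r_rank = ranks(relief_scores(X, y, k))
    combined = [f_rank[j] + r_rank[j] for j in range(d)]
    # Ties are broken by feature index (an arbitrary choice).
    return sorted(range(d), key=lambda j: (combined[j], j))[:n_keep]


# Toy data: feature 0 separates the two classes cleanly, feature 1 is noise.
X = [
    [1.0, 5.0, 0.2], [1.2, 1.0, 0.1], [0.9, 3.0, 0.4], [1.1, 4.0, 0.3],
    [3.0, 2.0, 0.5], [3.2, 5.0, 0.7], [2.9, 1.0, 0.6], [3.1, 3.0, 0.9],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

selected = ensemble_select(X, y, n_keep=2)
print(selected)
```

In a pipeline of the kind the abstract describes, only the columns listed in `selected` would then be handed to the subgroup discovery step; FSSD itself is not reproduced here.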

Cite this article

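Zhang Yin (张崟), He Zhenfeng (何振峰). FSSD Algorithm Based on Ensemble Feature Selection. Computer Systems & Applications (计算机系统应用), 2022, 31(3): 275-281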

History

Received: 2021-05-19
Last revised: 2021-06-14
Published online: 2022-01-24
