Computer Systems & Applications, 2015, 24(11): 31-37
Features Extraction and Dimensions Reduction in Metagenomic Binning Problem
CHEN Bo1,2, XU Yun1,2
(1. Department of Computer Science and Technology, University of Science and Technology of China, Hefei 230027, China; 2. Anhui Province-MOST Co-Key Laboratory of High Performance Computing and its Application, University of Science and Technology of China, Hefei 230027, China)
Received: February 11, 2015; Revised: April 26, 2015
Abstract: Binning of metagenomic sequencing reads is a central problem in metagenomics. Binning performance depends mainly on feature extraction: the choice of feature vectors strongly affects both binning accuracy and running time. To address this, and taking the characteristics of metagenomic data into account, this paper proposes a feature extraction method based on the transition probability matrix of a third-order Markov model. A mutual-information-based feature selection algorithm then reduces the dimensionality of the extracted feature vectors. Finally, the resulting feature vectors are used with a support vector machine (SVM) classifier and compared against related binning algorithms. The results show that the proposed feature vectors discriminate well between different metagenomic species and are particularly suitable for binning large-scale metagenomic data.
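As an illustration of the kind of feature the abstract describes, the following is a minimal sketch of turning a read into a flattened third-order Markov transition probability matrix. It is not the authors' implementation: the function name `transition_features`, the `pseudocount` parameter, the handling of ambiguous bases, and the choice of 0.0 for unseen contexts are all assumptions made for this example. The mutual-information feature selection and the SVM step are not shown.

```python
from itertools import product

BASES = "ACGT"
# All 64 three-base contexts, in lexicographic order over "ACGT".
CONTEXTS = ["".join(p) for p in product(BASES, repeat=3)]


def transition_features(read, pseudocount=0.0):
    """Estimate P(next base | previous 3 bases) for one read.

    Returns a list of 64 * 4 = 256 values, ordered by context and then
    by next base. This layout is an assumption of the sketch.
    """
    counts = {ctx: dict.fromkeys(BASES, 0) for ctx in CONTEXTS}
    read = read.upper()
    for i in range(len(read) - 3):
        ctx, nxt = read[i:i + 3], read[i + 3]
        # Skip windows that contain ambiguous symbols such as N.
        if ctx in counts and nxt in BASES:
            counts[ctx][nxt] += 1

    features = []
    for ctx in CONTEXTS:
        total = sum(counts[ctx].values()) + 4 * pseudocount
        for base in BASES:
            if total > 0:
                features.append((counts[ctx][base] + pseudocount) / total)
            else:
                # Context never observed in this read (assumed convention).
                features.append(0.0)
    return features
```

For example, `transition_features("ACGTACGTACGA")` yields a 256-element vector in which the context `ACG` is followed by `T` in two of its three occurrences. With short reads many contexts are never observed, which is one reason a nonzero `pseudocount` may be worth considering.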
Funding: National Natural Science Foundation of China (61033009)
Cite as:
CHEN Bo, XU Yun. Features Extraction and Dimensions Reduction in Metagenomic Binning Problem. Computer Systems & Applications, 2015, 24(11): 31-37