Computer Systems Applications, 2018, 27(9): 163-169
Improved VSM Algorithm in Species Identification Based on 16S rRNA Gene Sequences
ZHU Bin1,2, QI He-Yuan3, MA Jun-Cai1,3
(1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China; 3. Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China)
Received: 2018-02-01    Revised: 2018-02-28
Abstract: In species identification, the authoritative method is sequence alignment based on BLAST, but it suffers from a heavy computational load, low efficiency, and high resource consumption. To address these problems, this study draws on the K-String composition vector method from the classic literature to improve the vector space model (VSM) and applies it to species identification based on 16S rRNA sequences. Within the framework of Banach spaces, the genetic distance formula of the improved VSM algorithm is replaced by equivalent forms, and the corresponding genetic distance formulas under different norms are given for the reference of other researchers. The improved algorithm is evaluated on two aspects: computational efficiency and species identification performance. The conclusion is that the inner-product norm in Euclidean space (the 2-norm) is significantly faster than the classic BLAST algorithm, while its classification results, in terms of detection rate, are consistent with the alignment results.
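The idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything below is an assumption made for illustration: the function names (`kstring_vector`, `cosine_distance`, `p_norm_distance`, `identify`), the use of raw normalized K-String frequencies, the choice k = 3, and the distance form (1 - cos) / 2 are all hypothetical and may differ from the authors' definitions.

```python
from collections import Counter
from itertools import product
import math

ALPHABET = "ACGT"  # assumption: only unambiguous nucleotides are counted


def kstring_vector(seq, k=3):
    """Map a sequence to its normalized K-String (k-mer) frequency vector."""
    seq = seq.upper().replace("U", "T")  # treat RNA and DNA spellings alike
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    keys = ["".join(p) for p in product(ALPHABET, repeat=k)]
    total = sum(counts[key] for key in keys)  # k-mers with N etc. are skipped
    if total == 0:
        raise ValueError("sequence has no valid K-Strings of length %d" % k)
    return [counts[key] / total for key in keys]


def cosine_distance(u, v):
    """Inner-product (2-norm) distance, assumed here to be (1 - cos) / 2."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return (1.0 - dot / (norm_u * norm_v)) / 2.0


def p_norm_distance(u, v, p=2):
    """Distance induced by the p-norm; p = math.inf gives the max norm."""
    diffs = [abs(a - b) for a, b in zip(u, v)]
    if p == math.inf:
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)


def identify(query, references, k=3, distance=cosine_distance):
    """Return the name of the reference sequence closest to the query."""
    qv = kstring_vector(query, k)
    return min(
        references,
        key=lambda name: distance(qv, kstring_vector(references[name], k)),
    )


if __name__ == "__main__":
    # Toy sequences, not real 16S rRNA data.
    refs = {"species_a": "ACGTACGTACGTACGT", "species_b": "GGGGCCCCGGGGCCCC"}
    print(identify("ACGTACGTACGA", refs))
```

Because each sequence is reduced once to a fixed-length vector, comparing a query against a reference costs a single pass over 4^k components rather than an alignment, which is the kind of saving the abstract attributes to the vector-based approach. Passing `distance=p_norm_distance` swaps in a different norm without changing the rest of the pipeline.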
Funding: National High Technology Research and Development Program of China (863 Program) (2014AA021501)
Cite as:
ZHU Bin, QI He-Yuan, MA Jun-Cai. Improved VSM Algorithm in Species Identification Based on 16S rRNA Gene Sequences. COMPUTER SYSTEMS APPLICATIONS, 2018, 27(9): 163-169
