Algorithms for Biological Sequence K-mer Frequency Counting Problem
    Abstract:

    K-mer counting in biological sequences is a fundamental problem in biological information processing. This paper focuses on counting the k-mers at each position of multiple sequences in aligned mode. We present a new backward-traversal k-mer counting algorithm called BTKC. BTKC takes full advantage of the statistics already gathered for (k+1)-mers to obtain the statistics for k-mers quickly, so there is no need to traverse the whole set of sequences when counting each individual k-mer. Both the time-complexity analysis and the experimental results show that BTKC is a clear improvement over the forward-traversal k-mer counting algorithm FTKC, and that its time complexity does not depend on the range of k-mer lengths.
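
    The abstract does not spell out the steps of BTKC, so the following Python sketch is only one plausible reading of the idea, not the authors' implementation. It assumes that "aligned mode" means all sequences have the same length and that k-mers are counted separately for each start position. The function names `count_direct`, `shrink`, and `count_backward` are hypothetical. The sketch scans the sequences once for the longest k, then derives each shorter k from the (k+1)-mer tables without rescanning the sequences.

```python
from collections import Counter

def count_direct(seqs, k):
    """Baseline: scan every sequence at every start position for one k."""
    length = len(seqs[0])  # assumption: all sequences are equally long
    return [Counter(s[p:p + k] for s in seqs) for p in range(length - k + 1)]

def shrink(counts_k1, k):
    """Derive per-position k-mer counts from per-position (k+1)-mer counts."""
    counts_k = []
    for table in counts_k1:
        merged = Counter()
        for kmer, n in table.items():
            # A (k+1)-mer starting at p begins with the k-mer starting at p.
            merged[kmer[:k]] += n
        counts_k.append(merged)
    # No (k+1)-mer starts at the last k-mer position, so use the suffixes
    # of the (k+1)-mers that start one position earlier.
    last = Counter()
    for kmer, n in counts_k1[-1].items():
        last[kmer[1:]] += n
    counts_k.append(last)
    return counts_k

def count_backward(seqs, k_min, k_max):
    """Count k-mers for every k in [k_min, k_max], going from long to short."""
    result = {k_max: count_direct(seqs, k_max)}  # the only pass over seqs
    for k in range(k_max - 1, k_min - 1, -1):
        result[k] = shrink(result[k + 1], k)
    return result

# Toy aligned input (made up for illustration).
seqs = ["ACGT", "ACGA", "TCGT"]
tables = count_backward(seqs, 1, 3)
for k in (1, 2, 3):
    assert tables[k] == count_direct(seqs, k)
```

    In this sketch the cost of each step from k+1 to k depends on the number of distinct (k+1)-mers per position rather than on the number of sequences, which illustrates why reusing (k+1)-mer statistics can be cheaper than rescanning for every k.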

Get Citation

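Zhang Xinxin, Chen Bo, He Jiling, Xu Yun. Algorithms for Biological Sequence K-mer Frequency Counting Problem. Computer Systems & Applications, 2014, 23(4): 121-124, 158.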

History
  • Received: August 29, 2013
  • Revised: September 26, 2013
  • Online: April 25, 2014