Received: January 28, 2015; Revised: March 18, 2015
Abstract: This paper discusses the importance of data deduplication. Building on existing deduplication techniques and algorithms, it improves and optimizes the MD5 fingerprint computation algorithm, and analyzes and restructures the pipelining of fingerprint computation. By replacing a single cache with a cache group, it proposes a two-level pipelined fingerprint computation method based on multiple CPUs. The method is analyzed, and its effectiveness is supported by experiments and experimental data.
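To make the idea of two-level fingerprinting concrete, the following is a minimal Python sketch and not the authors' implementation. The abstract does not say what the two levels are, so the sketch assumes a cheap first-level checksum (Adler-32) followed by MD5 as the second level. Fixed-size chunking, the names `split_chunks`, `weak_fingerprint`, `strong_fingerprint` and `deduplicate`, and the use of a thread pool to stand in for multiple CPUs are likewise assumptions for illustration. The cache group described in the abstract is not modeled.

```python
import hashlib
import zlib
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4096  # assumed fixed-size chunking; the paper may chunk differently


def split_chunks(data, size=CHUNK_SIZE):
    """Split a byte string into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def weak_fingerprint(chunk):
    """Level 1 (assumed): a cheap 32-bit checksum."""
    return zlib.adler32(chunk)


def strong_fingerprint(chunk):
    """Level 2: the MD5 digest of the chunk."""
    return hashlib.md5(chunk).hexdigest()


def deduplicate(data, workers=4):
    """Return the unique chunks of `data` in order of first appearance."""
    chunks = split_chunks(data)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Stage 1: weak fingerprints for every chunk, computed in parallel.
        weak = list(pool.map(weak_fingerprint, chunks))
        counts = Counter(weak)
        # A chunk whose weak fingerprint occurs only once cannot be a
        # duplicate, so only the remaining candidates need an MD5.
        candidates = [i for i, w in enumerate(weak) if counts[w] > 1]
        # Stage 2: MD5 for the candidate chunks only, computed in parallel.
        digests = pool.map(strong_fingerprint, (chunks[i] for i in candidates))
        strong = dict(zip(candidates, digests))

    seen = set()
    unique = []
    for i, chunk in enumerate(chunks):
        if i in strong:
            if strong[i] in seen:
                continue  # same MD5 as an earlier chunk: drop the duplicate
            seen.add(strong[i])
        unique.append(chunk)
    return unique
```

In this sketch, the first level filters out chunks that are certainly unique, so the more expensive MD5 is computed only where a duplicate is possible. Whether the paper divides the work in the same way cannot be determined from the abstract.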
Funding: National Archives Administration of China project (2014-X-65); Sichuan Provincial Department of Education general project (14ZB0313)
Cite this article:
HE Jian-Ying, YUAN Xiao-Yan, TANG Qing-Song. Duplicate Removal Method of Large Data under Two Level Fingerprints Flow Based on Multi CPU Calculation. COMPUTER SYSTEMS APPLICATIONS, 2015, 24(8): 206-211