Application of Duplication Removal Method of Rabin Fingerprint in Search Engine

Author: HE Jian-Ying
College of Computer, Sichuan University of Arts and Science, Dazhou 635000, China

Fund project: State Archives Administration of China project (2014-X-65)

Abstract:

When handling massive data, search engines are slow, consume large amounts of storage, and remove duplicate web pages poorly. To address these problems, this paper proposes a deduplication method based on the Rabin fingerprint algorithm. The method removes duplicate URLs and also removes identical or similar page content found at distinct URLs. Experiments show that it effectively improves search speed, saves storage space, and increases search precision.

Key words: Rabin fingerprinting method; search engine; duplicate removal; URL; massive data
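
The abstract does not give implementation details, so the following is only a minimal Python sketch of the general idea, not the author's method. It assumes a 64-bit Rabin fingerprint (the remainder of the input, read as a polynomial over GF(2), modulo the irreducible polynomial x^64 + x^4 + x^3 + x + 1), a set of fingerprints for URL deduplication, and a Jaccard comparison of word-shingle fingerprints for detecting similar page content. The names `rabin_fingerprint`, `UrlDeduplicator`, `shingle_fingerprints`, and `resemblance`, as well as the shingle size `k`, are hypothetical and chosen for illustration.

```python
# Illustrative sketch only; all names and parameters are assumptions,
# not taken from the paper.

POLY_DEGREE = 64
# x^64 + x^4 + x^3 + x + 1, an irreducible polynomial over GF(2).
POLY = (1 << POLY_DEGREE) | 0x1B


def rabin_fingerprint(data: bytes) -> int:
    """Return data(x) mod POLY, treating the bytes as a bit polynomial."""
    fp = 0
    for byte in data:
        for shift in range(7, -1, -1):
            # Append the next bit (most significant bit first).
            fp = (fp << 1) | ((byte >> shift) & 1)
            # Reduce as soon as the degree reaches POLY_DEGREE.
            if fp >> POLY_DEGREE:
                fp ^= POLY
    return fp


class UrlDeduplicator:
    """Remembers URL fingerprints and reports whether a URL is new."""

    def __init__(self):
        self._seen = set()

    def is_new(self, url: str) -> bool:
        fp = rabin_fingerprint(url.strip().encode("utf-8"))
        if fp in self._seen:
            return False
        self._seen.add(fp)
        return True


def shingle_fingerprints(text: str, k: int = 4) -> set:
    """Fingerprint every run of k consecutive words (a 'shingle')."""
    words = text.split()
    if not words:
        return set()
    # Texts shorter than k words yield a single shingle of all words.
    count = max(1, len(words) - k + 1)
    return {
        rabin_fingerprint(" ".join(words[i:i + k]).encode("utf-8"))
        for i in range(count)
    }


def resemblance(text_a: str, text_b: str, k: int = 4) -> float:
    """Jaccard similarity of the two texts' shingle fingerprint sets."""
    fa = shingle_fingerprints(text_a, k)
    fb = shingle_fingerprints(text_b, k)
    if not fa and not fb:
        return 1.0
    return len(fa & fb) / len(fa | fb)
```

In such a setup, a crawler would call `is_new` before fetching a URL, then compare the fetched page against stored pages with `resemblance` and discard it when the score exceeds a chosen threshold. The threshold and any URL normalization beyond stripping whitespace are left open here because the abstract does not specify them.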

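Cite this article

HE Jian-Ying. Application of Duplication Removal Method of Rabin Fingerprint in Search Engine. Computer Systems & Applications (计算机系统应用), 2015, 24(7): 128-131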

History

Received: 2014-11-04
Last revised: 2014-12-08
Published online: 2015-07-17
