Computer Systems Applications (计算机系统应用), 2012, 21(2): 210-213

Distributed Naive Bayes Text Classification Using Hadoop

WEI Jie (卫洁), SHI Hong-Bo (石洪波), JI Su-Qin (冀素琴)

(Faculty of Information Management, Shanxi University of Finance & Economics, Taiyuan 030006, China)

Received: May 27, 2011; Revised: July 9, 2011

Abstract: The emergence of cloud computing has effectively addressed the storage, analysis, and processing of massive data sets. On an open-source Hadoop distributed cluster, itself an implementation of cloud computing, the MapReduce parallel programming model is used to design and implement a distributed naive Bayes text classification algorithm with an improved TFIDF weighting. The experimental results show that the Hadoop-based distributed naive Bayes automatic text classifier not only tolerates node failure but is also efficient and easy to scale.

Keywords: Hadoop; naive Bayes; MapReduce; text classification
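
To make the idea in the abstract concrete, the following is a minimal single-process sketch of how naive Bayes training can be phrased as map and reduce steps with TF-IDF-weighted term counts. It is not the authors' implementation: the abstract does not describe their TFIDF improvement, so a common smoothed IDF form is assumed here, and the function names (`map_doc`, `reduce_counts`, `train`, `classify`) are hypothetical. On a real Hadoop cluster the map and reduce functions would run as distributed tasks rather than as plain Python calls.

```python
import math
from collections import defaultdict


def map_doc(label, text):
    """Map step: emit ((label, term), 1) for every token of one document."""
    for term in text.lower().split():
        yield (label, term), 1


def reduce_counts(pairs):
    """Reduce step: sum the values that share the same key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def train(docs):
    """Train on a list of (label, text) pairs and return a model dict."""
    # Per-class term counts, computed as one map/reduce pass.
    term_counts = reduce_counts(
        kv for label, text in docs for kv in map_doc(label, text)
    )
    # Document frequency of each term, and number of documents per class.
    doc_freq = reduce_counts(
        (term, 1) for _, text in docs for term in set(text.lower().split())
    )
    class_docs = reduce_counts((label, 1) for label, _ in docs)
    n_docs = len(docs)

    # Assumed smoothed IDF; the paper's improved TFIDF is not specified.
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in doc_freq.items()}

    # TF-IDF weight of each term within each class.
    weights = defaultdict(dict)
    for (label, term), count in term_counts.items():
        weights[label][term] = count * idf[term]

    return {
        "weights": dict(weights),
        "class_docs": class_docs,
        "n_docs": n_docs,
        "vocab": set(doc_freq),
    }


def classify(model, text):
    """Return the class with the highest log posterior for the given text."""
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_class in model["class_docs"].items():
        class_weights = model["weights"].get(label, {})
        total = sum(class_weights.values())
        score = math.log(n_class / model["n_docs"])  # log prior
        for term in text.lower().split():
            if term in model["vocab"]:  # ignore terms never seen in training
                # Laplace-smoothed likelihood over TF-IDF weights.
                score += math.log(
                    (class_weights.get(term, 0.0) + 1.0) / (total + vocab_size)
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Because every reduce step is a per-key sum, the counting work can be split across nodes and a failed task can simply be rerun, which is the property the abstract attributes to the Hadoop-based classifier.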

Funding: National Natural Science Foundation of China (60873100); Shanxi University of Finance & Economics Research Funding Project

Author Name	Affiliation
WEI Jie	Faculty of Information Management, Shanxi University of Finance & Economics, Taiyuan 030006, China
SHI Hong-Bo	Faculty of Information Management, Shanxi University of Finance & Economics, Taiyuan 030006, China
JI Su-Qin	Faculty of Information Management, Shanxi University of Finance & Economics, Taiyuan 030006, China

Cite this article:
WEI Jie, SHI Hong-Bo, JI Su-Qin. Distributed Naive Bayes Text Classification Using Hadoop. Computer Systems Applications (计算机系统应用), 2012, 21(2): 210-213