Reservoir Data Mining and Analysis Based on Spark

doi:10.15888/j.cnki.csa.005985

Volume 26, Issue 8, 2017, pp. 9-15

Author:
WU Zhi-Jun
Computer and Communication Engineering, China University of Petroleum, Qingdao 266580, China
XIA Sheng-Yu
Computer and Communication Engineering, China University of Petroleum, Qingdao 266580, China
WANG Peng
Computer and Communication Engineering, China University of Petroleum, Qingdao 266580, China

                    
Abstract:

To improve the analysis of reservoir properties and support oil exploration and development, this paper uses the Spark parallel computing framework and data mining algorithms to analyze reservoir data, find relationships between reservoir properties, and classify and predict different reservoir segments. The main work includes: building a Spark distributed cluster and a data processing and analysis platform (Spark is a popular big data parallel computing framework that can complete data mining tasks faster and more accurately than some traditional analysis methods and tools); establishing a multidimensional outlier detection function based on the characteristics of reservoir data and adding a new discriminant attribute, Pr; and proposing a cross-recall training model and an optimized cost function for logistic regression classification on imbalanced data. In addition, KR-SMOTE is used to oversample the data for decision tree classification. Both approaches improve classification precision.

Key words: Spark; data mining; outlier detection; imbalanced data; classification
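
The abstract mentions an optimized cost function for logistic regression on imbalanced data but does not give its form. As a rough illustration of the general idea only, the sketch below shows a generic class-weighted cross-entropy in plain Python, where errors on the minority (positive) class are scaled up. The function names and the `pos_weight` parameter are assumptions made for this example; they are not taken from the paper and do not reproduce its cross-recall training model.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def weighted_log_loss(weights, bias, X, y, pos_weight=1.0):
    """Mean cross-entropy where positive (minority) samples are scaled by pos_weight.

    pos_weight is a hypothetical knob for this sketch; pos_weight=1.0 gives
    the ordinary, unweighted logistic regression cost.
    """
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for features, label in zip(X, y):
        z = bias + sum(w * x for w, x in zip(weights, features))
        p = min(max(sigmoid(z), eps), 1.0 - eps)
        if label == 1:
            total += -pos_weight * math.log(p)
        else:
            total += -math.log(1.0 - p)
    return total / len(X)


# Toy data: three majority samples (label 0) and one minority sample (label 1).
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 0, 1]

plain = weighted_log_loss([0.0], 0.0, X, y)
weighted = weighted_log_loss([0.0], 0.0, X, y, pos_weight=3.0)
```

With `pos_weight` set near the ratio of majority to minority samples, a misclassified minority sample contributes about as much to the cost as the whole majority class does, which is one common way to keep a classifier from ignoring the rare class.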

Citation:

WU Zhi-Jun, XIA Sheng-Yu, WANG Peng. Reservoir Data Mining and Analysis Based on Spark. Computer Systems & Applications, 2017, 26(8): 9-15

History

Received: December 09, 2016
Online: October 31, 2017
