Received: June 9, 2023; Revised: July 12, 2023
Abstract: Prediction based on large volumes of historical data has become indispensable in fields such as environmental management and urban transportation, and its accuracy strongly affects practical production and scheduling. Under natural or human influences, however, some data exhibit high volatility and uncertainty, which prevents prediction models from reaching their full potential. Taking sediment concentration prediction during the non-ice period as a case study, this work explores optimization methods for predicting high-volatility data. The results show that feature selection based on Shapley additive explanations (SHAP), data stationarization, and preliminary clustering effectively reduce the prediction error on high-volatility data: the mean absolute error (MAE) falls from 1.502 for the initial model to 0.194. Data stationarization contributes the largest improvement, reducing MAE by 76.51%. As the stationarization order increases, however, the predictions get worse, because the subsequent exponentiation step must be of correspondingly higher order, which amplifies the error exponentially. In addition, feeding the clustering results in as an input feature effectively "guides" the parameter learning of the multilayer perceptron.
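The abstract's explanation of why a higher stationarization (smoothing) order hurts accuracy can be illustrated with a small sketch. The abstract does not give the exact transform, so the code below assumes a log transform followed by repeated first differencing, inverted by cumulative sums and exponentiation. The function names, the synthetic series, and the constant error `eps` are all hypothetical and are not taken from the paper.

```python
import numpy as np

def log_difference(series, order):
    """Log-transform a positive series, then difference it `order` times.

    Returns the differenced series and the leading value removed at each
    differencing round (needed to invert the transform).
    Assumption: this is only one plausible form of stationarization.
    """
    z = np.log(np.asarray(series, dtype=float))
    heads = []
    for _ in range(order):
        heads.append(z[0])
        z = np.diff(z)
    return z, heads

def invert_log_difference(z, heads):
    """Undo log_difference: cumulative sums per order, then exponentiate."""
    z = np.asarray(z, dtype=float)
    for head in reversed(heads):
        z = np.concatenate(([head], head + np.cumsum(z)))
    return np.exp(z)

def mae_after_inversion(series, order, eps):
    """MAE in the original scale when every value predicted in the
    differenced domain is off by a constant `eps` (a hypothetical error)."""
    z, heads = log_difference(series, order)
    restored = invert_log_difference(z + eps, heads)
    return float(np.mean(np.abs(restored - np.asarray(series, dtype=float))))

if __name__ == "__main__":
    # Synthetic positive series, used only for illustration.
    series = 2.0 + np.sin(np.arange(20) / 3.0)
    for order in (1, 2, 3):
        print(order, mae_after_inversion(series, order, eps=0.01))
```

Under these assumptions, the same small error in the differenced domain is accumulated once per differencing order before exponentiation, so the error in the original scale grows quickly as the order increases.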
Keywords: high-volatility data; prediction optimization; artificial neural network; model interpretability; feature selection
Citation:
BAI Lu, LU Si-Qi, XIN Kun-Lun, REN Peng, ZHU He, MU Xu-Dong. Application of Neural Networks and Interpretation Models in Sediment Concentration Prediction During Non-ice Period. Computer Systems Applications (计算机系统应用), 2023, 32(12): 276-283.