Computer Systems & Applications, 2020, Vol. 29, Issue 8: 24-30

ATM Cash Forecasting Method Based on Dynamic Weighted Combination Model
DU Shan, CAI Wei-Bin
Software Development Center, ICBC, Zhuhai 519000, China
Abstract: This study proposes an intelligent cash forecasting method based on a dynamic weighted combination model to accurately predict the daily cash consumption of ATM equipment and thereby support better decisions in daily cash transfer management. Unlike the single-algorithm prediction used in the past, the method analyzes the characteristics of banking business, transaction flow, and equipment, and then combines four single machine learning models through dynamic weighting. The algorithm offers a more intelligent, more precise, and more efficient way to forecast bank cash consumption; it effectively reduces the total cash inventory and the cash return rate and improves cash utilization. The method has been applied in Guangdong, Chongqing, Jiangxi, Shanxi, Beijing, and other regions with sound results.
Key words: cash forecasting; machine learning; dynamic weighted method

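1 Introduction

2 Constructing the Dynamic Weighted Combination Model

2.1 Building the Uniform Weight Vector Set

The overall flow of the ATM cash forecasting system is shown in Fig. 1. The core module is the model ensemble module, which ensures that the system's predictions are stable and accurate.

Fig. 1 Overall flowchart of the cash forecasting system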

$\begin{split} A = &\left\{ \left\{ a_{11}, a_{12}, a_{13}, \cdots, a_{1n} \right\}, \left\{ a_{21}, a_{22}, a_{23}, \cdots, a_{2n} \right\}, \cdots, \right. \\ & \left. \left\{ a_{A_{H+1}^{n-1}1}, a_{A_{H+1}^{n-1}2}, a_{A_{H+1}^{n-1}3}, \cdots, a_{A_{H+1}^{n-1}n} \right\} \right\}, \end{split}$

$\begin{split} I = &\left\{ \left\{ \dfrac{a_{11}}{H}, \dfrac{a_{12}}{H}, \dfrac{a_{13}}{H}, \cdots, \dfrac{a_{1n}}{H} \right\}, \left\{ \dfrac{a_{21}}{H}, \dfrac{a_{22}}{H}, \dfrac{a_{23}}{H}, \cdots, \dfrac{a_{2n}}{H} \right\}, \cdots, \right. \\ & \left. \left\{ \dfrac{a_{A_{H+1}^{n-1}1}}{H}, \dfrac{a_{A_{H+1}^{n-1}2}}{H}, \dfrac{a_{A_{H+1}^{n-1}3}}{H}, \cdots, \dfrac{a_{A_{H+1}^{n-1}n}}{H} \right\} \right\} \end{split}$
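
To make the construction of $I$ concrete, the following is a minimal Python sketch. It assumes that each integer vector in $A$ has non-negative entries summing to $H$, so that every vector in $I$ sums to 1. This section does not state that constraint explicitly, so the enumeration below and the number of vectors it produces are assumptions. The function name `uniform_weight_vectors` is hypothetical.

```python
from itertools import combinations


def uniform_weight_vectors(n, H):
    """Enumerate weight vectors (a_1/H, ..., a_n/H).

    Assumption (not stated in the text): the a_i are non-negative
    integers with a_1 + ... + a_n = H, so each weight vector sums to 1.
    """
    vectors = []
    # Stars and bars: place n-1 bars among H + n - 1 slots; the gaps
    # between consecutive bars are the integer parts a_1..a_n.
    slots = H + n - 1
    for bars in combinations(range(slots), n - 1):
        parts = []
        prev = -1
        for b in bars:
            parts.append(b - prev - 1)
            prev = b
        parts.append(slots - prev - 1)
        vectors.append(tuple(p / H for p in parts))
    return vectors


# Example: n = 4 base models, step size 1/H with H = 10.
weights = uniform_weight_vectors(n=4, H=10)
print(len(weights))   # 286 candidate weight vectors under this assumption
print(weights[0])
```

A larger $H$ gives a finer grid of candidate weights at the cost of more vectors to evaluate.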

2.2 Building the Ensemble Model

(1) Base model selection

(2) Adaptive model ensemble

$L\left( w_k \right) = \dfrac{1}{T} \sum_{j=1}^{T} \sqrt{ \left( R_j - \sum_{i=1}^{n} w_{ki} Y_{ij} \right)^2 }$

Fig. 2 Ensemble model algorithm
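
The loss $L(w_k)$ can be illustrated with a short Python sketch. Because the square root of a square is an absolute value, $L(w_k)$ is the mean absolute error of the weighted combination. The sketch assumes that $R_j$ is the actual cash demand on day $j$, $Y_{ij}$ is the prediction of base model $i$ for day $j$, $T$ is the length of the evaluation window, and the weight vector with the smallest loss is selected from the candidate set. These readings and the function names are assumptions made for illustration, not the authors' implementation.

```python
def ensemble_loss(w, R, Y):
    """L(w) = (1/T) * sum_j sqrt((R_j - sum_i w_i * Y_ij)^2).

    w: weights, one per base model (length n)
    R: actual values over the window (length T)
    Y: Y[i][j] is the prediction of base model i for day j
    """
    T = len(R)
    total = 0.0
    for j in range(T):
        combined = sum(w[i] * Y[i][j] for i in range(len(w)))
        total += ((R[j] - combined) ** 2) ** 0.5  # equals |R_j - combined|
    return total / T


def select_weights(candidates, R, Y):
    """Return the candidate weight vector with the smallest loss."""
    return min(candidates, key=lambda w: ensemble_loss(w, R, Y))


# Toy example with two hypothetical base models over three days.
R = [10.0, 20.0, 30.0]
Y = [
    [10.0, 20.0, 30.0],  # model 1 predicts perfectly
    [0.0, 0.0, 0.0],     # model 2 always predicts zero
]
candidates = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
best = select_weights(candidates, R, Y)
print(best, ensemble_loss(best, R, Y))
```

Re-running the selection on a sliding window of recent days is one way the weights could adapt over time.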

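2.3 Model Implementation

The system architecture for implementing the ATM cash forecasting model is shown in Fig. 3. Once the pilot switch of an ATM device is turned on, its data passes in turn through the data processing layer, the single-model training layer, the combined-model prediction layer, and the personalized tuning layer, which yields the forecast for that device.

Fig. 3 Architecture of the ATM cash forecasting system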

3 Validation of the Dynamic Weighted Combination Model

3.1 Data Source

3.2 Data Cleaning and Feature Processing

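(1) Remove duplicate samples. The duplicated() method is used to flag whether each sample is a duplicate, giving Fig. 4(a), which shows that the device's raw data contains 754 samples. The drop_duplicates() method is then used to remove the duplicates, giving Fig. 4(b), with 748 samples remaining.

Fig. 4 Removing duplicate samples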
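
The de-duplication in step (1) can be sketched with pandas as follows. The toy data and the column names `date` and `net_cash` are hypothetical. The real dataset and its schema are not shown in this section.

```python
import pandas as pd

# Hypothetical toy data; the third row repeats the second.
df = pd.DataFrame({
    "date": ["2019-01-01", "2019-01-02", "2019-01-02", "2019-01-03"],
    "net_cash": [1200, 800, 800, 950],
})

# duplicated() marks every row that repeats an earlier row.
dup_mask = df.duplicated()
print(dup_mask.sum(), "duplicate rows out of", len(df))

# drop_duplicates() keeps the first occurrence of each row.
deduped = df.drop_duplicates().reset_index(drop=True)
print(len(deduped), "rows remain")
```

By default both methods compare all columns. Passing `subset=[...]` restricts the comparison to selected columns, which may be preferable if only the date identifies a sample.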

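(2) Repair missing and abnormal samples. The describe() method is used to produce a descriptive summary of the de-duplicated data, reporting each attribute's count (number of non-NA values), mean, std (standard deviation), min, 25% (25th percentile), 50% (50th percentile, i.e., the median), 75% (75th percentile), and max. These values give a first impression of the data and help identify missing and abnormal samples, as shown in Fig. 5.

Fig. 5 Descriptive analysis of the dataset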
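
As an illustration of step (2), the sketch below runs describe() on a hypothetical column and derives the number of missing values from the count statistic. The data and the column name `net_cash` are assumptions.

```python
import pandas as pd

# Hypothetical toy column with one missing value and one suspiciously
# large value.
df = pd.DataFrame({"net_cash": [1200.0, 800.0, None, 950.0, 1000000.0]})

# describe() reports count, mean, std, min, 25%, 50%, 75%, and max.
summary = df["net_cash"].describe()
print(summary)

# count excludes NA values, so the gap to the row count is the number
# of missing samples.
missing = len(df) - int(summary["count"])
print("missing values:", missing)
```

A max that is far larger than the 75th percentile, as in this toy column, is the kind of signal that would prompt a closer look at a sample.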

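(3) Repair outlier samples. Candidate outliers are identified in two ways: ① samples whose daily net amount is 0; ② the box-plot method, in which samples whose deposit-withdrawal net amount is above the upper whisker or below the lower whisker of the box plot are likely outliers. Exploratory analysis is then applied to these samples to confirm whether they are outliers. The box-plot method is used to pick out candidate outliers from the raw data. Box plots for the device are drawn with "year-month" as the smallest unit, as shown in Fig. 6.

Fig. 6 Box plot of ATM cash demand by "year-month"

Fig. 7 Cash demand curve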
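
The two screening rules in step (3) can be sketched in plain Python as follows. The text does not define the whisker positions, so the sketch assumes the common Tukey convention (Q1 - 1.5 × IQR and Q3 + 1.5 × IQR), computed separately for each "year-month" group. The function name and the record layout are hypothetical.

```python
from collections import defaultdict
from statistics import quantiles


def candidate_outliers(records, k=1.5):
    """Flag candidate outliers among (date, net_amount) records.

    records: list of ('YYYY-MM-DD', net_amount) tuples.
    Rule 1: the daily net amount is 0.
    Rule 2: the value lies outside the box-plot whiskers of its
            year-month group (assumed: Q1 - k*IQR and Q3 + k*IQR).
    Each month needs at least two samples for the quartiles.
    """
    by_month = defaultdict(list)
    for date, value in records:
        by_month[date[:7]].append(value)

    fences = {}
    for month, values in by_month.items():
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        iqr = q3 - q1
        fences[month] = (q1 - k * iqr, q3 + k * iqr)

    flagged = []
    for date, value in records:
        low, high = fences[date[:7]]
        if value == 0 or value < low or value > high:
            flagged.append((date, value))
    return flagged


# Hypothetical sample for one month.
records = [
    ("2019-01-01", 100), ("2019-01-02", 110), ("2019-01-03", 105),
    ("2019-01-04", 95), ("2019-01-05", 102), ("2019-01-06", 900),
    ("2019-01-07", 0),
]
print(candidate_outliers(records))
```

The flagged samples are only candidates. As the text notes, each one still needs exploratory analysis before it is treated as a true outlier.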

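3.3 Performance of the Dynamic Weighted Combination Model

The prediction results of the four base models and the combined model on the validation set are shown in Fig. 8.

Fig. 8 Comparison of prediction results of the four base models and the combined model

The RMSE values of the four base models and the combined model on the validation set are listed in Table 1.

Fig. 9 Comparison of prediction results of the genetic-algorithm-improved SVM model and the combined model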
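
For reference, the RMSE used to compare the models on the validation set is the square root of the mean squared difference between actual and predicted values. A minimal sketch, with a hypothetical function name and toy inputs rather than the paper's data:

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy check: errors of 3 and 4 give sqrt((9 + 16) / 2).
print(rmse([0.0, 0.0], [3.0, 4.0]))
```

A lower RMSE means the predictions track actual demand more closely. Because errors are squared, large misses on individual days weigh more heavily than in a mean absolute error.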

4 Conclusion and Outlook
