PM2.5 Forecasting Based on a Support Vector Machine Optimized by an Improved Firefly Algorithm
FAN Wen-Ting, WANG Xiao
School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China
Foundation item: Doctorate Fund of Taiyuan University of Science and Technology (20152044)
Abstract: To address the large deviations of existing PM2.5 concentration predictions, a novel model based on a Support Vector Machine optimized by an Improved Firefly Algorithm (IFA-SVM) is proposed. In this model, two neighborhood search strategies and a variable step size mechanism are employed to improve the FA. The IFA is applied to optimize the SVM parameters (C, , and ), and the resulting model is used to forecast PM2.5 concentrations in Taiyuan. The neighborhood search strategies provide better candidate solutions; the search step size is tuned dynamically by the variable step size strategy to accelerate convergence and to balance exploration and exploitation. The performance of the proposed IFA-SVM model is compared with that of FA-SVM, Genetic Algorithm (GA)-SVM, and Particle Swarm Optimization (PSO)-SVM. Experimental results show that the proposed IFA-SVM model forecasts PM2.5 more accurately than the other methods, both 1 day ahead and 3 days ahead.
Key words: Firefly Algorithm (FA)     Support Vector Machine (SVM)     neighborhood search strategies     variable step size     parameter optimization     PM2.5 forecasting

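1 Principle of PM2.5 Forecasting

PM2.5 forecasting builds a mathematical model from factors such as meteorological conditions and pollution sources, together with historical data, to predict future PM2.5 values. It can be expressed as the following nonlinear relationship: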

 $y = f\left( {{x_1}, {x_2}, \cdots, {x_n}} \right)$ (1)

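2 Theoretical Basis of the Algorithms

2.1 Basic Principle of SVM

The basic idea of SVM is to handle the nonlinear PM2.5 samples by introducing the Radial Basis Function (RBF) kernel, which maps the samples into a high-dimensional space; a hyperplane is then sought in that space that maximizes the margin between the two classes of samples [21]. This amounts to solving the following constrained optimization problem: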

 $\mathop {\min }\limits_{w, b, \xi } \frac{1}{2}{w^{\rm T}}w + C\sum\limits_{i = 1}^l {{\xi _i}}$ (2)

 $K({x_i}, {x_j}) = \exp ( - \gamma {\left\| {{x_i} - {x_j}} \right\|^2})$ (3)
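As an illustration of Eq. (3), the RBF kernel can be evaluated directly from two sample vectors. The following is a minimal sketch in plain Python; the function names `rbf_kernel` and `kernel_matrix` are hypothetical and are not taken from the original work.

```python
import math


def rbf_kernel(x_i, x_j, gamma):
    """RBF kernel of Eq. (3): exp(-gamma * ||x_i - x_j||^2)."""
    if len(x_i) != len(x_j):
        raise ValueError("vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * squared_distance)


def kernel_matrix(samples, gamma):
    """Gram matrix with entries K[i][j] = K(x_i, x_j) for a list of samples."""
    return [[rbf_kernel(a, b, gamma) for b in samples] for a in samples]
```

A larger $\gamma$ makes the kernel value decay faster with distance, which is why $\gamma$ is one of the parameters worth tuning.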

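2.2 Basic Principle of the Standard FA

FA is inspired by the biological behaviour of fireflies in nature [22]. Its idea for optimizing the SVM parameters is as follows: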

 $I = {I_0} \times {e^{ - \gamma {r_{ij}}}}$ (4)
 $\beta = {\beta _0} \times {e^{ - \gamma r_{ij}^2}}$ (5)

 ${x_i} = {x_i} + \beta \times ({x_j} - {x_i}) + \alpha \times \left({{rand}} - \frac{1}{2}\right)$ (6)

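The main steps of FA are:

(1) Compute the brightness of each firefly from the objective function;

(2) Move each dimmer firefly toward the brighter ones according to Eq. (6);

(3) Sort the fireflies by brightness in descending order and find the brightest one;

(4) Iterate until the maximum number of iterations is reached.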
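The four steps above can be sketched as a short routine. This is only a minimal illustration under stated assumptions: brightness is maximized, the squared Euclidean distance is used in Eq. (5), positions are clipped to the search bounds, and the function name and all default values (`n_fireflies`, `beta0`, `gamma`, `alpha`, `seed`) are hypothetical rather than the settings of the original experiments.

```python
import math
import random


def firefly_algorithm(brightness, dim, bounds, n_fireflies=15, max_iter=50,
                      beta0=1.0, gamma=1.0, alpha=0.2, seed=0):
    """Standard FA sketch that maximizes `brightness`; all defaults are assumptions."""
    rng = random.Random(seed)
    low, high = bounds

    def ranked(swarm, light):
        # Step (3): sort fireflies from brightest to dimmest.
        order = sorted(range(len(swarm)), key=lambda k: light[k], reverse=True)
        return [swarm[k] for k in order], [light[k] for k in order]

    swarm = [[rng.uniform(low, high) for _ in range(dim)]
             for _ in range(n_fireflies)]
    light = [brightness(x) for x in swarm]                  # step (1)
    swarm, light = ranked(swarm, light)

    for _ in range(max_iter):                               # step (4)
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] > light[i]:                     # step (2): i is dimmer
                    r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)    # Eq. (5)
                    moved = [a + beta * (b - a) + alpha * (rng.random() - 0.5)
                             for a, b in zip(swarm[i], swarm[j])]  # Eq. (6)
                    # Clipping to the bounds is an added assumption.
                    swarm[i] = [min(high, max(low, v)) for v in moved]
                    light[i] = brightness(swarm[i])
        swarm, light = ranked(swarm, light)
    return swarm[0], light[0]
```

Because the brightest firefly never moves in this scheme, the best brightness found cannot decrease from one iteration to the next. For a minimization problem, the objective can simply be negated before it is passed in as `brightness`.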

2.3 Improved FA Based on Neighborhood Search Strategies

 ${x_i}^1 = {\lambda _1} \times {x_i} + {\lambda _2} \times pbest + {\lambda _3} \times ({x_{i1}} - {x_{i2}})$ (7)

 ${x_i}^2 = {\lambda _4} \times {x_i} + {\lambda _5} \times gbest + {\lambda _6} \times ({x_{i3}} - {x_{i4}})$ (8)
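To make Eqs. (7) and (8) concrete, the sketch below builds the two candidate positions for one firefly. Several details are assumptions made only for illustration: the weights $\lambda$ are drawn uniformly from [0, 1] and normalized to sum to 1, $x_{i1}$ and $x_{i2}$ are drawn from a ring neighbourhood of radius `k` around firefly $i$, and $x_{i3}$ and $x_{i4}$ are drawn from the whole swarm. All function names are hypothetical.

```python
import random


def _weights(rng):
    """Three random weights normalized to sum to 1 (an assumption)."""
    raw = [rng.random() for _ in range(3)]
    total = sum(raw) or 1.0
    return [w / total for w in raw]


def _combine(x, best, xa, xb, lam):
    """lam[0]*x + lam[1]*best + lam[2]*(xa - xb), the form shared by Eqs. (7)-(8)."""
    return [lam[0] * xi + lam[1] * bi + lam[2] * (ai - ci)
            for xi, bi, ai, ci in zip(x, best, xa, xb)]


def neighborhood_candidates(swarm, i, pbest, gbest, k=2, rng=None):
    """Return the local candidate (Eq. 7) and global candidate (Eq. 8) for firefly i."""
    rng = rng or random.Random()
    n = len(swarm)
    if n < 3:
        raise ValueError("at least three fireflies are required")
    # Local search: two distinct fireflies from the ring i-k .. i+k, excluding i.
    ring = sorted({(i + d) % n for d in range(-k, k + 1)} - {i})
    i1, i2 = rng.sample(ring, 2)
    x_local = _combine(swarm[i], pbest, swarm[i1], swarm[i2], _weights(rng))
    # Global search: two distinct fireflies from the whole swarm, excluding i.
    others = [j for j in range(n) if j != i]
    i3, i4 = rng.sample(others, 2)
    x_global = _combine(swarm[i], gbest, swarm[i3], swarm[i4], _weights(rng))
    return x_local, x_global
```

A typical use would evaluate the objective at $x_i$, $x_i^1$ and $x_i^2$ and keep the best of the three; that greedy selection is likewise an assumption here.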

2.4 Improved FA Based on a Variable Step Size

 $\alpha = 0.4/(1 + \exp (0.015 \times (t - maxG)/3))$ (9)
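The schedule of Eq. (9) is easy to tabulate. The sketch below assumes that $t$ is the current iteration and $maxG$ the maximum number of iterations; the function name `step_size` is hypothetical.

```python
import math


def step_size(t, max_g):
    """Step size alpha of Eq. (9) at iteration t, with max_g the iteration limit."""
    return 0.4 / (1.0 + math.exp(0.015 * (t - max_g) / 3.0))


# Example: step sizes at a few iterations for an assumed limit of 100.
schedule = [step_size(t, 100) for t in range(0, 101, 25)]
```

With an assumed limit of 100 iterations, $\alpha$ falls smoothly from about 0.249 at $t = 0$ to 0.2 at $t = maxG$, so early iterations take slightly larger random steps than late ones.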

3 Firefly-Optimized Support Vector Machine (FA-SVM) PM2.5 Forecasting Model

3.1 Principle of IFA-SVM PM2.5 Forecasting

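The IFA-SVM PM2.5 forecasting procedure is as follows:

(1) Collect the experimental PM2.5 concentration data for Taiyuan, split them into a training set and a test set, and normalize them;

(2) Iteratively optimize the IFA-SVM parameters:

1) Initialize the basic parameters of the algorithm and distribute the fireflies randomly;

2) Compute the objective function value of each firefly, taking the PM2.5 prediction performance of the SVM on the training set as the objective function value;

3) Rank the fireflies by brightness according to their objective function values, find the current best objective function value and the corresponding firefly, and update the fireflies according to Eq. (6);

4) If the objective function value at iteration t equals that at iteration t–1, apply the two neighborhood search strategies of Section 2.3;

5) If the maximum number of iterations is reached or the stopping condition is met, go to step 6); otherwise return to step 2) and continue iterating;

6) Output the maximum objective function value and the corresponding firefly, which gives the optimal parameters.

(3) Use the optimal parameters to predict the PM2.5 values of the test set, de-normalize the predictions to obtain the actual PM2.5 forecasts, and output the results.

 Fig. 1 IFA-SVM PM2.5 forecasting flow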
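The control flow of steps 1) to 6) can be sketched independently of the SVM itself. In the sketch below every callable is a hypothetical placeholder: `fitness` stands for the SVM prediction performance on the training set, `fa_update` for the move of Eq. (6), and `neighborhood_search` for the two strategies of Section 2.3. The tolerance `tol` used to detect an unchanged best value is an added assumption (0.0 reproduces exact equality).

```python
def ifa_search(fitness, swarm, fa_update, neighborhood_search, max_iter, tol=0.0):
    """Skeleton of steps 1)-6); the swarm passed in is assumed to be initialized."""
    best_x, best_f = None, float("-inf")
    previous_best = None
    for t in range(max_iter):                                   # step 5)
        values = [fitness(x) for x in swarm]                    # step 2)
        top = max(range(len(swarm)), key=lambda k: values[k])   # step 3): brightest
        if values[top] > best_f:
            best_x, best_f = list(swarm[top]), values[top]
        # Step 4): has the best value stayed the same since iteration t-1?
        stalled = (previous_best is not None
                   and abs(values[top] - previous_best) <= tol)
        swarm = fa_update(swarm, values, t)                     # step 3): Eq. (6)
        if stalled:
            swarm = neighborhood_search(swarm, t)               # step 4)
        previous_best = values[top]
    return best_x, best_f                                       # step 6)
```

Passing the update rules in as functions keeps the skeleton short and makes it straightforward to compare the plain FA (a `neighborhood_search` that returns the swarm unchanged) with the improved variant.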

3.2 Data Collection and Preprocessing

 $x_i' = \left( {{x_i} - {x_{\min }}} \right)/\left( {{x_{\max }} - {x_{\min }}} \right)$ (10)
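Eq. (10) and its inverse, which is needed to map predictions back to concentrations, can be sketched as follows. The function names are hypothetical and the sample values are purely illustrative, not taken from the Taiyuan data. Taking $x_{\min}$ and $x_{\max}$ from the training set only is an assumption.

```python
def fit_min_max(values):
    """Return (x_min, x_max) of the data used to define the scaling."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined for constant data")
    return lo, hi


def normalize(values, lo, hi):
    """Eq. (10): map each value into [0, 1] using x_min = lo and x_max = hi."""
    return [(v - lo) / (hi - lo) for v in values]


def denormalize(scaled, lo, hi):
    """Inverse of Eq. (10): recover values on the original scale."""
    return [s * (hi - lo) + lo for s in scaled]


# Illustrative values only.
train = [35.0, 80.0, 125.0]
lo, hi = fit_min_max(train)
scaled = normalize(train, lo, hi)
restored = denormalize(scaled, lo, hi)
```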

3.3 Evaluation Criteria

 $MAE = \frac{1}{N}\sum\limits_{i = 1}^N {\left| {{O_i} - {P_i}} \right|}$ (11)
 $RMSE = \sqrt {\frac{1}{N}\sum\limits_{i = 1}^N {{\left( {{O_i} - {P_i}} \right)}^2} }$ (12)
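Both criteria are straightforward to compute. The sketch below reads $O_i$ as the observed value and $P_i$ as the predicted value of sample $i$, which is an assumed interpretation of the symbols; the function names are hypothetical.

```python
import math


def mae(observed, predicted):
    """Mean absolute error, Eq. (11)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error, Eq. (12): each error is squared before averaging."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_squared = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return math.sqrt(mean_squared)
```

Since RMSE squares each error before averaging, it penalizes occasional large misses more heavily than MAE does, so the two are usually reported together.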

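3.4 Parameter Optimization

 Fig. 2 Convergence curves of the algorithms on the Ackley function

 Fig. 3 Convergence curves of the algorithms on the Sphere function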

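3.5 Experimental Results and Analysis

3.5.1 IFA-SVM Experimental Results and Analysis

(1) Forecast the PM2.5 concentration one day ahead;

(2) Forecast the PM2.5 concentration on the third day ahead.

 Fig. 4 Comparison of IFA-SVM PM2.5 forecasting results

3.5.2 Comparison and Analysis of Experimental Results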

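4 Conclusions and Outlook

(1) The IFA-SVM model forecasts PM2.5 effectively both one day and three days ahead; because prediction errors accumulate over the horizon, the one-day forecast is more accurate.

(2) FA can escape local optima and is computationally simple, and the FA-SVM model forecasts more accurately than the GA-SVM and PSO-SVM methods.

(3) Improving FA with the neighborhood search and variable step size strategies accelerates convergence and balances local and global search performance, so that the forecasts of the IFA-SVM model follow the actual PM2.5 trend more closely. This offers a new approach to haze forecasting.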

