Fuzzy Support Vector Machine Algorithm Based on Inequality Hyper-Plane Distance
LI Cun-He, JIANG Yu, LI Shuai
Computer Science and Technology, China University of Petroleum, Qingdao 266580, China
Foundation item: Natural Science Foundation of Shandong Province (ZR2014FQ018)
Abstract: In the age of big data and artificial intelligence, the Support Vector Machine (SVM) has been applied successfully in many fields and has become one of the common methods for solving classification problems. Real-world data, however, are usually imbalanced, which significantly degrades classification performance. This study improves the standard Fuzzy Support Vector Machine (FSVM) by using an inequality hyper-plane distance. The algorithm introduces a parameter λ to control the distance between the hyper-plane and each class, and constructs the fuzzy membership function from the mutual distances between samples and class centers, which mitigates the loss of classification precision caused by imbalanced sample distributions and noisy data. Experiments verify the effectiveness of the proposed algorithm, and the results show that it performs better on imbalanced data.
Key words: Support Vector Machine (SVM); imbalanced data; inequality hyper-plane distance; membership function

1 Related Work

2 Fuzzy Support Vector Machine Improved by Inequality Hyper-Plane Distance (IFD-FSVM)

 $\min\;\;\;\frac{1}{2}||\omega |{|^2} + C\sum\limits_{i = 1}^l {{u_i}} {\varepsilon _i}$ (1)

 $\left\{ {\begin{array}{*{20}{l}} {{y_i}({\omega ^{\rm{T}}}{x_i} + b) \ge 1 - {\varepsilon _i}}\\ {{\varepsilon _i} \ge 0,\;i = 1, \cdots ,l} \end{array}} \right.$ (2)

 $f(x) = sign\left\{ {\displaystyle\sum\limits_{i = 1}^m {\alpha _i^*{y_i}K\left( {x,{x_i}} \right)} + {b^*}} \right\}$ (3)
 $0 \le {\alpha _i} \le {u_i}C\;\;\;i = 1,\cdots,l$ (4)

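Here $K\left( {x,{x_i}} \right)$ is the kernel function. Common choices include the linear, polynomial, and Gaussian kernels; which kernel to use when solving the problem depends on the properties of the dataset.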
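The three kernels named above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names and the hyper-parameters `degree`, `coef0`, and `gamma` are assumptions introduced here, using their usual textbook forms.

```python
import math

def linear_kernel(x, z):
    """K(x, z) = <x, z>."""
    return sum(a * b for a, b in zip(x, z))

def polynomial_kernel(x, z, degree=2, coef0=1.0):
    """K(x, z) = (<x, z> + coef0) ** degree; degree and coef0 are assumed defaults."""
    return (linear_kernel(x, z) + coef0) ** degree

def gaussian_kernel(x, z, gamma=0.5):
    """K(x, z) = exp(-gamma * ||x - z||^2); gamma is an assumed default."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

x, z = [1.0, 2.0], [3.0, 0.0]
print(linear_kernel(x, z))      # 3.0
print(polynomial_kernel(x, z))  # 16.0
print(gaussian_kernel(x, x))    # 1.0
```

The Gaussian kernel always gives $K(x,x)=1$, which simplifies the kernel-space distances used later for the membership function.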

 $\min\;\;\;\frac{1}{{1 + \lambda }}||\omega |{|^2} + C\sum\limits_{i = 1}^l {{u_i}} {\varepsilon _i}$ (5)

 $\left\{ \begin{array}{l} {y_i}[({\omega ^{\rm T}}{x_i}) + b] - \lambda + {\varepsilon _i} \ge 0\;\;\;{y_i} = 1 \\ {y_i}[({\omega ^{\rm T}}{x_i}) + b] - 1 + {\varepsilon _i} \ge 0\;\;\;{y_i} = - 1 \\ {\varepsilon _i} \ge 0\;\;\;i = 1,\cdots,l \\ \end{array} \right.$ (6)
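To make the role of λ in Eqs. (5) and (6) concrete, the sketch below evaluates the primal objective for a fixed linear model. It assumes the reading implied by the Lagrangian (7) and the dual (9): the functional margin target is λ for positive samples and 1 for negative samples, and each slack takes the smallest feasible value. The function name `ifd_fsvm_primal` and the toy data are hypothetical.

```python
def ifd_fsvm_primal(w, b, X, y, u, C, lam):
    """Objective of Eq. (5) with the smallest slacks allowed by Eq. (6)."""
    sq_norm = sum(wk * wk for wk in w)
    penalty = 0.0
    for xi, yi, ui in zip(X, y, u):
        score = yi * (sum(wk * xk for wk, xk in zip(w, xi)) + b)
        # Assumed margin target: lam for the positive class, 1 for the negative class.
        target = lam if yi == 1 else 1.0
        slack = max(0.0, target - score)
        penalty += ui * slack  # slack weighted by the fuzzy membership u_i
    return sq_norm / (1.0 + lam) + C * penalty

X = [[2.0, 0.0], [0.5, 0.0], [-2.0, 0.0]]
y = [1, 1, -1]
u = [1.0, 0.5, 1.0]
print(ifd_fsvm_primal([1.0, 0.0], 0.0, X, y, u, C=10.0, lam=1.0))  # 3.0
```

With `lam=1.0` the expression reduces to Eqs. (1) and (2). Lowering `lam` to 0.5 removes the slack of the second sample, because the positive class then only has to reach a functional margin of 0.5.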

 $\begin{split} L(\omega ,b,\varepsilon ,a,\beta ) = & \frac{1}{{1 + \lambda }}||\omega |{|^2} + C \displaystyle \sum\limits_{i = 1}^l {{u_i}} {\varepsilon _i} \\ &- \left\{ { \displaystyle \sum\limits_{{y_i} = 1} {{a_i}} [{y_i}({\omega ^{\rm T}}{x_i} + b) - \lambda + {\varepsilon _i}] } \right.\\ &+\left. {\displaystyle \sum\limits_{{y_i} = - 1} {{a_i}} [{y_i}({\omega ^{\rm T}}{x_i} + b) - 1 + {\varepsilon _i}]} \right\} - \displaystyle \sum\limits_{i = 1}^l {{\beta _i}} {\varepsilon _i} \\ \end{split}$ (7)

 $\left\{ \begin{array}{l} \dfrac{{\partial L(\omega ,b,\varepsilon ,a,\beta )}}{{\partial \omega }} = \dfrac{2}{{1 + \lambda }}\omega - \displaystyle\sum\limits_{i = 1}^l {{a_i}} {x_i}{y_i} = 0\\ \dfrac{{\partial L(\omega ,b,\varepsilon ,a,\beta )}}{{\partial b}} = - \displaystyle\sum\limits_{i = 1}^l {{a_i}} {y_i} = 0\\ \dfrac{{\partial L(\omega ,b,\varepsilon ,a,\beta )}}{{\partial {\varepsilon _i}}} = {u_i}C - {a_i} - {\beta _i} = 0 \end{array} \right.$ (8)

 $\max\;\;\;\sum\limits_{{y_i} = 1} {\lambda {a_i}} + \sum\limits_{{y_i} = - 1} {{a_i}} - \frac{{1 + \lambda }}{4}\sum\limits_{i = 1}^l {\sum\limits_{j = 1}^l {{y_i}} } {y_j}{a_i}{a_j}({x_i},{x_j})$ (9)

From $\displaystyle \sum\limits_{i = 1}^l {{y_i}} {a_i} = 0$ we obtain $\displaystyle \sum\limits_{{y_i} = 1} {{a_i}} = \sum\limits_{{y_i} = - 1} {{a_i}}$ , so Eq. (9) simplifies to:

 $\begin{split} & \displaystyle\sum\limits_{{y_i} = 1} {\lambda {a_i}} + \displaystyle \sum\limits_{{y_i} = - 1} {{a_i}} - \frac{{1 + \lambda }}{4}\sum\limits_{i = 1}^l {\sum\limits_{j = 1}^l {{y_i}} } {y_j}{a_i}{a_j}({x_i}*{x_j}) = \\ & \frac{{1 + \lambda }}{2}\left\{ \displaystyle \sum\limits_{i = 1}^l {{a_i}} - \frac{1}{2}\sum\limits_{i = 1}^l {\sum\limits_{j = 1}^l {{y_i}} } {y_j}{a_i}{a_j}{\rm{(}}{x_i}*{x_j}{\rm{)}}\right\} \\ \end{split}$ (10)
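The simplification from Eq. (9) to Eq. (10) relies only on the constraint $\sum_i y_i a_i = 0$. The check below evaluates both forms numerically with a plain inner product. It is a sanity check under assumed toy values, not part of the original derivation, and the helper names are hypothetical.

```python
import math

def dot(x, z):
    return sum(p * q for p, q in zip(x, z))

def quad_term(a, y, X):
    """Double sum of y_i * y_j * a_i * a_j * <x_i, x_j>."""
    n = len(a)
    return sum(y[i] * y[j] * a[i] * a[j] * dot(X[i], X[j])
               for i in range(n) for j in range(n))

def dual_eq9(a, y, X, lam):
    """Objective as written in Eq. (9)."""
    linear = sum(lam * ai if yi == 1 else ai for ai, yi in zip(a, y))
    return linear - (1.0 + lam) / 4.0 * quad_term(a, y, X)

def dual_eq10(a, y, X, lam):
    """Simplified objective on the right-hand side of Eq. (10)."""
    return (1.0 + lam) / 2.0 * (sum(a) - 0.5 * quad_term(a, y, X))

X = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
y = [1, 1, -1]
a = [0.2, 0.3, 0.5]  # toy multipliers chosen so that sum(y_i * a_i) = 0
print(math.isclose(dual_eq9(a, y, X, 0.5), dual_eq10(a, y, X, 0.5)))  # True
```

If the multipliers violate the equality constraint, the two expressions no longer agree, so Eq. (10) holds only on the feasible set of the dual.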

 $f(x) = sign\left\{ \frac{{1 + \lambda }}{2}[\sum\limits_{{x_i} \in SV} {{a_i}} {y_i}K({x_i},x) + {b^{\rm{*}}}]\right\}$ (11)
 $0 \le {a_i} \le {u_i}C\;\;\;i = 1,\cdots,l$ (12)
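Eqs. (11) and (12) can be illustrated with a small sketch. The multipliers, the bias, and the support vectors below are made-up values rather than the output of a solver; the function names are hypothetical, and mapping a score of exactly zero to the positive class is an assumption.

```python
def linear_kernel(x, z):
    return sum(p * q for p, q in zip(x, z))

def decide(x, support_vectors, a, y, b_star, lam, kernel):
    """Decision function of Eq. (11); a zero score is mapped to +1 by assumption."""
    total = sum(ai * yi * kernel(xi, x)
                for xi, ai, yi in zip(support_vectors, a, y))
    value = (1.0 + lam) / 2.0 * (total + b_star)
    return 1 if value >= 0 else -1

def satisfies_box(a, u, C):
    """Box constraint of Eq. (12): 0 <= a_i <= u_i * C for every sample."""
    return all(0.0 <= ai <= ui * C for ai, ui in zip(a, u))

sv = [[1.0, 0.0], [-1.0, 0.0]]
a, y = [0.5, 0.5], [1, -1]
print(decide([2.0, 0.0], sv, a, y, 0.0, 0.5, linear_kernel))   # 1
print(decide([-3.0, 1.0], sv, a, y, 0.0, 0.5, linear_kernel))  # -1
print(satisfies_box(a, [1.0, 0.2], C=1.0))                     # False
```

The last call shows how a low membership $u_i$ tightens the upper bound on $a_i$, which limits the influence a suspected noise point can have on the hyper-plane.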

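The value of $\lambda$ determines the distance between the hyper-plane and each class: if 0 < $\lambda$ < 1, the hyper-plane lies closer to the positive class; if $\lambda$ > 1, the hyper-plane lies closer to the negative class; if $\lambda$ = 1, the algorithm is equivalent to the standard fuzzy support vector machine.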
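The effect of λ can be checked with a short calculation. Assuming the margin boundaries are $\omega^{\rm T}x + b = \lambda$ on the positive side and $\omega^{\rm T}x + b = -1$ on the negative side, their geometric distances to the hyper-plane are $\lambda/||\omega||$ and $1/||\omega||$. The helper name `margin_distances` is hypothetical.

```python
import math

def margin_distances(w, lam):
    """Return (distance to the positive margin, distance to the negative margin)."""
    norm = math.sqrt(sum(wk * wk for wk in w))
    return lam / norm, 1.0 / norm

w = [3.0, 4.0]  # ||w|| = 5
print(margin_distances(w, 0.5))  # (0.1, 0.2)  hyper-plane closer to the positive class
print(margin_distances(w, 2.0))  # (0.4, 0.2)  hyper-plane closer to the negative class
print(margin_distances(w, 1.0))  # (0.2, 0.2)  equidistant, as in the standard FSVM
```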

3 Determining the Membership Function

 Fig. 1 Spatial distribution of the samples

 $m = \frac{1}{n}\sum\limits_{i = 1}^n {\varphi ({x_i})}$ (13)
 $\begin{array}{*{20}{l}} {d = |{m_ + } - {m_ - }| = {{\left( {\left| {\displaystyle\sum\nolimits_{{x_ + } \in {X_ + }} {\varphi ({x_ + })} /{n_ + } - \displaystyle\sum\nolimits_{{x_ - } \in {X_ - }} {\varphi ({x_ - })} /{n_ - }} \right|_2^2} \right)}^{\dfrac{1}{2}}} }\\ ={{{\left( {{\displaystyle\sum _{\begin{array}{*{20}{c}} {{x_{ + i}} \in {X_ + }}\\ {{x_{ + j}} \in {X_ + }} \end{array}}}K({x_{ + i}},{x_{ + j}})/n_ + ^2 + {\displaystyle\sum _{\begin{array}{*{20}{c}} {{x_{ - p}} \in {X_ - }}\\ {{x_{ - q}} \in {X_ - }} \end{array}}}K({x_{ - p}},{x_{ - q}})/n_ - ^2 - 2{\displaystyle\sum _{\begin{array}{*{20}{c}} {{x_{ + m}} \in {X_ + }}\\ {{x_{ - n}} \in {X_ - }} \end{array}}}K({x_{ + m}},{x_{ - n}})/{n_ + }{n_ - }} \right)}^{\dfrac{1}{2}}}} \end{array}$ (14)
 $\begin{array}{l} d_{ip}^ + = |\varphi ({x_ + }) - {m_ + }| = {\left( {\left| {\varphi ({x_ + }) - \displaystyle \sum\limits_{{x_{_ + }} \in {X_ + }} {\varphi ({x_ + })} /{n_ + }} \right|_2^2} \right)^{\dfrac{1}{2}}} {\rm{ = }} \left( {K({x_ + },{x_ + }) + \displaystyle \sum\limits_{\begin{array}{*{20}{c}} {{x_{ + p}} \in {X_ + }}\\ {{x_{ + q}} \in {X_ + }} \end{array}} {K({x_{ + p}},{x_{ + q}})} /n_ + ^2 - 2 \displaystyle \sum\limits_{{x_{ + m}} \in {X_ + }} {K({x_ + },{x_{ + m}})} /{n_ + }} \right)^{\dfrac{1}{2}} \end{array}$ (15)
 $\begin{array}{l} d_{ip}^ - = |\varphi ({x_ + }) - {m_ - }| = {\left( {\left| {\varphi ({x_ + }) - \displaystyle \sum\limits_{{x_ - } \in {X_ - }} {\varphi ({x_ - })} /{n_ - }} \right|_2^2} \right)^{\dfrac{1}{2}}} = \left( {K({x_ + },{x_ + }) + \displaystyle \sum\limits_{\begin{array}{*{20}{c}} {{x_{ - p}} \in {X_ - }}\\ {{x_{ - q}} \in {X_ - }} \end{array}} {K({x_{ - p}},{x_{ - q}})} /n_ - ^2 - 2 \displaystyle \sum\limits_{{x_{ - m}} \in {X_ - }} {K({x_ + },{x_{ - m}})} /{n_ - }} \right)^{\dfrac{1}{2}} \end{array}$ (16)
 $\begin{array}{l} d_{in}^ - = |\varphi ({x_ - }) - {m_ - }| = {\left( {\left| {\varphi ({x_ - }) - \displaystyle \sum\limits_{{x_{_ - }} \in {X_ - }} {\varphi ({x_ - })} /{n_ - }} \right|_2^2} \right)^{\dfrac{1}{2}}} {\rm{ = }} \left( {K({x_ - },{x_ - }) + \displaystyle \sum\limits_{\begin{array}{*{20}{c}} {{x_{ - p}} \in {X_ - }}\\ {{x_{ - q}} \in {X_ - }} \end{array}} {K({x_{ - p}},{x_{ - q}})} /n_ - ^2 - 2 \displaystyle \sum\limits_{{x_{ - m}} \in {X_ - }} {K({x_ - },{x_{ - m}})} /{n_ - }} \right)^{\dfrac{1}{2}} \end{array}$ (17)
 $\begin{array}{l} d_{in}^ + = |\varphi ({x_ - }) - {m_ + }| = {\left( {\left| {\varphi ({x_ - }) - \displaystyle \sum\limits_{{x_{_ + }} \in {X_ + }} {\varphi ({x_ + })} /{n_ + }} \right|_2^2} \right)^{\dfrac{1}{2}}} {\rm{ = }} \left( {K({x_ - },{x_ - }) + \displaystyle \sum\limits_{\begin{array}{*{20}{c}} {{x_{ + p}} \in {X_ + }}\\ {{x_{ + q}} \in {X_ + }} \end{array}} {K({x_{ + p}},{x_{ + q}})} /n_ + ^2 - 2\displaystyle \sum\limits_{{x_{ + m}} \in {X_ + }} {K({x_ - },{x_{ + m}})} /{n_ + }} \right)^{\dfrac{1}{2}} \end{array}$ (18)
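Eqs. (14) to (18) need only kernel evaluations, never the mapping $\varphi$ itself. The sketch below computes the center distance $d$ and the distance from one sample to a class center. The function names are hypothetical, and a linear kernel is used only so that the results can be checked by hand against ordinary Euclidean distances.

```python
import math

def linear_kernel(x, z):
    return sum(p * q for p, q in zip(x, z))

def center_distance(X_pos, X_neg, kernel):
    """Distance between the two class centers in feature space, Eq. (14)."""
    n_pos, n_neg = len(X_pos), len(X_neg)
    pp = sum(kernel(p, q) for p in X_pos for q in X_pos) / n_pos ** 2
    nn = sum(kernel(p, q) for p in X_neg for q in X_neg) / n_neg ** 2
    pn = sum(kernel(p, q) for p in X_pos for q in X_neg) / (n_pos * n_neg)
    # max() guards against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, pp + nn - 2.0 * pn))

def distance_to_center(x, X_class, kernel):
    """Distance from phi(x) to the center of X_class, Eqs. (15)-(18)."""
    n = len(X_class)
    cc = sum(kernel(p, q) for p in X_class for q in X_class) / n ** 2
    xc = sum(kernel(x, p) for p in X_class) / n
    return math.sqrt(max(0.0, kernel(x, x) + cc - 2.0 * xc))

X_pos = [[0.0, 0.0], [2.0, 0.0]]  # center (1, 0)
X_neg = [[4.0, 3.0], [6.0, 3.0]]  # center (5, 3)
print(center_distance(X_pos, X_neg, linear_kernel))            # 5.0
print(distance_to_center([0.0, 0.0], X_pos, linear_kernel))    # 1.0
```

Passing a different kernel function changes the feature space without changing either routine.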

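1) Compute the distance $\scriptstyle d$ between the two class centers, and for each positive sample compute its mutual distance $\scriptstyle d_{ip}^ -$ to the center of the negative class;

2) Compare the mutual distance with the center distance: if $\scriptstyle d_{ip}^ - > d$ , the sample mostly lies outside the dark region of Fig. 1; if $\scriptstyle d_{ip}^ - \le d$ , the sample mostly lies inside the dark region of Fig. 1;

3) For the sample points with $\scriptstyle d_{ip}^ - \le d$ , compute $\scriptstyle d_{ip}^ +$ and denote the maximum value by $\scriptstyle {R^ + }$ ;

4) Obtain $\scriptstyle {R^ - }$ for the negative samples in the same way.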

 $\left\{ \begin{array}{l} {s_{ip}} = {{d_{ip}^ + } / {{R^ + }}},\;\;d_{ip}^ - \le d\\ {s_{ip}} = \delta ,\;\;d_{ip}^ - > d \end{array} \right.\;\;\;{y_i} = 1$ (19)
 $\left\{ \begin{array}{l} {s_{in}} = {{d_{in}^ - } / {{R^ - }}},\;\;d_{in}^ + \le d\\ {s_{in}} = \delta ,\;\;d_{in}^ + > d \end{array} \right.\;\;\;{y_i} = - 1$ (20)
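
Steps 1) to 4) and Eqs. (19) and (20) can be sketched as follows. The code applies the formulas literally as written. The value of $\delta$ is not given in this section, so `delta=0.01` is an assumption, as are the function names and the fallback to $\delta$ when the radius is zero. Calling the function with the two classes swapped gives the negative-class memberships of Eq. (20).

```python
import math

def linear_kernel(x, z):
    return sum(p * q for p, q in zip(x, z))

def distance_to_center(x, X_class, kernel):
    n = len(X_class)
    cc = sum(kernel(p, q) for p in X_class for q in X_class) / n ** 2
    xc = sum(kernel(x, p) for p in X_class) / n
    return math.sqrt(max(0.0, kernel(x, x) + cc - 2.0 * xc))

def center_distance(X_a, X_b, kernel):
    aa = sum(kernel(p, q) for p in X_a for q in X_a) / len(X_a) ** 2
    bb = sum(kernel(p, q) for p in X_b for q in X_b) / len(X_b) ** 2
    ab = sum(kernel(p, q) for p in X_a for q in X_b) / (len(X_a) * len(X_b))
    return math.sqrt(max(0.0, aa + bb - 2.0 * ab))

def memberships(X_own, X_other, kernel, delta=0.01):
    """Memberships of the samples in X_own, following Eq. (19) or Eq. (20)."""
    d = center_distance(X_own, X_other, kernel)                           # step 1
    d_other = [distance_to_center(x, X_other, kernel) for x in X_own]
    d_own = [distance_to_center(x, X_own, kernel) for x in X_own]
    inside = [do for do, dt in zip(d_own, d_other) if dt <= d]            # step 2
    radius = max(inside, default=0.0)                                     # step 3
    result = []
    for do, dt in zip(d_own, d_other):
        if dt <= d and radius > 0.0:
            result.append(do / radius)
        else:
            # Samples outside the region get delta; a zero radius also falls
            # back to delta (assumption, the source does not cover this case).
            result.append(delta)
    return result

X_pos = [[1.0, 0.0], [2.0, 0.0], [4.0, 0.0], [-7.0, 0.0]]
X_neg = [[10.0, 0.0], [12.0, 0.0]]
print(memberships(X_pos, X_neg, linear_kernel))  # [0.25, 0.5, 1.0, 0.01]
print(memberships(X_neg, X_pos, linear_kernel))  # [1.0, 0.01]
```

In the toy data the positive sample at (-7, 0) is farther from the negative center than the two centers are from each other, so it is treated as lying outside the region and receives the small membership $\delta$.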
4 Experimental Results and Analysis

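TN denotes samples that are actually negative and are classified as negative; FN denotes samples that are actually positive but are classified as negative; FP denotes samples that are actually negative but are classified as positive.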
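The counts can be obtained from true and predicted labels as sketched below. The sketch assumes the standard convention (FP: actually negative, classified as positive; FN: actually positive, classified as negative) and labels in {+1, -1}. TP, the per-class recall formulas, and the function names are not taken from this section; they are the usual textbook definitions.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for labels in {+1, -1}."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == -1 and p == -1:
            tn += 1
        elif t == -1 and p == 1:
            fp += 1  # actually negative, classified as positive
        else:
            fn += 1  # actually positive, classified as negative
    return tp, tn, fp, fn

def class_recalls(tp, tn, fp, fn):
    """Recall of the positive class and of the negative class (textbook formulas)."""
    return tp / (tp + fn), tn / (tn + fp)

y_true = [1, 1, -1, -1, -1]
y_pred = [1, -1, -1, -1, 1]
counts = confusion_counts(y_true, y_pred)
print(counts)                  # (1, 2, 1, 1)
print(class_recalls(*counts))
```

Reporting the two classes separately matters on imbalanced data, where overall accuracy can stay high even if most minority-class samples are misclassified.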

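SVM: a support vector machine with an equidistant hyper-plane and no membership function.

FSVM: a fuzzy support vector machine with an equidistant hyper-plane and a linear membership function.

IFD-FSVM: the improved fuzzy support vector machine that applies the inequality hyper-plane distance.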

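 Fig. 2 Comparison of classification accuracy on negative-class samples

 Fig. 3 Comparison of recall on negative-class samples

 Fig. 4 Comparison of classification accuracy on positive-class samples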

5 Conclusion and Future Work

