Clustering algorithms have been widely applied in fields such as economics, computer science, and astronomy [1,2]. Traditional clustering algorithms partition crisp objects according to exact numerical values, but as society advances, fuzzy data and fuzzy models have become a new trend, which means that traditional hard-partition clustering must gradually give way to soft-partition clustering [3]. The prerequisite for studying fuzzy clustering is fuzzy set theory, because fuzzy clustering partitions objects on the basis of fuzzy sets. Zadeh [4] first introduced fuzzy linguistic theory and then applied fuzzy sets to multi-criteria decision-making (MCDM) problems, known as fuzzy MCDM. Later, Torra [5] proposed hesitant fuzzy sets (HFSs), which allow an attribute to be evaluated with several values in [0, 1] and thereby strengthen the modelling of fuzziness. In practical problems, however, the available data are more often qualitative information rather than quantitative values [6,7]. For example, when people evaluate the performance of a car, they tend to express their assessments with linguistic terms such as "poor", "good", or "very good". Zadeh therefore proposed modelling evaluation information with fuzzy linguistic approaches, the most typical models being the type-2 fuzzy set model [8], the 2-tuple linguistic model [9], and the virtual linguistic model [10]. The drawback of these linguistic models is that they allow only a single linguistic term for each attribute of an object [11]. Combining the idea of hesitant fuzzy sets with the fuzzy linguistic approach, Rodríguez et al. [12] proposed hesitant fuzzy linguistic term sets (HFLTSs), which allow an attribute of an object to be described by several linguistic terms and thus make attribute evaluation more flexible.
There is already a considerable body of research on fuzzy clustering. For instance, references [13] and [14] proposed clustering methods based on intuitionistic fuzzy sets (IFSs); reference [15] proposed a minimum-spanning-tree (MST) clustering method for the hesitant fuzzy environment; reference [16] computes the correlation coefficients of hesitant fuzzy sets to obtain a correlation coefficient matrix, constructs its equivalent matrix, and finally cuts that matrix with a confidence level λ to obtain the clustering result; reference [17] proposed a hierarchical hesitant fuzzy k-means clustering method that uses the result of hierarchical clustering as the initial cluster centers of k-means and iterates to obtain the final clustering, which reduces the number of k-means iterations, the computational cost, and the clustering time; more recently, reference [18] extended the method of [16] to hesitant multiplicative sets (HMSs) with promising results; and reference [19] proposed an orthogonal clustering algorithm for the hesitant fuzzy environment. However, there is still no mature clustering method based on HFLTSs, even though in practice HFLTSs are used more widely and more flexibly than HFSs, IFSs, or HMSs. This paper therefore proposes a new orthogonal fuzzy clustering algorithm for HFLTSs.
1 Theoretical foundations
1.1 Hesitant fuzzy linguistic term sets
Definition 1 [12]. Let $S = \{ s_t \mid t = -\tau, \cdots, \tau \}$ be a linguistic term set. A hesitant fuzzy linguistic term set (HFLTS) $H_S$ on $S$ is an ordered finite subset of consecutive linguistic terms of $S$:
$H_S = \{ s_i, s_{i+1}, \cdots, s_j \}$
where $s_i, s_j \in S$ and $i \le j$. The elements of an HFLTS are called hesitant fuzzy linguistic term elements (HFLTEs). The lower and upper bounds of an HFLTS are its minimum and maximum terms, $H_S^- = s_i$ and $H_S^+ = s_j$, respectively. Two special cases are:
(1) If $H_S = \varnothing$, $H_S$ is called the empty HFLTS;
(2) If $H_S = S$, $H_S$ is called the full HFLTS.
Remark 1. In Definition 1 an HFLTS consists of terms with discrete subscripts. To avoid losing linguistic information, the discrete form can be extended to a continuous one, i.e., $\bar S = \{ s_\alpha \mid \alpha \in [-\tau, \tau] \}$, whose non-original elements are virtual linguistic terms.
Reference [20] proposed a context-free grammar with which simple but rich linguistic expressions can be converted into HFLTSs through a transformation function [21].
Definition 2. Let $E_{G_{\rm H}}$ be the function that transforms the linguistic expressions $S_{ll}$ generated by the context-free grammar $G_{\rm H}$ into HFLTSs:
$E_{G_{\rm H}}: S_{ll} \to H_S$
The transformation rules are as follows:
(1) $E_{G_{\rm H}}(s_i) = \{ s_i \}$;
(2) $E_{G_{\rm H}}({\rm at\ most\ } s_i) = \{ s_j \mid s_j \in S \ {\rm and}\ s_j \le s_i \}$;
(3) $E_{G_{\rm H}}({\rm lower\ than\ } s_i) = \{ s_j \mid s_j \in S \ {\rm and}\ s_j < s_i \}$;
(4) $E_{G_{\rm H}}({\rm at\ least\ } s_i) = \{ s_j \mid s_j \in S \ {\rm and}\ s_j \ge s_i \}$;
(5) $E_{G_{\rm H}}({\rm greater\ than\ } s_i) = \{ s_j \mid s_j \in S \ {\rm and}\ s_j > s_i \}$;
(6) $E_{G_{\rm H}}({\rm between\ } s_i \ {\rm and}\ s_j) = \{ s_k \mid s_k \in S \ {\rm and}\ s_i \le s_k \le s_j \}$.
Example 1. Let S = {extremely poor, very poor, poor, fair, good, very good, extremely good} be the linguistic term set for evaluating books, and suppose an evaluator assesses three attributes of three books as follows:
${\text{Evaluator}} = \left( {\begin{array}{*{20}{c}} {{\text{fair}}} & {{\text{between good and very good}}} & {{\text{very good}}}\\ {{\text{at most poor}}} & {{\text{greater than good}}} & {{\text{good}}}\\ {{\text{at least good}}} & {{\text{between very poor and poor}}} & {{\text{poor}}} \end{array}} \right)$
Applying the transformation function $E_{G_{\rm H}}$ of Definition 2 converts these expressions into HFLTSs:
${\text{Evaluator}} = \left( {\begin{array}{*{20}{c}} {\left\{ {{\text{fair}}} \right\}} & {\left\{ {{\text{good}},\;{\text{very good}}} \right\}} & {\left\{ {{\text{very good}}} \right\}}\\ {\left\{ {{\text{extremely poor}},\;{\text{very poor}},\;{\text{poor}}} \right\}} & {\left\{ {{\text{very good}},\;{\text{extremely good}}} \right\}} & {\left\{ {{\text{good}}} \right\}}\\ {\left\{ {{\text{good}},\;{\text{very good}},\;{\text{extremely good}}} \right\}} & {\left\{ {{\text{very poor}},\;{\text{poor}}} \right\}} & {\left\{ {{\text{poor}}} \right\}} \end{array}} \right)$
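To make the conversion concrete, the following Python sketch illustrates the transformation rules above on the term set of Example 1 (this is our illustration, not code from the paper; the function name convert_expression and the list-based representation of terms are assumptions):

```python
# A minimal sketch of the transformation function E_GH of Definition 2.
S = ["extremely poor", "very poor", "poor", "fair", "good", "very good", "extremely good"]

def convert_expression(expr: str) -> list:
    """Map a simple comparative expression onto an HFLTS, i.e. a run of consecutive terms of S."""
    idx = {t: i for i, t in enumerate(S)}
    if expr.startswith("at most "):
        return S[: idx[expr[len("at most "):]] + 1]            # rule (2)
    if expr.startswith("lower than "):
        return S[: idx[expr[len("lower than "):]]]             # rule (3)
    if expr.startswith("at least "):
        return S[idx[expr[len("at least "):]]:]                # rule (4)
    if expr.startswith("greater than "):
        return S[idx[expr[len("greater than "):]] + 1:]        # rule (5)
    if expr.startswith("between "):
        lo, hi = expr[len("between "):].split(" and ")
        return S[idx[lo]: idx[hi] + 1]                         # rule (6)
    return [expr]                                              # rule (1): a single term

print(convert_expression("between good and very good"))  # ['good', 'very good']
print(convert_expression("at most poor"))                # ['extremely poor', 'very poor', 'poor']
print(convert_expression("greater than good"))           # ['very good', 'extremely good']
```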
2 Distance measures for HFLTSs
Distance measures are among the most important tools of clustering analysis [22]. This section reviews the traditional distance measures for HFLTSs and the improved measures used in this paper.
2.1 Traditional distance measures
Definition 3 [23]. Let $H_S^1$ and $H_S^2$ be two HFLTSs on $S = \{ s_t \mid t = -\tau, \cdots, \tau \}$ with the same number $L$ of HFLTEs, and let $\delta_l^1$ and $\delta_l^2$ denote the subscripts of the $l$-th HFLTEs of $H_S^1$ and $H_S^2$. The generalized distance between $H_S^1$ and $H_S^2$ is
$d_{gd}(H_S^1, H_S^2) = \left( \dfrac{1}{L} \sum\limits_{l=1}^{L} \left( \dfrac{|\delta_l^1 - \delta_l^2|}{2\tau + 1} \right)^{\lambda} \right)^{1/\lambda}$  (1)
When $\lambda = 1$, the generalized distance reduces to the Hamming distance:
$d_{hd}(H_S^1, H_S^2) = \dfrac{1}{L} \sum\limits_{l=1}^{L} \dfrac{|\delta_l^1 - \delta_l^2|}{2\tau + 1}$  (2)
When $\lambda = 2$, it reduces to the Euclidean distance:
$d_{ed}(H_S^1, H_S^2) = \left( \dfrac{1}{L} \sum\limits_{l=1}^{L} \left( \dfrac{|\delta_l^1 - \delta_l^2|}{2\tau + 1} \right)^{2} \right)^{1/2}$  (3)
The traditional distance formulas require the two HFLTSs to contain the same number of HFLTEs, whereas, as Example 1 shows, two HFLTSs may well contain different numbers of HFLTEs. The traditional approach therefore pads the HFLTS with fewer HFLTEs using its maximum, minimum, or average element until the two HFLTSs have the same number of HFLTEs [24].
Example 2. Let $H_S^1$ and $H_S^2$ be two HFLTSs on $S$ ($\tau = 3$) whose subscript sequences, after padding to a common length of three, are $(1, 1.5, 2)$ and $(-1, 0, 1)$. Their Euclidean distance is
$d_{ed}(H_S^1, H_S^2) = \sqrt{ \dfrac{1}{3}\left( \left( \dfrac{|1-(-1)|}{7} \right)^2 + \left( \dfrac{|1.5 - 0|}{7} \right)^2 + \left( \dfrac{|2-1|}{7} \right)^2 \right) } = 0.222$
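The padded distance computation can be sketched in Python as follows (our illustration; the helper names pad_to and generalized_distance are not from the paper, and average padding is just one of the conventions mentioned above). With τ = 3 it reproduces the value of Example 2:

```python
# Sketch of the traditional generalized distance of Eq. (1), with average-value padding.
def pad_to(subscripts, length):
    """Pad a list of term subscripts with its average value until it has `length` elements."""
    subs = sorted(subscripts)
    while len(subs) < length:
        subs.append(sum(subs) / len(subs))
        subs.sort()
    return subs

def generalized_distance(h1, h2, tau=3, lam=2):
    """Generalized distance between two HFLTSs given by their subscripts (lam=1: Hamming, lam=2: Euclidean)."""
    L = max(len(h1), len(h2))
    a, b = pad_to(h1, L), pad_to(h2, L)
    s = sum((abs(x - y) / (2 * tau + 1)) ** lam for x, y in zip(a, b))
    return (s / L) ** (1 / lam)

# Example 2: subscript sequences (1, 1.5, 2) and (-1, 0, 1) give d_ed ~ 0.222.
print(round(generalized_distance([1, 2], [-1, 0, 1]), 3))  # 0.222
```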
When more than two HFLTSs are involved, the traditional approach compares the numbers of HFLTEs pair by pair and pads each pair separately before computing the corresponding distance.
Example 3. Let $H_S^1$, $H_S^2$ and $H_S^3$ be three HFLTSs that contain different numbers of HFLTEs. Computing the three pairwise distances with the traditional measures requires padding each pair separately, so the same HFLTS may be padded to different lengths in different comparisons; the representation of a sample is therefore not consistent across comparisons, and the amount of padding work grows with the number of pairs.
To remedy these drawbacks of the traditional distance measures, this paper improves them and restates the distance as Definition 4.
Definition 4. Let $H_S^1$ and $H_S^2$ be two HFLTSs on $S$, and let $L$ be the common number of HFLTEs obtained by padding all HFLTSs under comparison to the length of the longest one. The generalized distance is then
$d_{gd}(H_S^1, H_S^2) = \left( \dfrac{1}{L} \sum\limits_{l=1}^{L} \left( \dfrac{|\delta_l^1 - \delta_l^2|}{2\tau + 1} \right)^{\lambda} \right)^{1/\lambda}$  (4)
Example 4. Padding the three HFLTSs of Example 3 once, to a single common length, and recomputing the pairwise distances with Eq. (4) gives each HFLTS a single representation in every comparison, so the results are consistent and the padding is done only once.
The traditional distance measures for HFLTSs consider only the differences between the values of the HFLTEs and ignore the difference in their number. Reference [25] incorporated the hesitancy degree into the distance measure, improving the accuracy and reliability of distance computation for HFSs. Inspired by this, reference [26] also took the hesitancy degree into account in the distance formulas for HFLTSs and proposed new distance measures.
Definition 5 [20]. Let $H_S$ be an HFLTS on $S$ whose HFLTEs have subscripts $\delta_l$ ($l = 1, \cdots, |H_S|$). The hesitancy degree of $H_S$ is
$\mu(H_S) = \dfrac{ \dfrac{3}{|H_S|} \sum\limits_{l=1}^{|H_S|} (\delta_l - \overline{\delta})^2 }{ \tau(\tau + 1) }$  (5)
where $|H_S|$ is the number of HFLTEs in $H_S$ and $\overline{\delta} = \dfrac{1}{|H_S|}\sum\limits_{l=1}^{|H_S|} \delta_l$ is the mean of the subscripts.
Definition 6 [25]. Let $H_S^1$ and $H_S^2$ be two HFLTSs on $S$ with $L$ HFLTEs each (after padding). The generalized distance that accounts for the hesitancy degree is
$d_{dg}(H_S^1, H_S^2) = \left( \alpha |\mu(H_S^1) - \mu(H_S^2)|^{\lambda} + \beta \left( \dfrac{1}{L} \sum\limits_{l=1}^{L} \left( \dfrac{|\delta_l^1 - \delta_l^2|}{2\tau + 1} \right)^{\lambda} \right) \right)^{1/\lambda}$  (6)
where $\alpha, \beta \in [0, 1]$ with $\alpha + \beta = 1$ weight the contribution of the hesitancy-degree difference and of the value difference, respectively, and $\lambda > 0$.
Extending the above distance measure to the case of multiple attributes gives Definition 7.
Definition 7. Let $H_S^1$ and $H_S^2$ be two samples evaluated on the attribute set $X = \{x_1, \cdots, x_n\}$, where $H_S^k(x_i)$ is the HFLTS of sample $k$ on attribute $x_i$ and $L(x_i)$ is the (padded) number of HFLTEs on $x_i$. The distance between the two samples is
$d_{ngd}(H_S^1, H_S^2) = \dfrac{1}{n}\sum\limits_{i=1}^{n} \left( \alpha |\mu(H_S^1(x_i)) - \mu(H_S^2(x_i))|^{\lambda} + \beta \left( \dfrac{1}{L(x_i)} \sum\limits_{l=1}^{L(x_i)} \left( \dfrac{|\delta_l^{1i} - \delta_l^{2i}|}{2\tau + 1} \right)^{\lambda} \right) \right)^{1/\lambda}$  (7)
where $\delta_l^{1i}$ and $\delta_l^{2i}$ are the subscripts of the $l$-th HFLTEs of $H_S^1(x_i)$ and $H_S^2(x_i)$, and $\alpha$, $\beta$ and $\lambda$ are as in Definition 6.
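A Python sketch of Eqs. (5) and (7) is given below (our illustration; the helper names and the parameter values α = β = 0.5, λ = 2 are assumptions, since the excerpt does not state which values the case study uses, so the printed number need not match the matrix D of Section 4):

```python
# Sketch of the hesitancy degree (Eq. (5)) and of the multi-attribute distance d_ngd (Eq. (7)).
def hesitancy_degree(subs, tau=3):
    """Hesitancy degree of an HFLTS given by its subscripts."""
    mean = sum(subs) / len(subs)
    return (3 / len(subs)) * sum((d - mean) ** 2 for d in subs) / (tau * (tau + 1))

def d_ngd(sample1, sample2, tau=3, lam=2, alpha=0.5, beta=0.5):
    """Distance between two samples, each a list of HFLTSs (lists of subscripts, already padded per attribute)."""
    total = 0.0
    for h1, h2 in zip(sample1, sample2):
        value_part = sum((abs(a - b) / (2 * tau + 1)) ** lam for a, b in zip(h1, h2)) / len(h1)
        hesitancy_part = abs(hesitancy_degree(h1, tau) - hesitancy_degree(h2, tau)) ** lam
        total += (alpha * hesitancy_part + beta * value_part) ** (1 / lam)
    return total / len(sample1)

# Customers x1 and x2 from the normalized matrix of Section 4 (subscripts only).
x1 = [[0, 0.5, 1], [-2, -1.5, -1], [0, 0.5, 1], [1, 1.5, 2], [1, 2, 3]]
x2 = [[2, 2, 2], [-1, 0, 1], [-2, -1.5, -1], [1, 1, 1], [1, 1.5, 2]]
print(round(d_ngd(x1, x2), 4))
```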
3 Orthogonal fuzzy clustering algorithm
Recently, reference [19] proposed an orthogonal clustering algorithm based on HFSs that simplifies the clustering process, lowers the algorithmic complexity, and improves efficiency. Its steps are as follows.
Algorithm 1. Orthogonal fuzzy clustering algorithm based on HFSs.
Step 1. Let $X = \{x_1, x_2, \cdots, x_m\}$ be the $m$ samples to be clustered, each described by HFSs, and compute the distance measure matrix ${\rm D} = (d_{ij})_{m \times m}$ between the samples.
Step 2. Select the confidence level $\lambda$; every distinct value appearing in ${\rm D}$ is used in turn as a confidence level.
Step 3. Perform the $\lambda$-cut on ${\rm D}$ to obtain the Boolean matrix ${\rm D}_\lambda = (b_{ij})_{m \times m}$, where $b_{ij} = 1$ if $d_{ij} < \lambda$ and $b_{ij} = 0$ otherwise.
Step 4. Cluster the samples according to the orthogonality between the column vectors of ${\rm D}_\lambda$ (a sketch of the orthogonality test follows this list):
(1) If the column vectors of samples $x_i$ and $x_j$ are orthogonal (their inner product is 0), then $x_i$ and $x_j$ cannot be placed in the same cluster;
(2) If there exist columns that are pairwise non-orthogonal, the corresponding samples are placed in the same cluster;
(3) If the column vector of a sample is orthogonal to all other columns, that sample forms a cluster by itself.
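As a sketch of the orthogonality test used in Step 4 (our formulation of the rule described above, not the implementation of [19]), two Boolean columns of a λ-cut matrix are orthogonal exactly when their inner product is zero:

```python
import numpy as np

def columns_orthogonal(cut_matrix: np.ndarray, i: int, j: int) -> bool:
    """Columns i and j of a Boolean lambda-cut matrix are orthogonal iff their inner product is 0;
    in that case samples i and j cannot be placed in the same cluster."""
    return int(cut_matrix[:, i] @ cut_matrix[:, j]) == 0

# Toy 0/1 cut matrix for four samples; sample 3 is isolated from the other three.
B = np.array([[1, 1, 1, 0],
              [1, 1, 1, 0],
              [1, 1, 1, 0],
              [0, 0, 0, 1]])
print(columns_orthogonal(B, 0, 1))  # False -> samples 0 and 1 may share a cluster
print(columns_orthogonal(B, 0, 3))  # True  -> samples 0 and 3 belong to different clusters
```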
To illustrate the computational cost, reference [19] randomly generated HFSs and compared the orthogonal fuzzy clustering algorithm with the fuzzy netting clustering method. Table 1 lists the running time each method needs before the clustering result is obtained; the orthogonal fuzzy clustering algorithm clearly consumes less time.
The algorithm nevertheless has a drawback: when the number of samples is large, the distance measure matrix becomes very high-dimensional, and if every distinct value in the matrix is used as a confidence level for λ-cutting it, the number of cuts, and hence the amount of computation, becomes very large.
If the difficulty of handling the high-dimensional matrix produced by a large number of samples can be overcome, the computational complexity of Algorithm 1 can be reduced further. The solution adopted in this paper is to reduce the diversity of the values inside the distance measure matrix, thereby shrinking the set of confidence levels λ that have to be considered and lowering the computational cost.
The orthogonal fuzzy clustering procedure based on HFLTSs is given as Algorithm 2.
Algorithm 2. Orthogonal fuzzy clustering algorithm based on HFLTSs.
Step 1. Let $X = \{x_1, x_2, \cdots, x_m\}$ be the $m$ samples to be clustered, each evaluated on $n$ attributes with HFLTSs, and normalize the evaluations so that the HFLTSs on each attribute contain the same number of HFLTEs.
Step 2. Compute the distance measure matrix ${\rm D} = (d_{ij})_{m \times m}$ with Eq. (7), and compose the matrix with itself repeatedly, ${\rm D} \to {\rm D}^2 \to {\rm D}^4 \to \cdots$, where $({\rm D} \circ {\rm D})_{ij} = \mathop{\min}\limits_k \max(d_{ik}, d_{kj})$,
until ${\rm D}^{2k} = {\rm D}^k$; the resulting stable matrix contains far fewer distinct values than ${\rm D}$.
Step 3. Take λ, in decreasing order, from the distinct values of the stable matrix.
Step 4. Perform the λ-cut on the stable matrix for each λ to obtain the Boolean matrix ${\rm D}_\lambda$.
Step 5. Cluster the samples according to the orthogonality between the column vectors of ${\rm D}_\lambda$, which gives the clustering result.
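The whole of Algorithm 2 can be sketched as follows (a minimal Python illustration under our reading of Steps 2–5, not the authors' code; the min-max composition and the strict λ-cut with b_ij = 1 iff d_ij < λ are the conventions assumed above). Under this reading, λ = 0.1983 applied to the stable matrix of Section 4 reproduces the two-cluster split {x4} versus the remaining nine samples.

```python
import numpy as np

def minmax_composition(D: np.ndarray) -> np.ndarray:
    """One composition step: (D o D)_ij = min_k max(D_ik, D_kj)."""
    m = D.shape[0]
    return np.array([[np.min(np.maximum(D[i, :], D[:, j])) for j in range(m)] for i in range(m)])

def stable_matrix(D: np.ndarray) -> np.ndarray:
    """Step 2: square the matrix (D -> D^2 -> D^4 -> ...) until it stops changing."""
    while True:
        D2 = minmax_composition(D)
        if np.allclose(D2, D):
            return D
        D = D2

def clusters_for_lambda(D_stable: np.ndarray, lam: float) -> list:
    """Steps 4-5: lambda-cut (b_ij = 1 iff d_ij < lambda), then group samples whose columns are not orthogonal."""
    B = (D_stable < lam).astype(int)
    np.fill_diagonal(B, 1)
    clusters, assigned = [], set()
    for i in range(B.shape[0]):
        if i in assigned:
            continue
        members = [j for j in range(B.shape[0]) if B[:, i] @ B[:, j] > 0]
        clusters.append(members)
        assigned.update(members)
    return clusters

def orthogonal_clustering(D: np.ndarray) -> dict:
    """Step 3: one clustering per distinct non-zero value of the stable matrix, taken in decreasing order."""
    S = stable_matrix(D)
    lambdas = sorted({round(float(v), 4) for v in S.flatten() if v > 0}, reverse=True)
    return {lam: clusters_for_lambda(S, lam) for lam in lambdas}
```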
K-means is a widely used clustering algorithm that requires a given value of k specifying the number of clusters into which the objects are to be divided. It first computes the centroids of the initial clusters, then computes the distance of every data point to the centroids to obtain new cluster centroids, and iterates this process until the centroids no longer move, at which point clustering ends. The algorithm is accurate and is one of the most widely used clustering methods, but its efficiency depends heavily on the choice of k and of the initial centroids; a poor choice leads to many iterations, heavy computation, and a high time cost. This paper therefore draws only on its strengths and uses it to verify the accuracy of the clustering results produced by the proposed algorithm.
The clustering result of Algorithm 2 is taken as the initial partition for k-means, one k-means iteration is performed, and the resulting clustering is compared with that of Algorithm 2; if the two agree, the clustering result is considered accurate.
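The verification can be sketched as a single k-means assignment pass (our illustration; the element-wise averaging of subscript arrays used for the cluster centers and the Euclidean distance are simplifying assumptions, and the paper's own distance measure (7) could be substituted):

```python
import numpy as np

def one_kmeans_pass(data: np.ndarray, clusters: list) -> list:
    """Reassign every sample to its nearest cluster center and return the new clusters.
    `data` has shape (m_samples, n_attributes, n_terms); `clusters` is a list of lists of sample indices."""
    centers = [data[idx].mean(axis=0) for idx in clusters]          # element-wise mean as the cluster center
    new_clusters = [[] for _ in centers]
    for s in range(data.shape[0]):
        dists = [np.linalg.norm(data[s] - c) for c in centers]      # Euclidean distance on subscript arrays
        new_clusters[int(np.argmin(dists))].append(s)
    return new_clusters

def clustering_confirmed(data: np.ndarray, clusters: list) -> bool:
    """True if one k-means pass leaves the clustering produced by Algorithm 2 unchanged."""
    reassigned = one_kmeans_pass(data, clusters)
    return sorted(map(sorted, reassigned)) == sorted(map(sorted, clusters))
```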
4 Case study
Clustering analysis is common in virtually every industry, and customer segmentation is one of the most frequent analysis needs. This section uses customer segmentation as an example to verify the feasibility and efficiency of the proposed orthogonal fuzzy clustering algorithm.
Suppose a company wants to segment ten of its customers according to five factors $x_1$–$x_5$, the first of which, $x_1$, is consumption level. Each customer is evaluated on every factor with HFLTSs over the linguistic term set $S = \{s_{-3}, \cdots, s_3\}$ ($\tau = 3$).
Step 1. Normalize the evaluation information, i.e., pad the HFLTSs with fewer elements so that all HFLTSs contain the same number of HFLTEs (a code sketch of this padding follows the matrix):
${\left\{ {{g_{ij}}} \right\}_{m \times n}} = \left( \begin{array}{*{20}{c}}\left\{ { {s_0},{s_{0.5}},{s_1}} \right\} & \left\{ { {s_{ - 2}},{s_{ - 1.5}},{s_{ - 1}}} \right\} & \left\{ { {s_0},{s_{0.5}},{s_1}} \right\} & \left\{ { {s_1},{s_{1.5}},{s_2}} \right\} & \left\{ { {s_1},{s_2},{s_3}} \right\}\\\left\{ { {s_2},{s_2},{s_2}} \right\} & \left\{ { {s_{ - 1}},{s_0},{s_1}} \right\} & \left\{ { {s_{ - 2}},{s_{ - 1.5}},{s_{ - 1}}} \right\} & \left\{ { {s_1},{s_1},{s_1}} \right\} & \left\{ { {s_1},{s_{1.5}},{s_2}} \right\} \\\left\{ { {s_0},{s_{0.5}},{s_1}} \right\} & \left\{ { {s_0},{s_{0.5}},{s_1}} \right\} & \left\{ { {s_0},{s_1},{s_2}} \right\} & \left\{ { {s_1},{s_2},{s_3}} \right\} & \left\{ { {s_0},{s_1},{s_2}} \right\} \\\left\{ { {s_{ - 2}},{s_{ - 1.5}},{s_{ - 1}}} \right\} & \left\{ { {s_{ - 2}},{s_{ - 1.5}},{s_{ - 1}}} \right\} & \left\{ { {s_1},{s_{1.5}},{s_2}} \right\} & \left\{ { {s_3},{s_3},{s_3}} \right\} & \left\{ { {s_{ - 1}},{s_0},{s_1} } \right\}\\\left\{ { {s_1},{s_{1.5}},{s_2}} \right\} & \left\{ { {s_1},{s_{1.5}},{s_2}} \right\} & \left\{ { {s_0},{s_{0.5}},{s_1}} \right\} & \left\{ { {s_2},{s_{2.5}},{s_3}} \right\} & \left\{ { {s_0},{s_{0.5}},{s_1}} \right\} \\\left\{ { {s_{ - 1}},{s_0},{s_1} } \right\} & \left\{ { {s_1},{s_1},{s_1}} \right\} & \left\{ { {s_1},{s_{1.5}},{s_2}} \right\} & \left\{ { {s_1},{s_2},{s_3}} \right\} & \left\{ { {s_{ - 1}},{s_0},{s_1} } \right\}\\\left\{ { {s_1},{s_2},{s_3}} \right\} & \left\{ { {s_0},{s_1},{s_2}} \right\} & \left\{ { {s_{ - 2}},{s_{ - 1.5}},{s_{ - 1}}} \right\} & \left\{ { {s_0},{s_1},{s_2}} \right\} & \left\{ { {s_1},{s_{1.5}},{s_2}} \right\} \\\left\{ { {s_0},{s_{0.5}},{s_1}} \right\} & \left\{ { {s_{ - 3}},{s_{ - 2.5}},{s_{ - 2}}} \right\} & \left\{ { {s_0},{s_1},{s_2}} \right\} & \left\{ { {s_1},{s_2},{s_3}} \right\} & \left\{ { {s_1},{s_{1.5}},{s_2}} \right\} \\\left\{ { {s_1},{s_{1.5}},{s_2}} \right\} & \left\{ { {s_{ - 1}},{s_0},{s_1} } \right\} & \left\{ { {s_0},{s_{0.5}},{s_1}} \right\} & \left\{ { {s_2},{s_{2.5}},{s_3}} \right\} & \left\{ { {s_{ - 3}},{s_{ - 2.5}},{s_{ - 2}}} \right\} \\\left\{ { {s_0},{s_{0.5}},{s_1}} \right\} & \left\{ { {s_{ - 1}},{s_0},{s_1} } \right\} & \left\{ { {s_0},{s_1},{s_2}} \right\} & \left\{ { {s_1},{s_{1.5}},{s_2}} \right\} & \left\{ { {s_0},{s_1},{s_2}} \right\} \end{array} \right)$ |
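The padding in Step 1 can be sketched as follows (our illustration; average-value padding to three HFLTEs per entry is assumed, consistent with the virtual terms such as s_0.5 appearing in the matrix above, and the raw evaluations shown are hypothetical):

```python
# Sketch of Step 1: pad every HFLTS in the evaluation matrix to a common number of HFLTEs.
def normalize(matrix, length=3):
    """matrix[i][j] is the list of term subscripts of sample i on attribute j;
    shorter lists are padded with their average value."""
    return [[sorted(h + [sum(h) / len(h)] * (length - len(h))) for h in row] for row in matrix]

raw_x1 = [[0, 1], [-2, -1], [0, 1], [1, 2], [1, 2, 3]]   # hypothetical raw evaluations of customer x1
print(normalize([raw_x1])[0])
# [[0, 0.5, 1], [-2, -1.5, -1], [0, 0.5, 1], [1, 1.5, 2], [1, 2, 3]]  -> first row of the normalized matrix
```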
Step 2. Using the distance formula (7), compute the distance between every pair of samples ($\tau = 3$), which gives the distance measure matrix ${\rm D}$:
${\rm D} = \left( \begin{array}{*{20}{c}}{0} & {0.2274} & {0.1158} & {0.2222} & {0.2578} & {0.2354} & {0.3091} & {0.0486} & {0.3982} & {0.0791} \\{0.2274} & {0} & { 0.2707} & {0.6852} & {0.2146} & {0.4478} & {0.0533} & {0.3829} & {0.3892} & {0.2394} \\{0.1158} & {0.2707} & {0} & { 0.2355} & {0.0736} & {0.0506} & {0.2593} & {0.1878} & {0.2441} & {0.0189} \\{0.2222} & {0.6852} & {0.2355} & {0} & { 0.4450} & {0.2201} & {0.7787} & {0.1983} & {0.4054} & {0.2163} \\{0.2578} & {0.2146} & {0.0736} & {0.4450} & {0} & { 0.1132} & {0.1762} & {0.3808} & {0.1854} & {0.1106} \\{0.2354} & {0.4478} & {0.0506} & {0.2201} & {0.1132} & {0} & { 0.3945} & {0.3134} & {0.2303} & {0.0861} \\{0.3091} & {0.0533} & {0.2593} & {0.7788} & {0.1762} & {0.3945} & {0} & { 0.4915} & {0.4108} & {0.2610} \\{0.0486} & {0.3829} & {0.1878} & {0.1983} & {0.3808} & {0.3134} & {0.4915} & {0} & { 0.4163} & {0.1454} \\{0.3982} & {0.3892} & {0.2441} & {0.4054} & {0.1854} & {0.2303} & {0.4108} & {0.4163} & {0} & { 0.2402} \\{0.0791} & {0.2394} & {0.0189} & {0.2163} & {0.1106} & {0.0861} & {0.2610} & {0.1454} & {0.2402} & {0} \\\end{array} \right)$ |
Step 3. Compose the distance measure matrix with itself repeatedly, computing ${\rm D}^2$, ${\rm D}^4$, ${\rm D}^8$, ${\rm D}^{16}$, until the matrix no longer changes:
${{\rm D}^2} = \left( \begin{array}{*{20}{c}}{0 } & { 0.2274} & {0.0791} & {0.1983} & {0.1106} & {0.0861} & {0.2274} & {0.0486} & {0.2354} & {0.0791} \\[2pt]{0.2274} & {0 } & { 0.2146} & {0.2274} & {0.1762} & {0.2146} & {0.0533} & {0.2274} & {0.2146} & {0.2146} \\[2pt]{0.0791} & {0.2146} & {0 } & { 0.1983} & {0.0736} & {0.0506} & {0.1762} & {0.1158} & {0.1854} & {0.0189} \\[2pt]{0.1983} & {0.2274} & {0.1983} & {0 } & { 0.2163} & {0.2163} & {0.2593} & {0.1983} & {0.2303} & {0.1983} \\[2pt]{0.1106} & {0.1762} & {0.0736} & {0.2163} & {0 } & { 0.0736} & {0.1762} & {0.1454} & {0.1854} & {0.0736} \\[2pt]{0.0861} & {0.2146} & {0.0506} & {0.2163} & {0.0736} & {0 } & { 0.1762} & {0.1454} & {0.1854} & {0.0506} \\[2pt]{0.2274} & {0.0533} & {0.1762} & {0.2593} & {0.1762} & {0.1762} & {0 } & { 0.2593} & {0.1854} & {0.1762} \\[2pt]{0.0486} & {0.2274} & {0.1158} & {0.1983} & {0.1454} & {0.1454} & {0.2593} & {0 } & { 0.2402} & {0.0791} \\[2pt]{0.2354} & {0.2146} & {0.1854} & {0.2303} & {0.1854} & {0.1854} & {0.1854} & {0.2402} & {0 } & { 0.1854} \\[2pt]{0.0791} & {0.2146} & {0.0189} & {0.1983} & {0.0736} & {0.0506} & {0.1762} & {0.0791} & {0.1854} & {0} \\[2pt]\end{array} \right)$ |
${{\rm D}^4} = \left( \begin{array}{*{20}{c}}{0} & {0.1762} & {0.0791} & {0.1983} & {0.0791} & {0.0791} & {0.1762} & {0.0486} & {0.1854} & {0.0791} \\[1pt]{0.1762} & {0} & {0.1762} & {0.2146} & {0.1762} & {0.1762} & {0.0533} & {0.1762} & {0.1854} & {0.1762} \\[1pt]{0.0791} & {0.1762} & {0} & {0.1983} & {0.0736} & {0.0506} & {0.1762} & {0.0791} & {0.1854} & {0.0189} \\[1pt]{0.1983} & {0.2146} & {0.1983} & {0} & {0.1983} & {0.1983} & {0.1983} & {0.1983} & {0.1983} & {0.1983} \\[1pt]{0.0791} & {0.1762} & {0.0736} & {0.1983} & {0} & {0.0736} & {0.1762} & {0.0791} & {0.1854} & {0.0736} \\[1pt]{0.0791} & {0.1762} & {0.0506} & {0.1983} & {0.0736} & {0} & {0.1762} & {0.0791} & {0.1854} & {0.0506} \\[1pt]{0.1762} & {0.0533} & {0.1762} & {0.1983} & {0.1762} & {0.1762} & {0} & {0.1762} & {0.1854} & {0.1762} \\[1pt]{0.0486} & {0.1762} & {0.0791} & {0.1983} & {0.0791} & {0.0791} & {0.1762} & {0} & {0.1854} & {0.0791} \\[1pt]{0.1854} & {0.1854} & {0.1854} & {0.1983} & {0.1854} & {0.1854} & {0.1854} & {0.1854} & {0} & {0.1854} \\[1pt]{0.0791} & {0.1762} & {0.0189} & {0.1983} & {0.0736} & {0.0506} & {0.1762} & {0.0791} & {0.1854} & {0} \\[1pt]\end{array} \right)$ |
${{\rm D}^8} = \left( \begin{array}{*{20}{c}}{0} & {0.1762} & {0.0791} & {0.1983} & {0.0791} & {0.0791} & {0.1762} & {0.0486} & {0.1854} & {0.0791} \\{0.1762} & { 0} & {0.1762} & {0.1983} & {0.1762} & {0.1762} & {0.0533} & {0.1762} & {0.1854} & {0.1762} \\{0.0791} & {0.1762} & { 0} & {0.1983} & {0.0736} & {0.0506} & {0.1762} & {0.0791} & {0.1854} & {0.0189} \\{0.1983} & {0.1983} & {0.1983} & { 0} & {0.1983} & {0.1983} & {0.1983} & {0.1983} & {0.1983} & {0.1983} \\{0.0791} & {0.1762} & {0.0736} & {0.1983} & { 0} & {0.0736} & {0.1762} & {0.0791} & {0.1854} & {0.0736} \\{0.0791} & {0.1762} & {0.0506} & {0.1983} & {0.0736} & { 0} & {0.1762} & {0.0791} & {0.1854} & {0.0506} \\{0.1762} & {0.0533} & {0.1762} & {0.1983} & {0.1762} & {0.1762} & { 0} & {0.1762} & {0.1854} & {0.1762} \\{0.0486} & {0.1762} & {0.0791} & {0.1983} & {0.0791} & {0.0791} & {0.1762} & { 0} & {0.1854} & {0.0791} \\{0.1854} & {0.1854} & {0.1854} & {0.1983} & {0.1854} & {0.1854} & {0.1854} & {0.1854} & { 0} & {0.1854} \\{0.0791} & {0.1762} & {0.0189} & {0.1983} & {0.0736} & {0.0506} & {0.1762} & {0.0791} & {0.1854} & { 0} \\\end{array} \right)$ |
${{\rm D}^{16}} = \left( \begin{array}{*{20}{c}}{0} & {0.1762} & {0.0791} & {0.1983} & {0.0791} & {0.0791} & {0.1762} & {0.0486} & {0.1854} & {0.0791} \\{0.1762} & { 0} & {0.1762} & {0.1983} & {0.1762} & {0.1762} & {0.0533} & {0.1762} & {0.1854} & {0.1762} \\{0.0791} & {0.1762} & { 0} & {0.1983} & {0.0736} & {0.0506} & {0.1762} & {0.0791} & {0.1854} & {0.0189} \\{0.1983} & {0.1983} & {0.1983} & { 0} & {0.1983} & {0.1983} & {0.1983} & {0.1983} & {0.1983} & {0.1983} \\{0.0791} & {0.1762} & {0.0736} & {0.1983} & { 0} & {0.0736} & {0.1762} & {0.0791} & {0.1854} & {0.0736} \\{0.0791} & {0.1762} & {0.0506} & {0.1983} & {0.0736} & { 0} & {0.1762} & {0.0791} & {0.1854} & {0.0506} \\{0.1762} & {0.0533} & {0.1762} & {0.1983} & {0.1762} & {0.1762} & { 0} & {0.1762} & {0.1854} & {0.1762} \\{0.0486} & {0.1762} & {0.0791} & {0.1983} & {0.0791} & {0.0791} & {0.1762} & { 0} & {0.1854} & {0.0791} \\{0.1854} & {0.1854} & {0.1854} & {0.1983} & {0.1854} & {0.1854} & {0.1854} & {0.1854} & { 0} & {0.1854} \\{0.0791} & {0.1762} & {0.0189} & {0.1983} & {0.0736} & {0.0506} & {0.1762} & {0.0791} & {0.1854} & { 0} \\\end{array} \right)$ |
The above results show that ${\rm D}^{16} = {\rm D}^8$, so the composition has converged and ${\rm D}^8$ is the stable matrix. Its distinct values are 0, 0.0189, 0.0486, 0.0506, 0.0533, 0.0736, 0.0791, 0.1762, 0.1854 and 0.1983, far fewer than the number of distinct values in ${\rm D}$.
Step 4. Take λ from these distinct values in decreasing order and perform the corresponding λ-cut on the stable matrix.
For example, when λ = 0.1983,
${\rm D}_{\lambda = 0.1983} = \left( \begin{array}{*{20}{c}} 1&1&1&0&1&1&1&1&1&1 \\ 1&1&1&0&1&1&1&1&1&1 \\ 1&1&1&0&1&1&1&1&1&1 \\ 0&0&0&1&0&0&0&0&0&0 \\ 1&1&1&0&1&1&1&1&1&1 \\ 1&1&1&0&1&1&1&1&1&1 \\ 1&1&1&0&1&1&1&1&1&1 \\ 1&1&1&0&1&1&1&1&1&1 \\ 1&1&1&0&1&1&1&1&1&1 \\ 1&1&1&0&1&1&1&1&1&1 \end{array} \right)$
When λ is taken as the smallest non-zero value, 0.0189, the cut matrix reduces to the identity matrix and every sample forms a cluster of its own:
${\rm D}_{\lambda = 0.0189} = \left( \begin{array}{*{20}{c}} 1&0&0&0&0&0&0&0&0&0 \\ 0&1&0&0&0&0&0&0&0&0 \\ 0&0&1&0&0&0&0&0&0&0 \\ 0&0&0&1&0&0&0&0&0&0 \\ 0&0&0&0&1&0&0&0&0&0 \\ 0&0&0&0&0&1&0&0&0&0 \\ 0&0&0&0&0&0&1&0&0&0 \\ 0&0&0&0&0&0&0&1&0&0 \\ 0&0&0&0&0&0&0&0&1&0 \\ 0&0&0&0&0&0&0&0&0&1 \end{array} \right)$
Step 5. Cluster the samples according to the orthogonality of the column vectors of each cut matrix. Working through the values of λ in decreasing order gives the following partitions:
λ = 0.1983: {x4}, {x1, x2, x3, x5, x6, x7, x8, x9, x10} (2 clusters);
λ = 0.1854: {x4}, {x9}, {x1, x2, x3, x5, x6, x7, x8, x10} (3 clusters);
λ = 0.1762: {x4}, {x9}, {x2, x7}, {x1, x3, x5, x6, x8, x10} (4 clusters);
λ = 0.0791: {x4}, {x9}, {x2, x7}, {x1, x8}, {x3, x5, x6, x10} (5 clusters);
λ = 0.0736: {x4}, {x9}, {x2, x7}, {x1, x8}, {x5}, {x3, x6, x10} (6 clusters);
λ = 0.0533: {x4}, {x9}, {x2}, {x7}, {x1, x8}, {x5}, {x3, x6, x10} (7 clusters);
λ = 0.0506: {x4}, {x9}, {x2}, {x7}, {x1, x8}, {x5}, {x6}, {x3, x10} (8 clusters);
λ = 0.0486: {x4}, {x9}, {x2}, {x7}, {x1}, {x8}, {x5}, {x6}, {x3, x10} (9 clusters);
λ = 0.0189: every sample forms its own cluster (10 clusters).
Step 6. Take the clustering results obtained above as the initial clusters of the k-means algorithm and carry out a further clustering pass to verify the accuracy of the results of the proposed algorithm. Since the partitions into 10 clusters and into 1 cluster each admit only one possibility, only the partitions into 2–9 clusters are verified below.
When k = 2, the initial clusters are {x4} and {x1, x2, x3, x5, x6, x7, x8, x9, x10}, and one of the cluster centers computed in this step is
$\{ \{ s_0 \}\ \{ s_{-0.5}, s_{0.25}, s_0 \}\ \{ s_0 \}\ \{ s_0, s_{1.75}, s_{2.5} \}\ \{ s_0 \} \}$
The distance from every sample to each cluster center is then computed, and every sample turns out to be closest to the center of the cluster it was assigned to, so the k-means result for k = 2 coincides with the result of Algorithm 2. The same verification for k = 3, 4, …, 9 likewise reproduces the clustering of Algorithm 2, which confirms the accuracy of the proposed algorithm.
5 Conclusion
Fuzzy clustering has gradually become a new research focus and many fuzzy clustering algorithms have been proposed, but clustering algorithms based on HFLTSs are still immature and suffer from high computational complexity, even though HFLTSs are a popular and highly flexible kind of linguistic term set. This paper therefore proposed an orthogonal fuzzy clustering algorithm for HFLTSs with relatively low computational complexity. Starting from the distance measure matrix of the HFLTSs, the algorithm uses the idea of orthogonality to identify samples that cannot be assigned to the same cluster and thus obtains the clustering result. To verify its accuracy and efficiency, the algorithm was validated on a case study in combination with the k-means algorithm. In future work we will extend the algorithm to further types of linguistic terms, such as probabilistic linguistic term sets (PLTSs), and investigate how to adapt it better to large-scale data.
[1] Wong CC, Lai HR. A grey-based clustering algorithm and its application on fuzzy system design. International Journal of Systems Science, 2003, 34(4): 269-281. DOI:10.1080/0020772031000158519
[2] Alzate C, Suykens JAK. Hierarchical kernel spectral clustering. Neural Networks, 2012, 35: 21-30. DOI:10.1016/j.neunet.2012.06.007
[3] Zhao C, Gao JB. Sentiment classification model based on interval intuitionistic fuzzy sets. Computer Systems & Applications, 2014, 23(10): 207-211 (in Chinese). DOI:10.3969/j.issn.1003-3254.2014.10.037
[4] Zadeh LA. The concept of a linguistic variable and its application to approximate reasoning-II. Information Sciences, 1975, 8(4): 301-357. DOI:10.1016/0020-0255(75)90046-8
[5] Torra V. Hesitant fuzzy sets. International Journal of Intelligent Systems, 2010, 25(6): 529-539.
[6] Rodríguez RM, Martínez L, Herrera F. Hesitant fuzzy linguistic term sets for decision making. IEEE Transactions on Fuzzy Systems, 2012, 20(1): 109-119. DOI:10.1109/TFUZZ.2011.2170076
[7] Lu ZG, Chen XJ. Supplier selection method for fuzzy multi-attribute decision making based on information entropy. Computer Systems & Applications, 2012, 21(8): 170-173, 232 (in Chinese).
[8] Degani R, Bortolan G. The problem of linguistic approximation in clinical decision making. International Journal of Approximate Reasoning, 1988, 2(2): 143-162. DOI:10.1016/0888-613X(88)90105-3
[9] Herrera F, Martínez L. A 2-tuple fuzzy linguistic representation model for computing with words. IEEE Transactions on Fuzzy Systems, 2000, 8(6): 746-752. DOI:10.1109/91.890332
[10] Xu ZS, Wang H. On the syntax and semantics of virtual linguistic terms for information fusion in decision making. Information Fusion, 2017, 34: 43-48. DOI:10.1016/j.inffus.2016.06.002
[11] Liao HC, Gou XJ, Xu ZS. A survey of decision-making theory and methodologies based on hesitant fuzzy linguistic term sets. Systems Engineering - Theory & Practice, 2017, 37(1): 35-48 (in Chinese). DOI:10.12011/1000-6788(2017)01-0035-14
[12] Beg I, Rashid T. TOPSIS for hesitant fuzzy linguistic term sets. International Journal of Intelligent Systems, 2013, 28(12): 1162-1171.
[13] Zhang HM, Xu ZS, Chen Q. On clustering approach to intuitionistic fuzzy sets. Control and Decision, 2007, 22(8): 882-888 (in Chinese).
[14] Xu ZS, Chen J, Wu JJ. Clustering algorithm for intuitionistic fuzzy sets. Information Sciences, 2008, 178(19): 3775-3790. DOI:10.1016/j.ins.2008.06.008
[15] Zhang XL, Xu ZS. An MST cluster analysis method under hesitant fuzzy environment. Control and Cybernetics, 2012, 41(3): 645-666.
[16] Chen N, Xu ZS, Xia MM. Correlation coefficients of hesitant fuzzy sets and their applications to clustering analysis. Applied Mathematical Modelling, 2013, 37(4): 2197-2211. DOI:10.1016/j.apm.2012.04.031
[17] Chen N, Xu ZS, Xia MM. Hierarchical hesitant fuzzy K-means clustering algorithm. Applied Mathematics-A Journal of Chinese Universities, 2014, 29(1): 1-17. DOI:10.1007/s11766-014-3091-8
[18] Yang X, Xu ZS, Liao HC. Correlation coefficients of hesitant multiplicative sets and their applications in decision making and clustering analysis. Applied Soft Computing, 2017, 61: 935-946. DOI:10.1016/j.asoc.2017.08.011
[19] Liu YM, Zhao H, Xu ZS. An orthogonal clustering method under hesitant fuzzy environment. International Journal of Computational Intelligence Systems, 2017, 10(1): 663-676. DOI:10.2991/ijcis.2017.10.1.44
[20] Wei C, Zhao N, Tang X. Operators and comparisons of hesitant fuzzy linguistic term sets. IEEE Transactions on Fuzzy Systems, 2014, 22(3): 575-585.
[21] Xu ZS, Wang H. On the syntax and semantics of virtual linguistic terms for information fusion in decision making. Information Fusion, 2017, 34: 43-48. DOI:10.1016/j.inffus.2016.06.002
[22] Jiang Y, Xu ZS, Shu YH. Distance measures of intuitionistic multiplicative fuzzy sets and their application to the location of satellite earth stations. Systems Engineering - Theory & Practice, 2016, 36(12): 3210-3219 (in Chinese). DOI:10.12011/1000-6788(2016)12-3210-10
[23] Liao HC, Xu ZS. Approaches to manage hesitant fuzzy linguistic information based on the cosine distance and similarity measures for HFLTSs and their application in qualitative decision making. Expert Systems with Applications, 2015, 42(12): 5328-5336. DOI:10.1016/j.eswa.2015.02.017
[24] Hu H, Xu ZS. TOPSIS method for multiple attribute decision making with interval-valued intuitionistic fuzzy information. Fuzzy Systems and Mathematics, 2007, 21(5): 108-112 (in Chinese).
[25] Li DQ, Zeng WY, Li JH. New distance and similarity measures on hesitant fuzzy sets and their applications in multiple criteria decision making. Engineering Applications of Artificial Intelligence, 2015, 40: 11-16. DOI:10.1016/j.engappai.2014.12.012
[26] Tan QY, Wei CP, Liu Q, et al. The hesitant fuzzy linguistic TOPSIS method based on novel information measures. Asia-Pacific Journal of Operational Research, 2016, 33(5): 1650035. DOI:10.1142/S0217595916500354