Research on Product Feature-Opinion Extraction Based on Center Node Recognition in Bipartite Networks
LIU Chen, JI Li, TANG Li
Business School, University of Shanghai for Science and Technology, Shanghai 200093, China
Foundation item: National Natural Science Foundation of China (71401107)
Abstract: This study mines product review texts from an e-commerce platform, focusing on the identification of feature words and opinion words in reviews. First, we build a bipartite network of feature and opinion words and present an algorithm for ranking node importance in this network. Finally, we apply the algorithm to real review text data to verify its effectiveness.
Key words: feature-opinion extraction; bipartite network; center node recognition; product review

1 Introduction

2 Construction of the Feature-Opinion Pair Bipartite Network

2.1 Representation of the Feature-Opinion Pair Bipartite Network

 Fig. 1 Bipartite network with feature-opinion words

2.2 Degree and Node Weight in the Feature-Opinion Pair Bipartite Network

 ${k_i} = \sum\limits_{j = 1}^N {{a_{ij}}} = \sum\limits_{j = 1}^N {{a_{ji}}}$ (1)

 $A = {({a_{ij}})_{N \times N}}$ (2)

 ${S_i} = \sum\limits_{j \in {N_i}} {{w_{ij}}}$ (3)
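As a minimal sketch of Eqs. (1)-(3), the following Python snippet derives the adjacency matrix $A$, the degree $k_i$, and the node weight $S_i$ from a small weight matrix. The five-node toy network, the weight values, and the function names `adjacency`, `degrees`, and `node_weights` are illustrative assumptions, not data or code from this study; reading $w_{ij}$ as a co-occurrence count is likewise assumed.

```python
# Toy network (assumed): nodes 0-1 are feature words, nodes 2-4 are opinion words.
# W[i][j] is the edge weight w_ij; 0 means the two words are not connected.
W = [
    [0, 0, 3, 1, 0],
    [0, 0, 0, 2, 4],
    [3, 0, 0, 0, 0],
    [1, 2, 0, 0, 0],
    [0, 4, 0, 0, 0],
]


def adjacency(weights):
    """Build A = (a_ij), with a_ij = 1 when an edge exists (Eq. 2)."""
    return [[1 if w > 0 else 0 for w in row] for row in weights]


def degrees(adj):
    """Degree k_i = sum over j of a_ij (Eq. 1)."""
    return [sum(row) for row in adj]


def node_weights(weights):
    """Node weight S_i = sum of w_ij over the neighbours j of i (Eq. 3)."""
    return [sum(w for w in row if w > 0) for row in weights]


A = adjacency(W)
k = degrees(A)
S = node_weights(W)
```

Because the network is undirected, summing a row or the matching column of `A` gives the same degree, which is the equality stated in Eq. (1).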

3 Feature-Opinion Pair Extraction

3.1 B-Core Decomposition Algorithm

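CFO: the set of candidate feature and opinion words.

B: the unweighted feature-opinion pair bipartite network. $i$ denotes a node in the network.

Ranking set: the ranked set of new feature and opinion words.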

Step 1: Input: CFO

Step 2: Construct network B

Step 3: For i in B:

E is the empty set

$b_{\min} = \mathrm{min\_degree}(B)$

If $i$ is a feature:

If $i.\mathrm{degree} \leqslant b_{\min}$:

$i$ is inserted into E

If $i$ is an opinion:

If $i.\mathrm{degree} \leqslant b_{\min}$:

$i$ is inserted into E

E is inserted into Ranking set

E is deleted from B

Update B

The degree of every node is recalculated

Step 4: Output: Ranking set
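The steps above can be sketched in Python as a layer-by-layer peeling of the network. This is only one possible reading of the pseudocode, under the assumption that Step 3 repeats until B is empty; the function name `b_core_ranking`, the tuple-based node tags, and the sample word pairs are hypothetical and not taken from the study.

```python
def b_core_ranking(pairs):
    """Peel an unweighted feature-opinion bipartite network layer by layer.

    pairs: iterable of (feature, opinion) tuples standing in for CFO.
    Returns a list of layers in removal order (first removed comes first).
    """
    # Step 2: build B as an adjacency map. Nodes are tagged with their side
    # so that a word appearing on both sides stays two distinct nodes.
    neighbours = {}
    for feature, opinion in pairs:
        f, o = ("feature", feature), ("opinion", opinion)
        neighbours.setdefault(f, set()).add(o)
        neighbours.setdefault(o, set()).add(f)

    ranking = []
    # Step 3: repeat until every node has been removed (assumed stop rule).
    while neighbours:
        b_min = min(len(adj) for adj in neighbours.values())
        layer = {n for n, adj in neighbours.items() if len(adj) <= b_min}
        ranking.append(sorted(layer))
        # Delete E from B; dropping the edges updates the remaining degrees.
        for node in layer:
            for other in neighbours.pop(node):
                if other in neighbours:
                    neighbours[other].discard(node)
    # Step 4: output the ranking set.
    return ranking


# Hypothetical candidate pairs, for illustration only.
pairs = [
    ("screen", "clear"), ("screen", "big"), ("screen", "good"),
    ("battery", "good"), ("battery", "durable"), ("price", "good"),
]
layers = b_core_ranking(pairs)
```

Under the usual k-shell interpretation, nodes in later layers survive more rounds of peeling and are therefore treated as more central.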

 Fig. 2 Unweighted bipartite network with feature-opinion words

 Fig. 3 Node importance ranking

3.2 BW-Core Decomposition Algorithm

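CFO: the set of candidate feature and opinion words.

B: the weighted feature-opinion pair bipartite network. $i$ denotes a node in the network.

Ranking set: the ranked set of new feature and opinion words.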

Step 1: Input: CFO

Step 2: Construct network B

Step 3: For i in B:

E is the empty set

$bw_{\min} = \mathrm{min\_weight}(B)$

$a \geqslant bw_{\min}$

If $i$ is a feature:

If $i.\mathrm{weight} \leqslant bw_{\min}$:

$i$ is inserted into E

If $i$ is an opinion:

If $i.\mathrm{weight} \leqslant bw_{\min}$:

$i$ is inserted into E

E is inserted into Ranking set

E is deleted from B

Update B

The weight of every node is recalculated

Step 4: Output: Ranking set
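A weighted variant of the same peeling idea can be sketched as follows, using the node weight $S_i$ of Eq. (3) in place of the degree. As before, this is a hedged illustration rather than the authors' implementation: the function name `bw_core_ranking`, the assumption that Step 3 repeats until B is empty, and the sample weights are all hypothetical.

```python
def bw_core_ranking(weighted_pairs):
    """Peel a weighted feature-opinion bipartite network by node weight.

    weighted_pairs: dict mapping (feature, opinion) to an edge weight.
    Returns a list of layers in removal order (first removed comes first).
    """
    # Step 2: build B as a nested map: node -> {neighbour: edge weight}.
    weights = {}
    for (feature, opinion), w in weighted_pairs.items():
        f, o = ("feature", feature), ("opinion", opinion)
        weights.setdefault(f, {})[o] = w
        weights.setdefault(o, {})[f] = w

    ranking = []
    # Step 3: repeat until every node has been removed (assumed stop rule).
    while weights:
        # Node weight S_i is the sum of the weights of i's remaining edges.
        strength = {n: sum(adj.values()) for n, adj in weights.items()}
        bw_min = min(strength.values())
        layer = {n for n, s in strength.items() if s <= bw_min}
        ranking.append(sorted(layer))
        # Delete E from B; node weights are recomputed on the next pass.
        for node in layer:
            for other in weights.pop(node):
                if other in weights:
                    del weights[other][node]
    # Step 4: output the ranking set.
    return ranking


# Hypothetical weighted candidate pairs, for illustration only.
weighted_pairs = {
    ("screen", "clear"): 5, ("screen", "big"): 1, ("screen", "good"): 2,
    ("battery", "good"): 1, ("battery", "durable"): 2,
}
weighted_layers = bw_core_ranking(weighted_pairs)
```

Integer weights are used here so that the comparison against `bw_min` is exact; with floating-point weights a small tolerance may be preferable.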

 Fig. 4 Weighted bipartite network with feature-opinion words

 Fig. 5 Node importance ranking

4 Experiments

4.1 Experimental Dataset

 Fig. 6 Syntactic analysis result

4.3 Experimental Results

 $P= x/(x + y)$ (4)
 $R = x/(x + z)$ (5)
 $F = (2 \times R \times P)/(R + P)$ (6)
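Eqs. (4)-(6) can be checked with a short Python sketch. The interpretation of the symbols is an assumption based on the standard definitions that match these formulas: $x$ is the number of correctly extracted pairs, $y$ the number of incorrectly extracted pairs, and $z$ the number of correct pairs that were missed. The function name and the sample counts are hypothetical.

```python
def precision_recall_f(x, y, z):
    """Compute P, R, and F from Eqs. (4)-(6).

    Assumed meaning of the counts: x = correct extractions,
    y = incorrect extractions, z = missed correct pairs. Requires x > 0.
    """
    p = x / (x + y)              # Eq. (4)
    r = x / (x + z)              # Eq. (5)
    f = (2 * r * p) / (r + p)    # Eq. (6)
    return p, r, f


# Hypothetical counts, not results from the study.
p, r, f = precision_recall_f(x=80, y=20, z=40)
```

Substituting Eqs. (4) and (5) into Eq. (6) gives the equivalent form $F = 2x/(2x + y + z)$, which is a convenient sanity check on computed values.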

 Fig. 7 Degree distribution of feature nodes

 Fig. 8 Degree distribution of opinion nodes

 Fig. 9 Distribution of P, R, and F values in the unweighted bipartite network

 Fig. 10 Distribution of P, R, and F values in the weighted bipartite network

5 Conclusion

