Computer Systems & Applications  2020, Vol. 29 Issue (9): 32-39

Correlation Analysis and Vectorization Method for Spatial Position
ZHANG Shu1,2, GUO Dan-Huai1,2, ZHOU Chun-Bao1,2, LI Xun-Chun3, JIN Wei4,5
1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China;
2. University of Chinese Academy of Sciences, Beijing 100049, China;
3. Academy of Broadcasting Science, National Radio and Television Administration, Beijing 100866, China;
4. Beijing Academy of Science and Technology, Beijing 100089, China;
5. Beijing Institute of New Technology Application, Beijing 100094, China
Foundation item: National Natural Science Foundation of China (41971366, 91846301); National Key Research and Development Program of China (2018YFC0809700); Natural Science Foundation of Beijing Municipality (9172023, 9194027)
Abstract: Understanding the spatial correlation between places plays an important role in geographic information retrieval and recommendation systems, urban traffic management, and the study of residents' travel patterns. To represent places and their spatial relationships explicitly, we propose a deep learning-based vectorization method for places, so that the correlation between places can be computed from their vectors. First, long-distance and short-distance trajectories are matched and connected to build a large-scale traffic network that covers multiple travel modes and gives a more complete picture of spatial relations. Then, we propose a spatial vectorization method that is based on a graph neural network and combines place features with trajectory information. In addition, we improve the expressiveness of the latent place representations by optimizing the node sampling method. Finally, an empirical analysis is performed on shared-bicycle trajectory data and public transport data from Beijing. The results show that the proposed method outperforms existing methods such as DeepMove in place correlation analysis and cluster analysis.
Key words: spatial correlation analysis     spatial representation     graph neural network     trajectory data

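1 Network Construction from Multi-source Traffic Trajectory Data

1.1 Multi-source Traffic Trajectory Data

Fig. 1 Distribution of shared-bicycle trip volume across time periods

Fig. 2 Distribution of cycling trajectory distances

1.2 Network Modeling Integrating Different Types of Trajectory Data

Fig. 3 Example of matching long-distance and short-distance travel trajectories

2 Spatial Vectorization Method Fusing POI and Trajectory Information

2.1 Formal Definition

2.2 Fusion Modeling of POI and Trajectory Information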

$h_v^0 \leftarrow x_v,\;\forall v \in V$

${\rm for}\;k = 1 \cdots K\;{\rm do}:$

${\rm for}\;v \in V\;{\rm do}:$

$h_{S(v)}^k \leftarrow {\rm Agg}\left( h_i^{k-1},\;\forall i \in S(v) \right)$

$o_v^k = {\rm CONCAT}\left( h_v^{k-1},\;h_{S(v)}^k \right)$

$h_v^k \leftarrow f\left( W^k \cdot o_v^k \right)$

end

$h_v^k \leftarrow \frac{h_v^k}{||h_v^k||_2},\;\forall v \in V$

end

$\varphi_v = h_v^K,\;\forall v \in V$
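The sample-and-aggregate loop above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: it assumes a mean aggregator for Agg and ReLU for f (the source fixes neither), takes the sampled neighborhoods S(v) as given, and falls back to the node itself when S(v) is empty. The names `embed`, `mean_aggregate`, `matvec` and `l2_normalize` are hypothetical helpers.

```python
import math

def mean_aggregate(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors (assumed Agg)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def relu(x):
    """Assumed choice for the nonlinearity f."""
    return [max(0.0, a) for a in x]

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * a for w, a in zip(row, x)) for row in W]

def l2_normalize(x):
    """Divide x by its L2 norm; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(a * a for a in x))
    return [a / norm for a in x] if norm > 0 else list(x)

def embed(features, neighbors, weights):
    """Return phi_v for every node.

    features:  {node: x_v}, the initial feature vectors
    neighbors: {node: [sampled neighbor nodes]}, i.e. S(v)
    weights:   [W^1, ..., W^K], one matrix per layer
    """
    h = {v: list(x) for v, x in features.items()}        # h_v^0 <- x_v
    for W in weights:                                     # k = 1 .. K
        new_h = {}
        for v in h:
            sampled = neighbors.get(v) or [v]             # assumption: self if S(v) is empty
            h_s = mean_aggregate([h[i] for i in sampled]) # h_{S(v)}^k
            o = h[v] + h_s                                # CONCAT(h_v^{k-1}, h_{S(v)}^k)
            new_h[v] = l2_normalize(relu(matvec(W, o)))   # f(W^k o_v^k), then normalize
        h = new_h                                         # layer k only reads layer k-1
    return h                                              # phi_v = h_v^K
```

Each weight matrix must have as many columns as the concatenated vector has entries, that is, twice the dimension of the previous layer's output.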

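2.3 Spatial Vectorization Representation Learning Method

Fig. 4 Illustration of sampling and aggregation operations in the traffic network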

 $J_g(z_v) = -\log\left( f\left( z_v^{\rm T} z_i \right) \right) - Q \cdot E_{i_n \sim p_n(i)} \log\left( f\left( -z_v^{\rm T} z_{i_n} \right) \right)$ (1)
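As a rough numerical illustration of the loss in Eq. (1), the sketch below evaluates it for one node, assuming f is the sigmoid function and following the negative-sampling form used by GraphSAGE [12], in which the scores of noise samples enter with a negative sign. The expectation over the noise distribution is approximated by the mean over the supplied negative samples. The function name `unsupervised_loss` and its arguments are hypothetical.

```python
import math

def sigmoid(x):
    """Logistic function, the assumed choice for f in Eq. (1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def unsupervised_loss(z_v, z_pos, z_negs, Q=None):
    """Loss for one node v.

    z_v:    embedding of node v
    z_pos:  embedding of a node i that co-occurs with v
    z_negs: embeddings of nodes drawn from the noise distribution p_n
    Q:      number of negative samples (defaults to len(z_negs))
    """
    if Q is None:
        Q = len(z_negs)
    # Positive term: pull v toward the co-occurring node.
    pos_term = -math.log(sigmoid(dot(z_v, z_pos)))
    # Monte Carlo estimate of E[log f(-z_v^T z_n)] over the noise samples.
    neg_expect = sum(math.log(sigmoid(-dot(z_v, z_n))) for z_n in z_negs) / len(z_negs)
    return pos_term - Q * neg_expect
```

The loss is small when the positive pair is aligned and the negative samples point away from the node, and it grows when a negative sample is aligned with the node.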

 ${p_i} = \frac{{\displaystyle\sum\nolimits_{j = 1}^n {{w_{ij}}{\varphi _j}} }}{{\displaystyle\sum\nolimits_{j = 1}^n {{w_{ij}}} }}$ (2)

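Here $w_{ij}$ is the weight of place vector ${\varphi _j}$ in $p_i$. When uniform (average) weights are used, $w_{ij} = 1$. In this paper, $w_{ij}$ is computed with the TF-IDF statistic [14], which measures how strongly a place is associated with a POI.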
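The weighted average in Eq. (2) can be sketched as follows. The TF-IDF mapping used here is an assumption, since the source does not spell it out: each place is treated as a "document" and each POI as a "term", so the term frequency is the share of a place's POI occurrences that belong to POI i, and the inverse document frequency is the log of the number of places divided by the number of places containing POI i. The names `tfidf_weights` and `poi_vector` are hypothetical.

```python
import math

def tfidf_weights(poi_counts):
    """Compute w_ij from raw counts.

    poi_counts: {place j: {poi i: number of occurrences of i in j}}
    returns:    {poi i: {place j: w_ij}}
    """
    n_places = len(poi_counts)
    doc_freq = {}                                   # number of places containing each POI
    for counts in poi_counts.values():
        for poi in counts:
            doc_freq[poi] = doc_freq.get(poi, 0) + 1
    weights = {}
    for place, counts in poi_counts.items():
        total = sum(counts.values())
        for poi, c in counts.items():
            tf = c / total                          # share of this POI within the place
            idf = math.log(n_places / doc_freq[poi])
            weights.setdefault(poi, {})[place] = tf * idf
    return weights

def poi_vector(place_weights, place_vectors):
    """Eq. (2): p_i = sum_j w_ij * phi_j / sum_j w_ij.

    place_weights: {place j: w_ij} for one POI i
    place_vectors: {place j: phi_j}
    """
    total = sum(place_weights.values())
    if total == 0:
        # Assumption: fall back to uniform weights (w_ij = 1) when all weights vanish,
        # e.g. for a POI that occurs in every place (idf = 0).
        place_weights = {j: 1.0 for j in place_weights}
        total = float(len(place_weights))
    dim = len(place_vectors[next(iter(place_weights))])
    acc = [0.0] * dim
    for place, w in place_weights.items():
        for d, x in enumerate(place_vectors[place]):
            acc[d] += w * x
    return [a / total for a in acc]
```

Passing `{place: 1.0, ...}` as `place_weights` reproduces the uniform-weight case mentioned in the text.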

3 Experimental Analysis

3.1 Evaluation Metrics

 $S_{\rm place}(v_i, v_j) = \cos(v_i, v_j) = \frac{v_i \cdot v_j}{||v_i|| \cdot ||v_j||}$ (3)

 $d_{\rm place}(v_i, v_j) = 1 - S_{\rm place}(v_i, v_j)$ (4)
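The two measures in Eqs. (3) and (4) amount to cosine similarity and its complement. A minimal sketch, with hypothetical function names and the added assumption that zero vectors are rejected because their cosine is undefined:

```python
import math

def place_similarity(v_i, v_j):
    """Eq. (3): cosine of the angle between two place vectors."""
    norm_i = math.sqrt(sum(x * x for x in v_i))
    norm_j = math.sqrt(sum(x * x for x in v_j))
    if norm_i == 0 or norm_j == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(x * y for x, y in zip(v_i, v_j)) / (norm_i * norm_j)

def place_distance(v_i, v_j):
    """Eq. (4): one minus the similarity, ranging from 0 (same direction) to 2 (opposite)."""
    return 1.0 - place_similarity(v_i, v_j)
```

Because only the direction of the vectors matters, rescaling a place vector leaves both values unchanged.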

 $s(i) = \frac{{b(i) - a(i)}}{{\max (a(i),b(i))}}$ (5)
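Eq. (5) is the silhouette coefficient [16], where a(i) is the mean distance from sample i to the other members of its own cluster and b(i) is the smallest mean distance from i to the members of any other cluster. The sketch below assumes at least two clusters, takes the distance function as a parameter, and returns 0 for a singleton cluster, which is the convention in [16]. The function name `silhouette` is hypothetical.

```python
def silhouette(i, points, labels, dist):
    """Silhouette coefficient s(i) of sample i.

    points: list of samples
    labels: cluster label of each sample (at least two distinct labels)
    dist:   function returning the distance between two samples
    """
    own = labels[i]
    same = [dist(points[i], points[j])
            for j in range(len(points)) if j != i and labels[j] == own]
    if not same:
        return 0.0                                  # singleton cluster
    a = sum(same) / len(same)                       # cohesion a(i)
    b = None
    for other in set(labels) - {own}:
        d = [dist(points[i], points[j])
             for j in range(len(points)) if labels[j] == other]
        mean_d = sum(d) / len(d)
        if b is None or mean_d < b:
            b = mean_d                              # separation b(i): nearest other cluster
    denom = max(a, b)
    return (b - a) / denom if denom > 0 else 0.0
```

Values close to 1 indicate that the sample sits well inside its cluster, and values below 0 suggest it is closer to a neighboring cluster.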

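3.2 Experimental Results and Analysis

3.2.1 Spatial Position Correlation Analysis

Fig. 5 Correlation heat maps of POI vectors generated by the proposed vectorization method, Node2Vec, and DeepMove

3.2.2 Spatial Position Cluster Analysis

Fig. 6 Cluster analysis results

4 Conclusion and Outlook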

[1] Liu ZD, Li ZJ, Li M, et al. Mining road network correlation for traffic estimation via compressive sensing. IEEE Transactions on Intelligent Transportation Systems, 2016, 17(7): 1880–1893. DOI:10.1109/TITS.2016.2514519
[2] Yan B, Janowicz K, Mai GC, et al. From ITDL to Place2Vec: Reasoning about place type similarity and relatedness by learning embeddings from augmented spatial contexts. Proceedings of the 25th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. New York, NY, USA. 2017. 1–10.
[3] Jin JQ, Xiao ZJ, Qiu Q, et al. A GeoHash based Place2Vec model. IGARSS 2019–2019 IEEE International Geoscience and Remote Sensing Symposium. Yokohama, Japan. 2019. 3344–3347.
[4] Ying R, He RN, Chen KF, et al. Graph convolutional neural networks for web-scale recommender systems. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. New York, NY, USA. 2018. 974–983.
[5] Cai HY, Zheng VW, Chang KCC. A comprehensive survey of graph embedding: Problems, techniques, and applications. IEEE Transactions on Knowledge and Data Engineering, 2018, 30(9): 1616–1637. DOI:10.1109/TKDE.2018.2807452
[6] Goyal P, Ferrara E. Graph embedding techniques, applications, and performance: A survey. Knowledge-Based Systems, 2018, 151: 78–94. DOI:10.1016/j.knosys.2018.03.022
[7] Chai D, Wang LY, Yang Q. Bike flow prediction with multi-graph convolutional networks. Proceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. New York, NY, USA. 2018. 397–400.
[8] Diao ZL, Wang X, Zhang DF, et al. Dynamic spatial-temporal graph convolutional neural networks for traffic forecasting. Proceedings of the AAAI Conference on Artificial Intelligence. Honolulu, HI, USA. 2019. 890–897.
[9] Wang YD, Yin HZ, Chen HX, et al. Origin-destination matrix prediction via graph convolution: A new perspective of passenger demand modeling. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. New York, NY, USA. 2019. 1227–1235.
[10] Zhou Y, Huang Y. DeepMove: Learning place representations through large scale movement data. Proceedings of 2018 IEEE International Conference on Big Data. Seattle, WA, USA. 2018.
[11] Balkić Z, Šoštarić D, Horvat G. GeoHash and UUID identifier for multi-agent systems. Proceedings of the 6th KES International Conference on Agent and Multi-Agent Systems. Berlin, Germany. 2012. 290–298.
[12] Hamilton W, Ying ZT, Leskovec J. Inductive representation learning on large graphs. Advances in Neural Information Processing Systems. Long Beach, CA, USA. 2017. 1024–1034.
[13] Mikolov T, Sutskever I, Chen K, et al. Distributed representations of words and phrases and their compositionality. Proceedings of the 26th International Conference on Neural Information Processing Systems. Red Hook, NY, USA. 2013. 3111–3119.
[14] Ramos J. Using TF-IDF to determine word relevance in document queries. Proceedings of the First Instructional Conference on Machine Learning. Piscataway, NJ, USA. 2003. 133–142.
[15] Pennington J, Socher R, Manning C. GloVe: Global vectors for word representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Doha, Qatar. 2014. 1532–1543.
[16] Rousseeuw PJ. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 1987, 20: 53–65. DOI:10.1016/0377-0427(87)90125-7
[17] Grover A, Leskovec J. Node2Vec: Scalable feature learning for networks. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY, USA. 2016. 855–864.
[18] Xue B, Li JZ, Xiao X, et al. A review of human-land relationship research based on point-of-interest (POI) big data: Theory, methods, and applications. Geography and Geo-Information Science, 2019, 35(6): 51–60 (in Chinese). DOI:10.3969/j.issn.1672-0504.2019.06.009
[19] Van Der Maaten L, Hinton G. Visualizing data using t-SNE. Journal of Machine Learning Research, 2008, 9(11): 2579–2605.