Practice of Improving Join Query Efficiency for Large Scale Spatiotemporal Trajectory Data
    Abstract:

    This study proposes DPCP-CROSS-JOIN, an algorithm for fast co-spatiotemporal relationship join queries over large-scale trajectory data on clusters with insufficient computing resources. The algorithm discretizes continuous trajectory data by segmenting and cross-coding the temporal fields and grid-coding the spatial fields, then stores the data in two-level partitions keyed by date and grid-region code. Through cross equi-joins (“equivalent” joins on the coded fields), it achieves three-level indexing and four-level acceleration for spatiotemporal join queries. As a result, the time complexity of co-spatiotemporal relationship join queries among $n \cdot n$ objects is reduced from $O(n^2)$ to $O(n \log n)$. With Hive and Tez on a Hadoop cluster, the algorithm improves join query efficiency by up to 30.66 times for large-scale trajectory data. Because it uses time-slice and grid codes as the join condition, it avoids evaluating complex expressions at join time. Replacing joins on complex expressions with equi-joins also improves the parallelism of MapReduce tasks and raises the utilization of cluster storage and computing resources. Larger-scale trajectory join tasks that are almost impossible to complete with general optimization methods can still be completed by the proposed algorithm within a few minutes. The experimental results suggest that the proposed algorithm is efficient and stable, and that it is especially suitable for co-spatiotemporal relationship join queries of large-scale trajectory data under insufficient computing resources. It can also serve as an atomic algorithm for searching accompanying spatiotemporal trajectories and determining the intimacy of relationships among objects. It can be widely applied in fields such as national security and social order maintenance, crime prevention and combat, and urban and rural planning support.
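
The idea of joining on discrete time-slice and grid codes instead of evaluating distance and time-difference expressions can be illustrated with a small in-memory sketch. This is not the paper's DPCP-CROSS-JOIN implementation: the slice width, the cell size, the function names, and the reading of "cross-coding" as two slicings staggered by half a slice are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from itertools import combinations

SLICE_SECONDS = 600   # assumed time-slice width (10 minutes)
CELL_DEGREES = 0.01   # assumed grid cell size in degrees


def encode(ts, lon, lat):
    """Return the discrete join keys for one trajectory point.

    Two staggered slicings ("A" and "B", offset by half a slice) are an
    assumed interpretation of cross-coding: two points that fall on either
    side of an "A" slice boundary can still share a "B" slice.
    """
    cell = (math.floor(lon / CELL_DEGREES), math.floor(lat / CELL_DEGREES))
    slice_a = ("A", ts // SLICE_SECONDS)
    slice_b = ("B", (ts + SLICE_SECONDS // 2) // SLICE_SECONDS)
    return [(slice_a, cell), (slice_b, cell)]


def co_located_pairs(points):
    """Find object pairs that share a (time slice, grid cell) key.

    points: iterable of (object_id, unix_seconds, lon, lat).
    Grouping by key plays the role of the equality join: only objects in
    the same bucket are compared, never all n * n combinations.
    """
    buckets = defaultdict(set)
    for obj_id, ts, lon, lat in points:
        for key in encode(ts, lon, lat):
            buckets[key].add(obj_id)

    pairs = set()
    for ids in buckets.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


if __name__ == "__main__":
    sample = [
        ("a", 1000, 116.3912, 39.9067),
        ("b", 1100, 116.3915, 39.9061),   # same slice and cell as "a"
        ("c", 1000, 116.4512, 39.9067),   # same time, different cell
        ("d", 50000, 116.3912, 39.9067),  # same cell, different time
    ]
    print(co_located_pairs(sample))
```

In a Hive setting the same keys would presumably become ordinary columns used in an equality join condition, with date and grid-region code as partition columns; the exact table layout is not given in the abstract.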

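Citation

Ding Qianglong, Ye Huizhu, Yuan Hongqiang, Li Zhixin. Practice of Improving Join Query Efficiency for Large Scale Spatiotemporal Trajectory Data. Computer Systems & Applications, 2024, 33(5): 1-14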

History
  • Received: November 15, 2023
  • Revised: December 20, 2023
  • Online: April 07, 2024