This study targets massive, semi-structured numerical weather prediction (NWP) data products and addresses the low efficiency of traditional data extraction methods. Building on multi-process techniques, we first design a fast data-block location algorithm based on exact position addressing, which locates data blocks precisely. We then design a cropping algorithm that subsets data within a spatial range on demand, so that data can be extracted according to attribute dimensions, latitude and longitude ranges, and similar criteria. On the basis of these two algorithms, we implement a multi-process data-reading workflow under unified end-to-end control. Taking the time per single data plane as the main evaluation metric, we process data with 1, 4, 8, and 16 processes. The tests show that 16 processes reduce the per-plane time from 257 ms (single process) to 37 ms, an approximately 6.9-fold speedup. The method effectively improves extraction efficiency for semi-structured NWP product data and has been put into operational use in meteorological decision-analysis services, including those for urban governance.
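The three ideas summarized above (locating a data block by computing its position, cropping to a latitude/longitude window, and reading planes with several processes) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a hypothetical flat binary file of little-endian float32 planes stored back to back on a regular grid with rows ordered by increasing latitude; real NWP formats such as GRIB carry headers and packing that this sketch ignores. All function names and the `grid` and `bbox` tuples are illustrative assumptions.

```python
import math
import struct
from multiprocessing import Pool

FLOAT_SIZE = 4  # bytes per value; float32 storage is an assumption


def plane_offset(plane_index, nlat, nlon, header_bytes=0):
    """Byte offset of a plane, computed directly instead of scanning the file."""
    return header_bytes + plane_index * nlat * nlon * FLOAT_SIZE


def crop_indices(start, step, count, lo, hi):
    """Half-open index range of grid points start + i*step lying in [lo, hi].

    Assumes step > 0. A small tolerance absorbs floating-point rounding.
    """
    eps = 1e-9
    i0 = max(0, math.ceil((lo - start) / step - eps))
    i1 = min(count, math.floor((hi - start) / step + eps) + 1)
    return (i0, i1) if i1 > i0 else (i0, i0)


def read_window(task):
    """Read only the requested lat/lon window of one plane."""
    path, plane_index, grid, bbox = task
    lat0, lon0, dlat, dlon, nlat, nlon = grid
    lat_min, lat_max, lon_min, lon_max = bbox
    r0, r1 = crop_indices(lat0, dlat, nlat, lat_min, lat_max)
    c0, c1 = crop_indices(lon0, dlon, nlon, lon_min, lon_max)
    base = plane_offset(plane_index, nlat, nlon)
    width = c1 - c0
    rows = []
    with open(path, "rb") as f:
        for r in range(r0, r1):
            # Seek straight to the first wanted value of this row.
            f.seek(base + (r * nlon + c0) * FLOAT_SIZE)
            raw = f.read(width * FLOAT_SIZE)
            rows.append(list(struct.unpack(f"<{width}f", raw)))
    return rows


def extract_planes(path, plane_indices, grid, bbox, processes=1):
    """Extract the same window from several planes, one task per plane."""
    tasks = [(path, p, grid, bbox) for p in plane_indices]
    if processes <= 1:
        # Sequential path avoids pool start-up cost for small jobs.
        return [read_window(t) for t in tasks]
    with Pool(processes) as pool:
        return pool.map(read_window, tasks)
```

A call such as `extract_planes(path, range(64), grid, bbox, processes=16)` would distribute the 64 plane reads over 16 worker processes; on platforms that start workers by spawning, that call belongs under an `if __name__ == "__main__":` guard. Because each task opens the file and seeks independently, the workers share no state, which is what makes the per-plane reads easy to parallelize.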