Abstract: As a real-time computing framework, Storm provides efficient, low-latency processing of multi-source heterogeneous data. However, Storm's default scheduler uses a simple round-robin method and cannot adjust task assignments according to the cluster's dynamic load status. To solve this problem, this study proposes a performance-aware load-balancing strategy. The strategy computes a Performance-Aware Value (PAV) for each node from the node's processing ability, then uses greedy scheduling to assign each node an amount of computation that matches its current processing capacity. The results show that, compared with the default scheduling algorithm, the proposed algorithm effectively reduces Storm's processing delay, improves throughput, and balances the cluster's load.
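The greedy step described above can be illustrated with a minimal sketch. The abstract does not give the PAV formula, so the sketch assumes each node's PAV is already available as a positive score where a higher value means more processing capacity. The function name `greedy_assign`, the per-task cost estimates, and the tie-breaking rule are hypothetical choices for illustration. They are not the paper's algorithm and not part of Storm's scheduler API.

```python
from typing import Dict, List


def greedy_assign(pav: Dict[str, float],
                  task_costs: Dict[str, float]) -> Dict[str, List[str]]:
    """Greedily place tasks so each node's load stays roughly proportional to its PAV.

    Assumption: `pav` maps node name -> positive capacity score (higher is better),
    and `task_costs` maps task name -> estimated computation cost.
    """
    if not pav or any(value <= 0 for value in pav.values()):
        raise ValueError("every node needs a positive PAV")

    load = {node: 0.0 for node in pav}
    assignment: Dict[str, List[str]] = {node: [] for node in pav}

    # Place the most expensive tasks first; ties are broken by task name.
    ordered = sorted(task_costs.items(), key=lambda kv: (-kv[1], kv[0]))
    for task, cost in ordered:
        # Choose the node whose PAV-normalized load would be lowest after
        # accepting this task; ties are broken by node name.
        target = min(pav, key=lambda n: ((load[n] + cost) / pav[n], n))
        assignment[target].append(task)
        load[target] += cost
    return assignment


if __name__ == "__main__":
    nodes = {"node-a": 2.0, "node-b": 1.0}          # hypothetical PAV scores
    tasks = {f"task-{i}": 1.0 for i in range(1, 7)}  # six equal-cost tasks
    for node, placed in greedy_assign(nodes, tasks).items():
        print(node, placed)
```

With these hypothetical inputs, the node whose PAV is twice as large receives twice as many of the equal-cost tasks, which is the proportional behavior the strategy aims for. Round-robin would instead split the tasks evenly regardless of capacity.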