Abstract: With the rapid growth of big data and artificial intelligence technologies, high-performance, real-time streaming computing systems are gradually replacing traditional batch computing systems based on data warehouses. Apache Storm is an open-source, highly fault-tolerant distributed platform for real-time big-data stream processing, and it supports several task distribution schemes, such as an even task distribution strategy and a single-machine task assignment strategy. When a topology contains multiple tasks and only certain machines in the cluster can execute a particular task, the traditional scheduling method can only assign a single task to a single designated machine and therefore fails to make full use of the resources in the cluster. The task scheduling strategy is adjusted so that the queue of eligible machines is obtained first. The designated tasks are then distributed evenly across the available worker nodes in that queue, and the remaining tasks are distributed to the other machines in the cluster through the default strategy. In this way, a multi-task group scheduling strategy is achieved.
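The group scheduling idea summarized above can be illustrated with a minimal sketch. It is not the Storm scheduler API and not the implementation described in this work: the function `group_schedule`, the node tags, and the `required_tag` mapping are hypothetical names introduced only for illustration, and a simple round-robin stands in for Storm's default strategy.

```python
from collections import defaultdict


def group_schedule(tasks, nodes, required_tag):
    """Illustrative multi-task group scheduling (hypothetical helper).

    tasks:        list of (task_id, component_name) pairs
    nodes:        dict mapping node name -> set of capability tags
    required_tag: dict mapping component_name -> tag a node must carry
    Returns a dict mapping task_id -> node name.
    """
    if not nodes:
        raise ValueError("cluster has no nodes")

    # Group task ids by the component they belong to.
    by_component = defaultdict(list)
    for task_id, component in tasks:
        by_component[component].append(task_id)

    assignment = {}
    reserved = set()     # nodes used by some eligible machine queue
    unconstrained = []   # tasks left to the stand-in default strategy

    for component, task_ids in by_component.items():
        tag = required_tag.get(component)
        # Eligible machine queue: every node that carries the required tag.
        queue = sorted(n for n, tags in nodes.items() if tag in tags) if tag else []
        if not queue:
            # No constraint, or no eligible node: fall back to the default.
            unconstrained.extend(task_ids)
            continue
        reserved.update(queue)
        # Spread the designated tasks evenly over the queue (round-robin).
        for i, task_id in enumerate(task_ids):
            assignment[task_id] = queue[i % len(queue)]

    # Assumption: round-robin over the remaining nodes stands in for the
    # default strategy; if no node remains, reuse the whole cluster.
    remaining = sorted(n for n in nodes if n not in reserved) or sorted(nodes)
    for i, task_id in enumerate(unconstrained):
        assignment[task_id] = remaining[i % len(remaining)]
    return assignment
```

In this sketch, a component whose tasks can run only on tagged machines is spread across all such machines rather than pinned to one, while the other components share the rest of the cluster.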