Abstract: Qinghai Lake is China's largest inland lake and plays a crucial role in the local ecosystem. Effectively monitoring its water body has become a research direction. Most current water body recognition research runs on a single machine, which makes recognition slow and poorly automated. As the volume of remote sensing data grows, these traditional identification methods can no longer meet demand. This study designs and implements an automatic water body recognition system based on the Hadoop and Spark distributed big data frameworks. The system implements functional modules for storing, reading, and processing remote sensing images and for model prediction, among others, and uses shell scripts to automate its execution. To verify the system, three days of remote sensing image data from the Qinghai Lake area were selected. The experimental results show that the system can automatically complete the water body recognition process and accurately predict the water body.