Abstract: This paper analyzes the mechanism of Hadoop, summarizes common memory leak issues, and proposes a method for diagnosing them. The proposed approach identifies the phase in which the memory overflow occurs, the objects that consume the most memory, and the related configuration settings, helping Hadoop users find the root cause of out-of-memory errors. The effectiveness of the approach is evaluated on typical data processing applications for the power grid.