Abstract: This paper analyzes the “distance method”, an algorithm for language features, and identifies its main problem: the “breakpoint” used to make the split into two sets affects the accuracy of the calculation. Drawing on the idea of cluster analysis, the paper designs a new algorithm named the “clustering method”. Each method compensates for the defects of the other, so the two should be used together in practical applications.