Abstract: Text similarity checking is mainly used in duplicate detection for academic papers, deduplication in search engines, and related fields. However, extracting feature items with traditional text similarity methods is cumbersome, and selecting elements at random introduces uncertainty. To solve these problems, a text similarity method based on an improved Jaccard coefficient is proposed. The method takes into account the weights of elements and of samples in the document, as well as their degree of contribution to the similarity among multiple texts. Experimental results suggest that the method is effective and achieves satisfactory accuracy, and that it is applicable to Chinese and English documents of various lengths. It thereby addresses the inexact similarity computation of existing techniques.
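To make the idea of weighting elements concrete, the following minimal sketch contrasts the classical set-based Jaccard coefficient with a generic weighted variant (sum of minimum weights over sum of maximum weights). It is not the improved coefficient proposed in this paper: the use of raw term frequency as the element weight, the whitespace tokenization, and the function names `jaccard` and `weighted_jaccard` are all assumptions made for illustration.

```python
from collections import Counter


def jaccard(tokens_a, tokens_b):
    """Classical Jaccard coefficient: |A ∩ B| / |A ∪ B| on token sets."""
    a, b = set(tokens_a), set(tokens_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty documents
    return len(a & b) / len(a | b)


def weighted_jaccard(tokens_a, tokens_b):
    """Generic weighted Jaccard: sum(min weights) / sum(max weights).

    Assumption: the weight of an element is its term frequency in the
    document. The paper's own weighting scheme may differ.
    """
    wa, wb = Counter(tokens_a), Counter(tokens_b)
    keys = set(wa) | set(wb)
    if not keys:
        return 0.0  # same convention as above
    # Counter returns 0 for elements missing from a document.
    numerator = sum(min(wa[k], wb[k]) for k in keys)
    denominator = sum(max(wa[k], wb[k]) for k in keys)
    return numerator / denominator


if __name__ == "__main__":
    doc_a = "a b b c".split()
    doc_b = "b c c d".split()
    print(jaccard(doc_a, doc_b))
    print(weighted_jaccard(doc_a, doc_b))
```

In this sketch the two documents share two of four distinct tokens, so the set-based coefficient ignores how often each token occurs, while the weighted variant lowers the score when the shared tokens appear with different frequencies.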