Abstract:Web text categorization can be applied to many domains such as information retrieval, news categorization, etc. Decision tree algorithm is a simple method for categorization and has been used extensively. This paper investigates the basic method and process to build a web classifier by means of C4.5 decision tree, which has various merits such as high categorization precision, high categorization speed, etc. Moreover, this paper proposes a C4.5 decision tree based frame of web pages classifier, and implements it on a web crawler. The experimental results show that this algorithm is highly effective.