Graduate University, Chinese Academy of Sciences, Beijing 100049, China; Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang 110168, China
This paper first studies the key techniques of topic-focused web crawlers, then designs and implements a crawler system based on an improved self-adaptive vector space model. The system analyzes documents in terms of both their text and their links. The paper also proposes a web search strategy based on a gene factor combined with manual control, which can solve the problem of blocked search paths. Finally, experimental results are presented to demonstrate the feasibility and advantages of the system in terms of recall and precision.
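As a rough illustration of how a vector space model can score a page against a topic using both its text and its links, the sketch below computes cosine similarity over raw term-frequency vectors. It is a minimal sketch of the plain vector space model, not the improved model described in this paper, whose details the abstract does not give. The function names, the tokenizer, and the `link_weight` blending parameter are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Map lowercase word tokens to raw term frequencies."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as unrelated to everything
    return dot / (norm_a * norm_b)


def page_relevance(topic, body_text, anchor_texts, link_weight=0.3):
    """Blend body-text and link-text similarity to the topic description.

    link_weight is a hypothetical tuning parameter, not a value from the paper.
    """
    topic_vec = term_vector(topic)
    text_score = cosine_similarity(topic_vec, term_vector(body_text))
    link_score = cosine_similarity(topic_vec, term_vector(" ".join(anchor_texts)))
    return (1 - link_weight) * text_score + link_weight * link_score
```

A crawler could use a score of this kind to decide which fetched pages to keep and which outgoing links to follow first, for example by comparing `page_relevance(...)` against a threshold chosen by experiment.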