An Enhanced Odds Ratio Dualistic Feature Extraction Method

    Abstract:

    Feature extraction is an important issue in topical crawler research because it strongly affects topic description and page relevance scoring. The existing Odds Ratio method performs well with high-dimensional feature vectors but poorly with low-dimensional ones. This paper proposes EOR, an enhanced method based on Odds Ratio that also takes word frequency and distribution rate into account. Simulations show a 5% increase in text categorization precision at low and medium feature dimensions. Furthermore, combining the EOR score with the TF value (TF-EOR) to calculate word weights and applying the result to a topical crawler yields a 4% increase in both precision and recall.
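
For orientation, the baseline that EOR builds on can be sketched as follows. This is a minimal illustration of the standard Odds Ratio score, log[P(t|pos)(1 - P(t|neg)) / ((1 - P(t|pos)) P(t|neg))], and not the authors' EOR formula, which the abstract does not give. The function names, the additive smoothing constant, and the plain product used in `tf_weighted` are all assumptions made for this example.

```python
import math
from collections import Counter


def odds_ratio_scores(pos_docs, neg_docs, smoothing=0.5):
    """Score each term by the log odds ratio between on-topic (pos)
    and off-topic (neg) documents. Each document is a list of tokens."""
    n_pos, n_neg = len(pos_docs), len(neg_docs)
    # Document frequency: number of documents in each class containing the term.
    df_pos = Counter(t for doc in pos_docs for t in set(doc))
    df_neg = Counter(t for doc in neg_docs for t in set(doc))
    scores = {}
    for term in set(df_pos) | set(df_neg):
        # Assumption: additive smoothing keeps both estimates strictly
        # inside (0, 1), so the logarithm is always defined.
        p_pos = (df_pos[term] + smoothing) / (n_pos + 2 * smoothing)
        p_neg = (df_neg[term] + smoothing) / (n_neg + 2 * smoothing)
        scores[term] = math.log(p_pos * (1 - p_neg) / ((1 - p_pos) * p_neg))
    return scores


def tf_weighted(doc, scores):
    """Hypothetical combination: raw term frequency times a selection score.
    The abstract does not say how TF and EOR are combined; a product is
    assumed here purely for illustration."""
    tf = Counter(doc)
    return {term: tf[term] * scores.get(term, 0.0) for term in tf}


pos_docs = [["crawler", "topic", "page"], ["crawler", "topic"]]
neg_docs = [["sports", "page"], ["sports", "news"]]
scores = odds_ratio_scores(pos_docs, neg_docs)
weights = tf_weighted(["crawler", "crawler", "page"], scores)
```

Terms that occur mainly in on-topic documents receive positive scores, terms that occur mainly in off-topic documents receive negative scores, and terms spread evenly across both classes score near zero. Any per-term score, including an EOR score computed by the paper's method, could be passed to `tf_weighted` in place of the baseline.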

Get Citation

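Du Yiping (杜一平), Liu Yanjun (刘燕君). An Enhanced Odds Ratio Dualistic Feature Extraction Method. Computer Systems & Applications (计算机系统应用), 2010, 19(2): 106-109.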

History
  • Received: May 18, 2009
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-3
Address: 4# South Fourth Street, Zhongguancun, Haidian, Beijing, Postal Code: 100190
Phone: 010-62661041 Fax: Email: csa (a) iscas.ac.cn
Technical Support: Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063