Abstract: This paper analyses the structural characteristics of the sentences that describe basic enterprise information on enterprise homepages. It proposes a method for extracting information from enterprise homepages based on regular expressions, and develops a prototype system that extracts several enterprise information items. The experimental results show that the system extracts enterprise-related information from enterprise homepages effectively, with high recall and precision.
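
As a rough illustration of the general idea, the following minimal Python sketch applies regular expressions to homepage text and scores the result with precision and recall. The patterns, item names (`founded`, `phone`, `email`), function names, and sample sentence are all hypothetical assumptions chosen for illustration; they are not the expressions or items used in the prototype system.

```python
import re

# Hypothetical patterns for three illustrative information items.
# The abstract does not give the paper's actual expressions.
PATTERNS = {
    "founded": re.compile(r"\b(?:founded|established)\s+in\s+(\d{4})\b", re.IGNORECASE),
    "phone": re.compile(r"\b(?:Tel|Phone)\s*:\s*(\+?\d[\d\- ]{6,}\d)", re.IGNORECASE),
    "email": re.compile(r"\b([\w.+-]+@[\w-]+(?:\.[\w-]+)+)\b"),
}


def extract_items(text):
    """Return the first match of each pattern as an item-name -> value dict."""
    items = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            items[name] = match.group(1)
    return items


def precision_recall(extracted, expected):
    """Compare extracted items against a hand-labelled reference dict."""
    correct = sum(1 for key, value in extracted.items() if expected.get(key) == value)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(expected) if expected else 0.0
    return precision, recall


# Made-up sample text standing in for the visible text of a homepage.
sample = (
    "Acme Widgets Co., Ltd. was founded in 1998. "
    "Tel: +86-10-5550-0100. Email: info@acme.example"
)

# Hand-labelled reference; the address is deliberately not covered by any pattern.
reference = {
    "founded": "1998",
    "phone": "+86-10-5550-0100",
    "email": "info@acme.example",
    "address": "1 Example Road",
}

extracted = extract_items(sample)
precision, recall = precision_recall(extracted, reference)
```

In this sketch, precision is the share of extracted items that match the reference, and recall is the share of reference items that were extracted, so an item with no matching pattern lowers recall but not precision.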