Abstract: Spanish is one of the six working languages of the United Nations and has the second-largest number of native speakers, after Chinese. Its complex morphological changes and grammatical rules cause classic term extraction methods such as C-value to perform poorly, which in turn degrades Spanish text analysis. This study proposes a Spanish term extraction method that automatically constructs a complete lexicon for text modeling. Given a Spanish text or corpus, the method extracts terms in three steps: preprocessing the texts, extracting candidate terms, and calculating a term-hood index for each candidate term based on DC-value. The set of candidate terms obtained in the first two steps can be used directly as the lexicon for text mining. The term-hood indexes obtained in the third step rank the candidates, which reduces the manual workload of deciding whether each candidate is a genuine term. In experiments, the proposed method achieves an accuracy of 80% and a recall much higher than that of classic methods, providing an effective lexicon for Spanish text mining.
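For orientation, the classic C-value baseline mentioned above can be sketched as follows. This is only an illustration of the standard C-value score, not of the DC-value proposed in this study, whose definition is not given in the abstract. The names `is_nested`, `c_value`, and `toy_freq`, as well as the candidate terms and their counts, are hypothetical and chosen purely for the example.

```python
import math
from typing import Dict, Tuple

Term = Tuple[str, ...]  # a candidate term as a tuple of tokens


def is_nested(short: Term, long: Term) -> bool:
    """Return True if `short` is a contiguous sub-sequence of the strictly longer `long`."""
    n, m = len(short), len(long)
    if n >= m:
        return False
    return any(long[i:i + n] == short for i in range(m - n + 1))


def c_value(freq: Dict[Term, int]) -> Dict[Term, float]:
    """Classic C-value for each candidate, given its corpus frequency.

    Not nested:  log2|a| * f(a)
    Nested:      log2|a| * (f(a) - mean frequency of the longer candidates containing a)
    Note: with this weight, single-word candidates always score 0.
    """
    scores: Dict[Term, float] = {}
    for a, f_a in freq.items():
        # Frequencies of all longer candidates that contain `a`.
        containers = [f_b for b, f_b in freq.items() if is_nested(a, b)]
        weight = math.log2(len(a))
        if containers:
            scores[a] = weight * (f_a - sum(containers) / len(containers))
        else:
            scores[a] = weight * f_a
    return scores


# Hypothetical candidate terms and counts (illustrative only, not from the study).
toy_freq: Dict[Term, int] = {
    ("extracción", "de", "términos"): 6,
    ("extracción", "de", "términos", "automática"): 2,
}
scores = c_value(toy_freq)
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(" ".join(term), round(score, 3))
```

Because the score depends on exact token sequences and their frequencies, inflected variants of the same Spanish term (for example singular and plural forms) are counted as separate candidates unless they are normalized during preprocessing, which is one plausible reason such methods struggle on Spanish text.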