Abstract: Keyword extraction is the foundation of corpus construction, text analysis, and information retrieval. The traditional TF-IDF algorithm extracts keywords mainly by weighting word frequency and does not consider other text features. This excessive reliance on word frequency makes the extracted keywords inaccurate. To solve this problem, we propose an improved algorithm that uses word position and word information as factors to recalculate the weight, and we implement it in Python. Experiments show that extracting keywords with this method improves the recall rate, accuracy, and F-measure.
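
To make the idea concrete, the following is a minimal sketch of how a position factor might rescale a plain TF-IDF weight. It is not the algorithm evaluated in this paper: the function names, the `title_boost` parameter, and the rule "multiply the weight when the word also occurs in the title" are all illustrative assumptions, and the word information factor is omitted because the abstract does not define it.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Plain TF-IDF. docs is a list of token lists; returns one {word: score} dict per document."""
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each word.
    df = Counter(word for doc in docs for word in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc)
        scores.append({
            word: (count / total) * math.log(n_docs / df[word])
            for word, count in tf.items()
        })
    return scores


def position_weighted_scores(titles, bodies, title_boost=2.0):
    """Hypothetical position factor (an assumption, not the paper's formula):
    multiply a word's TF-IDF score by title_boost if the word occurs in the title."""
    docs = [title + body for title, body in zip(titles, bodies)]
    base = tfidf_scores(docs)
    weighted = []
    for title, doc_scores in zip(titles, base):
        title_words = set(title)
        weighted.append({
            word: score * (title_boost if word in title_words else 1.0)
            for word, score in doc_scores.items()
        })
    return weighted


def top_keywords(scores, k=3):
    """Return the k highest-scoring words, breaking ties alphabetically."""
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


if __name__ == "__main__":
    titles = [["keyword", "extraction"], ["image", "retrieval"]]
    bodies = [
        ["frequency", "frequency", "frequency", "keyword", "text"],
        ["image", "text", "pixel"],
    ]
    plain = tfidf_scores([t + b for t, b in zip(titles, bodies)])
    weighted = position_weighted_scores(titles, bodies)
    print(top_keywords(plain[0]))
    print(top_keywords(weighted[0]))
```

In this toy input, a word that is merely frequent in the body competes with a less frequent word that also appears in the title; the boost lets the title word rank higher. The value of `title_boost` is arbitrary here and would need to be tuned on real data.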