Abstract: To address the lack of short answer grading systems in multilingual teaching, this paper proposes an automatic short answer grading system based on a siamese network and the bidirectional encoder representations from transformers (BERT) model. First, the preprocessed question and answer texts are encoded into sentence vectors by the BERT model. Because the BERT model has been pretrained on a large-scale multilingual corpus, the resulting text vectors contain rich contextual semantic information and can handle multilingual input. Then, the sentence vectors of the question and answer texts are fed into a deep siamese network, which computes their semantic similarity. Finally, a logistic regression classifier uses this similarity to assign the grade. The experiments use the automatic short answer grading datasets provided by the Hewlett Foundation, with the quadratic weighted kappa coefficient as the evaluation metric. The results show that the proposed method outperforms the baseline models on every data subset.
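For illustration only, the following is a minimal sketch of such a pipeline: multilingual BERT sentence vectors, a similarity feature between the reference and student answers, a logistic regression grader, and quadratic weighted kappa for evaluation. The checkpoint name (bert-base-multilingual-cased), the mean pooling, the fixed cosine-similarity feature, and the toy data are assumptions made for exposition, not the authors' implementation; in the paper the similarity is computed inside a trained deep siamese network.

```python
# Sketch of a BERT + siamese-style similarity + logistic regression grader.
# Assumptions (not from the paper): bert-base-multilingual-cased, mean pooling,
# a single cosine-similarity feature, and toy data in place of the Hewlett
# Foundation short answer grading dataset.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import cohen_kappa_score

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
bert = AutoModel.from_pretrained("bert-base-multilingual-cased")

def encode(texts):
    """Mean-pool BERT token embeddings into one sentence vector per text."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = bert(**batch).last_hidden_state           # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()   # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)            # (B, H)

def similarity_features(references, answers):
    """One cosine-similarity feature per (reference answer, student answer) pair."""
    ref_vecs, ans_vecs = encode(references), encode(answers)
    sims = torch.nn.functional.cosine_similarity(ref_vecs, ans_vecs)
    return sims.unsqueeze(1).numpy()

# Toy graded examples: (reference answer, student answer, score).
refs    = ["Water boils at 100 degrees Celsius.", "Water boils at 100 degrees Celsius."]
answers = ["It boils at one hundred Celsius.", "Water freezes at zero degrees."]
scores  = [2, 0]

X = similarity_features(refs, answers)
clf = LogisticRegression().fit(X, scores)
pred = clf.predict(X)

# Quadratic weighted kappa, the evaluation metric named in the abstract.
print(cohen_kappa_score(scores, pred, weights="quadratic"))
```

This sketch only mirrors the overall flow described in the abstract (encode, compare, classify, evaluate); the paper's siamese branches and similarity computation are learned end to end rather than fixed as cosine similarity.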