Abstract: Existing research on neural-network-based battery state of charge (SOC) prediction focuses mostly on optimizing model structure and related parameters and overlooks the important role of the training data. To address this gap, a battery SOC prediction method based on feature selection and data augmentation is proposed. Specifically, features are engineered from the original battery charge and discharge data, and the seven features that contribute most to model prediction are selected by the permutation importance (PI) method; Gaussian noise is then added to the training data to increase the number of training samples, thereby achieving data augmentation. In the experiments, a bidirectional long short-term memory (Bi-LSTM) network is used as the prediction model, and the Panasonic 18650PF dataset is adopted as the training data. With the standard Bi-LSTM model, the mean absolute error (MAE) and the maximum error (MaxE) are 0.65% and 3.92%, respectively. After feature selection and data augmentation, the MAE and MaxE fall to 0.47% and 2.62%, respectively, indicating that PI-based feature selection and Gaussian data augmentation can further improve the accuracy of the battery SOC prediction model.
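The two data-side steps named in the abstract, permutation importance and Gaussian-noise augmentation, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the noise standard deviation `sigma`, the number of noisy `copies`, the number of shuffle repeats, and the use of MAE as the importance score are all assumptions made for illustration, and `predict` stands in for any trained model, such as the Bi-LSTM. Only the Python standard library is used, so that the sketch stays self-contained.

```python
import random


def augment_with_gaussian_noise(X, y, copies=1, sigma=0.01, seed=0):
    """Return the original samples followed by `copies` noisy duplicates.

    Zero-mean Gaussian noise with standard deviation `sigma` (an assumed
    value, not taken from the paper) is added to every feature; the
    targets are copied unchanged.
    """
    rng = random.Random(seed)
    X_aug = [list(row) for row in X]
    y_aug = list(y)
    for _ in range(copies):
        for row, target in zip(X, y):
            X_aug.append([v + rng.gauss(0.0, sigma) for v in row])
            y_aug.append(target)
    return X_aug, y_aug


def mean_absolute_error(predict, X, y):
    """MAE of a per-sample prediction function over a dataset."""
    return sum(abs(predict(row) - t) for row, t in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Average increase in MAE when each feature column is shuffled.

    A larger value means the model relies more on that feature.
    """
    rng = random.Random(seed)
    baseline = mean_absolute_error(predict, X, y)
    scores = []
    for j in range(len(X[0])):
        increase = 0.0
        for _ in range(n_repeats):
            # Shuffle only column j, keeping all other features intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [
                list(row[:j]) + [v] + list(row[j + 1:])
                for row, v in zip(X, column)
            ]
            increase += mean_absolute_error(predict, X_perm, y) - baseline
        scores.append(increase / n_repeats)
    return scores
```

In a pipeline of this kind, the features would be ranked by their scores, the top-ranked ones kept (seven in the abstract), and the augmentation applied to the training split only, so that the evaluation data remain untouched.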