Abstract: The application of artificial intelligence has been driving productivity gains and technological change across industries. Traditional industries, however, face small-sample and imbalanced-data problems because samples are rare, costly to collect, and subject to privacy constraints. Existing sample generation methods are often limited in how well they balance generalization and validity. The proposed virtual sample generation method extracts the semantic meaning of a VAE's latent variables by using the weights of the encoder neural network as a measure of the dependency between input features and latent variables. It achieves flexible sample generation by explicitly controlling individual dimensions of the latent variables. The generated samples satisfy the population distribution but are not necessarily contained in the original sample set. Results of sample expansion on civil building structural safety databases show that the method is capable of controllable generation of valid samples and mitigates the problems of small samples and imbalanced data.
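The idea of reading feature-to-latent dependency from encoder weights and then varying one latent dimension can be illustrated with a minimal sketch. It is not the method's actual implementation: it assumes a single linear encoder layer, uses row-normalized absolute weights as the dependency score (the abstract does not say how weights are aggregated), and the names `feature_dependency`, `traverse_latent`, and `W_mu`, as well as all numbers, are hypothetical.

```python
def feature_dependency(weights):
    """Row-normalize absolute encoder weights.

    weights[k][j] is the (hypothetical) weight linking input feature j
    to latent dimension k in a single linear encoder layer.
    """
    scores = []
    for row in weights:
        total = sum(abs(w) for w in row)
        # Guard against an all-zero row to avoid division by zero.
        scores.append([abs(w) / total if total else 0.0 for w in row])
    return scores


def traverse_latent(z, dim, values):
    """Return copies of latent vector z with dimension `dim` set to each value."""
    out = []
    for v in values:
        z_new = list(z)  # copy so the base vector is left unchanged
        z_new[dim] = v
        out.append(z_new)
    return out


# Toy weights: 2 latent dimensions, 3 input features (made-up numbers).
W_mu = [[0.8, -0.1, 0.1],
        [0.0, 0.5, -0.5]]
dependency = feature_dependency(W_mu)

# Candidate latent codes that vary only latent dimension 0.
candidates = traverse_latent([0.0, 0.0], dim=0, values=[-1.0, 0.0, 1.0])
```

In this toy case, latent dimension 0 depends mostly on the first feature, so passing `candidates` through a trained decoder (not shown) would be expected to vary mainly that feature in the generated samples.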