Abstract: Data plays a central role in research and development in fields such as machine learning and artificial intelligence. However, real-world factors such as privacy concerns, data scarcity, and poor data quality often prevent data consumers from obtaining real datasets that meet their requirements. To address this, this study develops a non-normal data synthesis algorithm (KMSI) that improves on the sampling-iteration (SI) technique. The algorithm uses a mixed-type correlation coefficient matrix to reduce measurement errors in several steps of the SI technique, including target setting and the control loops. It also replaces Bootstrap sampling with kernel density estimation sampling, so that real observations are not copied into the output. Experimental results show that, compared with the SI technique, KMSI can handle complex, mixed-type datasets and does not include real data in the synthetic results. Furthermore, compared with other enhancement methods, KMSI gives users more freedom to choose the sample size of the synthetic dataset.
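To illustrate why replacing Bootstrap sampling with kernel density estimation sampling keeps real values out of the output, the following is a minimal sketch and not the KMSI implementation. It assumes a single continuous column, a Gaussian kernel, and Scott's rule for the bandwidth. The helper names `bootstrap_sample` and `kde_sample` and the exponential "real" column are hypothetical and chosen only for illustration.

```python
import random
import statistics


def bootstrap_sample(data, size, rng):
    """Resample with replacement: every output value is a real observation."""
    return [rng.choice(data) for _ in range(size)]


def kde_sample(data, size, rng):
    """Draw `size` values from a Gaussian KDE fitted to `data`.

    Sampling from a Gaussian KDE is equivalent to picking a real point at
    random and adding Gaussian noise whose standard deviation is the
    bandwidth. The bandwidth follows Scott's rule for 1-D data, which is
    an assumption of this sketch.
    """
    n = len(data)
    bandwidth = statistics.stdev(data) * n ** (-1 / 5)
    return [rng.gauss(rng.choice(data), bandwidth) for _ in range(size)]


rng = random.Random(0)

# Hypothetical skewed (non-normal) "real" column.
real = [rng.expovariate(1.0) for _ in range(200)]

# The synthetic sample size is a free parameter in both samplers.
boot = bootstrap_sample(real, 500, rng)
synth = kde_sample(real, 500, rng)

# Count how many synthetic values are exact copies of real values.
real_set = set(real)
boot_overlap = sum(v in real_set for v in boot)
kde_overlap = sum(v in real_set for v in synth)

print("bootstrap values copied from real data:", boot_overlap)
print("KDE values copied from real data:", kde_overlap)
```

By construction, every Bootstrap draw is a copy of a real observation, whereas each KDE draw is a real value perturbed by continuous noise, so an exact match with a real value is extremely unlikely. A Gaussian kernel can place mass outside the support of the data (here, below zero), so a bounded column would likely need a different kernel or a transformation.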