Research on the generalization performance of ensemble learning has been highly successful, but the error analysis of ensemble learning still needs further study. Because cross-validation plays an important role in evaluating model performance in statistical machine learning, blocked 3×2 cross-validation and k-fold cross-validation are used to build ensembles by combining weighted predictions for each sample point, and the resulting errors are analyzed. Experiments on simulated and real data show that the ensemble based on blocked 3×2 cross-validation has a smaller prediction error than a single learner, and that the variance of the ensemble is smaller than that of a single learner. The generalization error of the ensemble based on blocked 3×2 cross-validation is also smaller than that of the ensemble based on k-fold cross-validation, which indicates that the ensemble learning model based on blocked 3×2 cross-validation has good stability.
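
The construction summarized above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are divided into four blocks, the three ways of pairing those blocks give three two-fold splits, and the six resulting half-sample models are combined by a weighted average. The abstract does not specify the weighting scheme or the base learner, so uniform weights and an ordinary least-squares regressor are assumed here, and all function names (`blocked_3x2_splits`, `fit_linear`, `predict_linear`, `ensemble_predict`) are hypothetical.

```python
import numpy as np

def blocked_3x2_splits(n, rng):
    """Return 6 (train_idx, test_idx) pairs: 3 two-fold splits built from 4 blocks."""
    blocks = np.array_split(rng.permutation(n), 4)
    # The three ways to pair four blocks into two halves.
    pairings = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]
    splits = []
    for a, b in pairings:
        half_a = np.concatenate([blocks[i] for i in a])
        half_b = np.concatenate([blocks[i] for i in b])
        splits.append((half_a, half_b))  # train on half A, hold out half B
        splits.append((half_b, half_a))  # train on half B, hold out half A
    return splits

def fit_linear(X, y):
    """Least-squares fit with intercept (assumed stand-in for the base learner)."""
    Xb = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def ensemble_predict(X_train, y_train, X_new, rng, weights=None):
    """Weighted average of the six half-sample models' predictions."""
    splits = blocked_3x2_splits(len(X_train), rng)
    preds = np.stack([
        predict_linear(fit_linear(X_train[tr], y_train[tr]), X_new)
        for tr, _ in splits
    ])
    if weights is None:
        # Assumption: uniform weights; the source does not state its weighting.
        weights = np.full(len(splits), 1.0 / len(splits))
    return weights @ preds

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
beta = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(200, 3))
y = X @ beta + rng.normal(scale=0.1, size=200)
X_new = rng.normal(size=(5, 3))
y_hat = ensemble_predict(X, y, X_new, rng)
```

Because each pair of half-sample training sets in this scheme overlaps in exactly half of its points (or not at all), the six models are less correlated than k-fold models that share most of their training data, which is one plausible reason for the variance reduction the abstract reports.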