Abstract: 3D human pose estimation from depth sensors is an important research topic in computer vision, with applications in human-computer interaction, virtual reality, and animation. The most successful approaches to this problem are bottom-up methods, which predict 3D poses using classification, regression, or retrieval techniques and are widely used in human-computer interaction. However, these methods rely on a large pre-captured human pose database, and their predictions are often inaccurate. In this paper, we propose to estimate 3D human pose from monocular depth images using personalized 3D human models. We first reconstruct a 3D virtual human model for each subject. In the pose estimation phase, we reconstruct an incomplete mesh from the depth data and estimate correspondences between points of the 3D human model and the incomplete mesh. The optimal 3D pose is then estimated by iteratively optimizing an objective function. Unlike bottom-up methods, our method requires no pre-captured pose dataset. Our experiments verify that our results are more accurate than those of competing methods.
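To give a rough flavor of the correspondence-then-optimize loop the abstract describes, the sketch below implements a minimal rigid ICP alignment in Python with NumPy. This is only an illustration under simplifying assumptions: the paper's method fits an articulated, personalized human model with its own objective function, whereas this toy version estimates a single rigid transform; all function names here are our own, not the paper's.

```python
import numpy as np

def nearest_correspondences(model_pts, scan_pts):
    # For each model point, find the closest scan point (brute-force search).
    d2 = ((model_pts[:, None, :] - scan_pts[None, :, :]) ** 2).sum(-1)
    return scan_pts[d2.argmin(axis=1)]

def rigid_fit(src, dst):
    # Kabsch algorithm: least-squares rotation R and translation t
    # minimizing ||R @ src + t - dst||^2 over the correspondences.
    mu_s, mu_d = src.mean(0), dst.mean(0)
    U, _, Vt = np.linalg.svd((src - mu_s).T @ (dst - mu_d))
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:       # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, mu_d - R @ mu_s

def icp_pose(model_pts, scan_pts, iters=20):
    # Alternate correspondence estimation and transform optimization,
    # mirroring the iterative scheme sketched in the abstract.
    pts = model_pts.copy()
    for _ in range(iters):
        corr = nearest_correspondences(pts, scan_pts)
        R, t = rigid_fit(pts, corr)
        pts = pts @ R.T + t
    return pts
```

In the articulated setting, the single rigid transform would be replaced by per-joint pose parameters, and the incomplete depth mesh makes robust correspondence weighting necessary; this sketch only shows the basic alternation structure.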