Abstract: This study proposes a multi-modal, deep-level, high-confidence fusion tracking algorithm to address the tracking failures caused by changes in target appearance and environment in single-target tracking. First, a high-dimensional multi-modal model is constructed by combining the target’s color model with a shape model based on HOG features with bilinear interpolation. Candidate targets are then searched using particle filtering. To address the difficulty of model fusion, the confidences of the shape and color models are quantified over a range of levels, and a high-confidence fusion criterion is introduced that enables deeply adaptive, weighted, and balanced fusion of the multi-modal model at different confidence levels. To overcome the limitation of static model update parameters, a nonlinear, graded, balanced update strategy is designed. On the OTB-2015 dataset, the algorithm achieves an average center location error (CLE) of 30.57 and an average overlap score (OS) of 0.609, outperforming all reference algorithms on both metrics. At 15.67 FPS, it also meets the real-time requirements of tracking under most conditions. In several common specific scenarios, its accuracy and success rate exceed those of top-tier algorithms in most cases.
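For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how CLE and OS are conventionally computed per frame and averaged over a sequence. It is not the authors' evaluation code: the function names are hypothetical, boxes are assumed to be `(x, y, w, h)` tuples in pixels, and OS is assumed to be the intersection-over-union of the predicted and ground-truth boxes, which may differ in detail from the definition used in this study.

```python
import math

def center_location_error(pred, gt):
    """Euclidean distance between the centers of two (x, y, w, h) boxes."""
    px, py = pred[0] + pred[2] / 2.0, pred[1] + pred[3] / 2.0
    gx, gy = gt[0] + gt[2] / 2.0, gt[1] + gt[3] / 2.0
    return math.hypot(px - gx, py - gy)

def overlap_score(pred, gt):
    """Intersection-over-union of two (x, y, w, h) boxes (assumed OS definition)."""
    # Width and height of the intersection rectangle, clamped at zero.
    iw = max(0.0, min(pred[0] + pred[2], gt[0] + gt[2]) - max(pred[0], gt[0]))
    ih = max(0.0, min(pred[1] + pred[3], gt[1] + gt[3]) - max(pred[1], gt[1]))
    inter = iw * ih
    union = pred[2] * pred[3] + gt[2] * gt[3] - inter
    return inter / union if union > 0 else 0.0

def sequence_averages(preds, gts):
    """Average CLE and OS over paired per-frame boxes of one sequence."""
    if not preds or len(preds) != len(gts):
        raise ValueError("preds and gts must be non-empty and equally long")
    cle = [center_location_error(p, g) for p, g in zip(preds, gts)]
    osc = [overlap_score(p, g) for p, g in zip(preds, gts)]
    return sum(cle) / len(cle), sum(osc) / len(osc)
```

A lower average CLE and a higher average OS both indicate better tracking, which is the sense in which the values above are compared against the reference algorithms.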