Abstract: Long-term object tracking remains considerably more challenging than short-term tracking. Existing long-term tracking algorithms often perform poorly when targets frequently disappear and reappear, or when the target's appearance changes drastically. This study proposes a robust, real-time long-term tracking framework that combines a local search module with a global search tracking module. The local search module uses the TransT short-term tracker to generate a set of candidate boxes, selects the best candidate by confidence score, and estimates the position and size of the target. A novel global search tracking module, built on the Faster R-CNN model, performs global re-detection; Non-Local operations and multi-level instance feature fusion modules are introduced in the RPN and R-CNN stages to fully exploit instance-level features of the target. To improve the performance of the global search tracking module, a dual-template update strategy is designed: by using templates updated at different time points, the tracker adapts better to changes in the target and becomes more robust. In addition, a ranking loss function is introduced for the global search tracker, so that it implicitly learns the similarity between region proposals and the original query target. Target presence is determined from the local or global confidence score, and this decision selects whether the local or the global search strategy is applied in the next frame. Extensive experiments are conducted on multiple tracking datasets to comprehensively assess the proposed tracking framework. The results consistently demonstrate that the proposed tracking framework achieves satisfactory performance.
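The switching between local and global search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `Result`, `track`, `local_search`, and `global_search`, and the single threshold `presence_threshold` with its default of 0.5, are hypothetical and introduced only for illustration; the abstract does not state how the confidence score is thresholded. The two searchers are passed in as plain callables standing in for the TransT-based local module and the Faster R-CNN-based global module.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


@dataclass
class Result:
    """Best candidate box of one search step and its confidence score."""
    box: Box
    score: float


# A searcher maps a frame to its best candidate (hypothetical interface).
Searcher = Callable[[object], Result]


def track(
    frames: Iterable[object],
    local_search: Searcher,
    global_search: Searcher,
    presence_threshold: float = 0.5,  # assumed value, not from the paper
) -> List[Tuple[str, bool, Box]]:
    """Run one searcher per frame and pick the strategy for the next frame.

    Returns one (mode, target_present, box) tuple per frame.
    """
    outputs = []
    use_global = False  # assume the target is visible in the first frame
    for frame in frames:
        mode = "global" if use_global else "local"
        result = global_search(frame) if use_global else local_search(frame)
        # Target presence is decided from the active module's confidence.
        present = result.score >= presence_threshold
        outputs.append((mode, present, result.box))
        # Target lost: re-detect globally; target found: search locally.
        use_global = not present
    return outputs
```

With stub searchers that return confidence scores of 0.9, 0.2, 0.3, 0.8, and 0.7 on five consecutive frames, the sketch runs local search on the first two frames, switches to global re-detection for the next two after the score drops below the threshold, and returns to local search once the target is found again. A complete tracker would also re-initialize the local module at the globally detected box and apply the dual-template update, both of which are omitted here.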