Abstract
RGBT tracking, which combines visible (RGB) and thermal infrared imagery, offers a robust approach to all-weather target tracking. However, traditional fusion methods often struggle in complex scenes with varying conditions. In this paper, we propose a Visual State-Space Module that leverages Mamba's linear-complexity long-range modeling to make feature extraction more robust. We also introduce a Multi-Scale Fusion Mechanism that improves the efficiency and accuracy of feature fusion in RGBT tracking. The mechanism sums the outputs of several convolutions to produce a comprehensive feature map, which captures multi-scale information more effectively and strengthens the model's feature representation and discriminative capability. We evaluate the method in extensive experiments on five publicly available datasets. The results show that it offers advantages over existing methods, especially in challenging scenes with background clutter and illumination variation, where it tracks targets more stably and reliably. It thus provides a more efficient and robust solution for complex tracking tasks under different environmental conditions.
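The idea of building a fused feature map by summing several convolution results can be illustrated with a minimal sketch. The abstract does not specify the architecture, so everything here is an assumption made for illustration: the kernel sizes (1, 3, 5), the averaging kernels standing in for learned weights, the element-wise addition of the RGB and thermal maps before the multi-scale step, and all function names (`conv2d_same`, `box_kernel`, `multi_scale_sum`, `fuse_rgbt`). It should not be read as the authors' implementation.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def conv2d_same(image: Matrix, kernel: Matrix) -> Matrix:
    """Zero-padded 2D cross-correlation that keeps the input size (odd square kernels)."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    # Positions outside the image count as zero padding.
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def box_kernel(size: int) -> Matrix:
    """Hypothetical stand-in for a learned kernel: a normalized averaging filter."""
    return [[1.0 / (size * size)] * size for _ in range(size)]


def multi_scale_sum(feature: Matrix, sizes: Sequence[int] = (1, 3, 5)) -> Matrix:
    """Sum the outputs of convolutions with different receptive fields."""
    h, w = len(feature), len(feature[0])
    fused = [[0.0] * w for _ in range(h)]
    for size in sizes:
        branch = conv2d_same(feature, box_kernel(size))
        for i in range(h):
            for j in range(w):
                fused[i][j] += branch[i][j]
    return fused


def fuse_rgbt(rgb: Matrix, thermal: Matrix, sizes: Sequence[int] = (1, 3, 5)) -> Matrix:
    """Assumed fusion rule: add the two modalities, then apply the multi-scale sum."""
    combined = [[r + t for r, t in zip(rgb_row, th_row)]
                for rgb_row, th_row in zip(rgb, thermal)]
    return multi_scale_sum(combined, sizes)


if __name__ == "__main__":
    rgb = [[1.0] * 5 for _ in range(5)]
    thermal = [[1.0] * 5 for _ in range(5)]
    fused = fuse_rgbt(rgb, thermal)
    print(len(fused), len(fused[0]))
```

Each branch sees the same input at a different receptive field, and because the branches are summed rather than concatenated, the fused map keeps the spatial size of the input. A real tracker would use learned multi-channel convolutions in a deep learning framework instead of these fixed single-channel filters.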
Full text links
Read article at publisher's site: https://doi.org/10.21203/rs.3.rs-5359152/v1
Read article for free, from open access legal sources, via Unpaywall: https://www.researchsquare.com/article/rs-5359152/latest.pdf