Channel Exchanging Networks for Multimodal and Multitask Dense Image Prediction

Wang, Yikai; Sun, Fuchun; Huang, Wenbing; He, Fengxiang; Tao, Dacheng

arXiv:2112.02252 (cs)

[Submitted on 4 Dec 2021 (v1), last revised 4 Oct 2022 (this version, v2)]

Abstract: Multimodal fusion and multitask learning are two vital topics in machine learning. Despite fruitful progress, existing methods for both problems remain brittle to the same challenge: it is difficult to integrate the common information across modalities (resp. tasks) while preserving the specific patterns of each modality (resp. task). Moreover, although the two problems are closely related, multimodal fusion and multitask learning have rarely been explored within the same methodological framework. In this paper, we propose the Channel-Exchanging-Network (CEN), which is self-adaptive, parameter-free, and, more importantly, applicable to both multimodal and multitask dense image prediction. At its core, CEN adaptively exchanges channels between subnetworks of different modalities. Specifically, the channel exchanging process is self-guided by individual channel importance, which is measured by the magnitude of the Batch-Normalization (BN) scaling factor during training. For dense image prediction, the validity of CEN is tested in four different scenarios: multimodal fusion, cycle multimodal fusion, multitask learning, and multimodal multitask learning. Extensive experiments on semantic segmentation via RGB-D data and image translation through multi-domain input verify the effectiveness of CEN compared to state-of-the-art methods. Detailed ablation studies have also been carried out, which demonstrate the advantage of each component we propose. Our code is available at this https URL.
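
The channel exchanging idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `exchange_channels`, the fixed `threshold`, and the rule of replacing a low-importance channel with the same-index channel of the other subnetwork are all assumptions made for illustration. The abstract states only that channel importance is measured by the magnitude of the BN scaling factor.

```python
from typing import List, Sequence, Tuple

Channel = List[float]  # one flattened feature map; the shape is illustrative only


def exchange_channels(
    feat_a: Sequence[Channel],
    feat_b: Sequence[Channel],
    gamma_a: Sequence[float],
    gamma_b: Sequence[float],
    threshold: float = 1e-2,  # hypothetical cut-off, not taken from the paper
) -> Tuple[List[Channel], List[Channel]]:
    """Swap in the other subnetwork's channel wherever |gamma| is small.

    gamma_a and gamma_b stand for the per-channel BN scaling factors of
    the two subnetworks. A channel whose scaling factor has a magnitude
    below `threshold` is treated as unimportant (an assumed rule) and is
    replaced by the channel at the same index in the other subnetwork.
    """
    if not (len(feat_a) == len(feat_b) == len(gamma_a) == len(gamma_b)):
        raise ValueError("features and scaling factors must have equal length")

    out_a: List[Channel] = []
    out_b: List[Channel] = []
    for ch_a, ch_b, g_a, g_b in zip(feat_a, feat_b, gamma_a, gamma_b):
        # Decisions for both subnetworks use the original, un-exchanged inputs.
        out_a.append(list(ch_b) if abs(g_a) < threshold else list(ch_a))
        out_b.append(list(ch_a) if abs(g_b) < threshold else list(ch_b))
    return out_a, out_b


# Toy example: three channels per modality, two values per channel.
rgb = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
depth = [[10.0, 10.0], [20.0, 20.0], [30.0, 30.0]]
gamma_rgb = [0.9, 0.001, 0.5]     # RGB channel 1 has a near-zero factor
gamma_depth = [-0.002, 0.7, 0.4]  # depth channel 0 has a near-zero factor

new_rgb, new_depth = exchange_channels(rgb, depth, gamma_rgb, gamma_depth)
```

In this toy run, RGB channel 1 is taken from the depth subnetwork and depth channel 0 is taken from the RGB subnetwork; all other channels are left unchanged. The exchange step itself introduces no learnable parameters, which is consistent with the "parameter-free" description in the abstract.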

Comments:	Accepted by TPAMI 2022. Code is available at this https URL. arXiv admin note: text overlap with arXiv:2011.05005
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2112.02252 [cs.CV]
	(or arXiv:2112.02252v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2112.02252

Submission history

From: Yikai Wang
[v1] Sat, 4 Dec 2021 05:47:54 UTC (6,904 KB)
[v2] Tue, 4 Oct 2022 05:50:45 UTC (8,324 KB)
