Dynamic Pre-training: Towards Efficient and Scalable All-in-One Image Restoration

Dudhane, Akshay; Thawakar, Omkar; Zamir, Syed Waqas; Khan, Salman; Khan, Fahad Shahbaz; Yang, Ming-Hsuan

Computer Science > Computer Vision and Pattern Recognition

arXiv:2404.02154 (cs)

[Submitted on 2 Apr 2024 (v1), last revised 13 Oct 2024 (this version, v2)]

Abstract:All-in-one image restoration tackles different types of degradation with a single unified model instead of a separate task-specific model for each degradation. Requiring one model to handle multiple degradations can lead to high-complexity designs with a fixed configuration that cannot adapt to more efficient alternatives. We propose DyNet, a dynamic family of networks designed in an encoder-decoder style for all-in-one image restoration tasks. DyNet can seamlessly switch between its bulkier and lightweight variants, offering flexibility for efficient model deployment with a single round of training. This switching is enabled by our weight-sharing mechanism, which forms the core of the architecture and facilitates the reuse of initialized module weights. Further, to establish robust weight initialization, we introduce a dynamic pre-training strategy that trains the variants of DyNet concurrently, thereby achieving a 50% reduction in GPU hours. This strategy eliminates the need to maintain separate checkpoints for each variant, as all models share a common set of checkpoints and vary only in model depth, which significantly reduces storage overhead and enhances adaptability. To address the lack of a large-scale dataset for pre-training, we curate a high-quality, high-resolution image dataset named Million-IRD, containing 2M image samples. We validate DyNet on image denoising, deraining, and dehazing in the all-in-one setting, achieving state-of-the-art results with a 31.34% reduction in GFLOPs and a 56.75% reduction in parameters compared to baseline models. The source code and trained models are available at this https URL.
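
The abstract describes variants that share one set of weights and differ only in depth. The following is a minimal sketch of that general idea, not the authors' implementation: a single block's weights are reused a configurable number of times, so a lighter and a bulkier variant have the same parameter count and can be served from one checkpoint. The names `SharedBlock`, `DepthSwitchableNet`, and the `depth` argument are hypothetical, and the toy residual block stands in for whatever modules DyNet actually uses.

```python
import random


class SharedBlock:
    """Toy residual block computing x + relu(W x); holds the only weights."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        # Hypothetical initialization; a real model would load pre-trained weights.
        self.weight = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                       for _ in range(dim)]

    def __call__(self, x):
        out = []
        for row, xi in zip(self.weight, x):
            pre = sum(w * v for w, v in zip(row, x))  # one row of W x
            out.append(xi + max(0.0, pre))            # residual + ReLU
        return out

    def num_params(self):
        return sum(len(row) for row in self.weight)


class DepthSwitchableNet:
    """Applies the same block `depth` times; depth is chosen at call time."""

    def __init__(self, dim, seed=0):
        self.block = SharedBlock(dim, seed)  # single shared set of weights

    def forward(self, x, depth):
        for _ in range(depth):
            x = self.block(x)  # reuse the same weights at every step
        return x

    def num_params(self):
        # Independent of depth, because every step reuses self.block.
        return self.block.num_params()


net = DepthSwitchableNet(dim=4)
x = [1.0, -0.5, 0.25, 2.0]
y_light = net.forward(x, depth=2)  # lighter variant: fewer block applications
y_bulky = net.forward(x, depth=6)  # bulkier variant: same weights, more compute
```

In this sketch, switching variants changes only the number of block applications (and therefore compute), while the stored parameters stay the same, which is the property that lets several variants share one checkpoint.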

Comments:	This version includes updates where the DyNet variants now share the same weights during inference as well, eliminating the need to store separate weights and thereby reducing device storage requirements. Additionally, all results have been updated based on the new experimental setup.
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2404.02154 [cs.CV]
	(or arXiv:2404.02154v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2404.02154

Submission history

From: Akshay Dudhane [view email]
[v1] Tue, 2 Apr 2024 17:58:49 UTC (3,839 KB)
[v2] Sun, 13 Oct 2024 14:26:56 UTC (9,814 KB)
