Diff-Plugin: Revitalizing Details for Diffusion-based Low-level Tasks

CVPR 2024

¹City University of Hong Kong, ²Adobe Research

Real-world applications of Diff-Plugin, visualized across distinct single-type low-level vision tasks and one multi-type task. Diff-Plugin lets users select the low-level vision tasks they are interested in through natural-language instructions and generates high-fidelity results.

Abstract

Diffusion models have demonstrated impressive capabilities in image generation and have been effectively adapted for image restoration tasks. Despite this success, they often struggle to generate images with sufficient detail, particularly in complex scenarios. This limitation is more pronounced in image restoration, where the goal is to recover fine details from degraded inputs. To address this challenge, we propose Diff-Plugin, a method that requires no retraining of the diffusion model itself: it leverages the generation capability of pre-trained diffusion models for low-level tasks and adds a plugin module to enhance detail generation. The lightweight plugin is trained on pairs of low-quality and high-quality images to refine the outputs of the diffusion model. It is designed to revitalize the details that diffusion models may overlook, thereby improving the overall quality of restored images. Extensive experiments on various image restoration tasks, including denoising, deblurring, super-resolution, and deraining, demonstrate that our method improves detail recovery while retaining the advantageous properties of diffusion models.

Method Overview

Overview of our Diff-Plugin framework. We leverage pre-trained diffusion models and introduce a lightweight plugin network to enhance detail generation for low-level vision tasks.
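
The division of labor described above (a frozen pre-trained model plus a small plugin fitted on low-quality/high-quality pairs) can be illustrated with a deliberately tiny 1-D toy. This is a sketch under strong assumptions, not the Diff-Plugin architecture: `frozen_base` is a hypothetical stand-in (a fixed 3-tap blur) for the pre-trained model, the "plugin" is a single scalar gain on the detail the base smooths away, and every function name here is invented for illustration.

```python
import random

def frozen_base(signal):
    """Hypothetical stand-in for a frozen pre-trained model: a fixed 3-tap blur."""
    n = len(signal)
    return [(signal[max(i - 1, 0)] + signal[i] + signal[min(i + 1, n - 1)]) / 3.0
            for i in range(n)]

def detail_residual(lq):
    """High-frequency part of the input that the frozen base smooths away."""
    return [x - b for x, b in zip(lq, frozen_base(lq))]

def restore(lq, gain):
    """Base output plus the plugin's scaled detail correction."""
    return [b + gain * r for b, r in zip(frozen_base(lq), detail_residual(lq))]

def mse(a, b):
    """Mean squared error between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def fit_plugin(pairs):
    """Fit the single plugin parameter by least squares on (low-quality, high-quality) pairs.

    The base is never modified; only the scalar gain is estimated.
    """
    num = den = 0.0
    for lq, hq in pairs:
        base = frozen_base(lq)
        for b, r, h in zip(base, detail_residual(lq), hq):
            num += r * (h - b)
            den += r * r
    return num / den if den else 0.0

def make_pairs(count, length, noise=0.05, seed=0):
    """Synthetic data: random 'clean' signals and noisy 'degraded' copies."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(count):
        hq = [rng.random() for _ in range(length)]
        lq = [h + rng.gauss(0.0, noise) for h in hq]
        pairs.append((lq, hq))
    return pairs

pairs = make_pairs(count=32, length=64)
gain = fit_plugin(pairs)
base_err = sum(mse(frozen_base(lq), hq) for lq, hq in pairs) / len(pairs)
plugin_err = sum(mse(restore(lq, gain), hq) for lq, hq in pairs) / len(pairs)
print(f"gain={gain:.3f} base_mse={base_err:.5f} plugin_mse={plugin_err:.5f}")
```

Because the gain is the least-squares optimum on these pairs, the error with the plugin cannot exceed the error of the base alone on the same data; the toy only shows why a small trained add-on can restore detail without touching the base model.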

Results

Denoising

Comparison on image denoising. Our method recovers finer detail than existing approaches.

Face Restoration

Face restoration results, showing improved detail preservation and natural-looking outputs.

Interactive Demo

Our interactive Gradio demo allows users to test different low-level vision tasks with natural language instructions.
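
To make the idea of choosing tasks from a text instruction concrete, here is a minimal keyword-based router. It is purely illustrative and is not how the Diff-Plugin demo selects tasks: the keyword table, the `select_tasks` function, and the whole-word matching rule are all assumptions made for this sketch. The task names are the ones mentioned on this page.

```python
import re

# Hypothetical keyword table; task names follow the tasks listed on this page.
TASK_KEYWORDS = {
    "denoising": {"denoise", "noise", "noisy", "grain"},
    "deblurring": {"deblur", "blur", "blurry", "sharpen"},
    "super-resolution": {"super-resolution", "upscale", "resolution"},
    "deraining": {"derain", "rain", "rainy"},
    "face restoration": {"face", "faces"},
}

def select_tasks(instruction):
    """Return the tasks whose keywords appear as whole words in the instruction.

    Whole-word matching avoids false hits such as 'rain' inside 'grain'.
    """
    words = set(re.findall(r"[a-z-]+", instruction.lower()))
    return [task for task, keywords in TASK_KEYWORDS.items() if words & keywords]

print(select_tasks("Remove the rain and sharpen this blurry photo"))
```

An instruction that names several degradations yields several tasks, in the order of the table, which mirrors the multi-type use case in the teaser; an instruction with no known keyword yields an empty list.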

Demo Video

Demonstration video showing the capabilities of Diff-Plugin across various low-level vision tasks.

BibTeX

@inproceedings{liu2024diffplugin,
  title={Diff-Plugin: Revitalizing Details for Diffusion-based Low-level Tasks},
  author={Liu, Yuhao and Ke, Zhanghan and Liu, Fang and Zhao, Nanxuan and Lau, Rynson W.H.},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2024}
}