Iterative Predictor-Critic Code Decoding for Real-World Image Dehazing

Jiayi Fu1, Siyu Liu1, Zikun Liu3, Chun-Le Guo1,2, Hyunhee Park4, Ruiqi Wu1, Guoqing Wang5, Chongyi Li*,1,2
1VPIC, CS, Nankai University, 2NKIARI, Shenzhen Futian, 3Samsung R&D Institute China-Beijing, 4CIG, Samsung Electronics, 5Donghai Laboratory, Zhoushan, Zhejiang
*Corresponding author

CVPR 2025

Abstract

We propose a novel Iterative Predictor-Critic Code Decoding framework for real-world image dehazing, abbreviated as IPC-Dehaze, which leverages the high-quality codebook prior encapsulated in a pre-trained VQGAN. Unlike previous codebook-based methods that rely on one-shot decoding, our method uses the high-quality codes obtained in the previous iteration to guide the Code-Predictor in the subsequent iteration, improving code prediction accuracy and ensuring stable dehazing performance. Our idea stems from two observations: 1) the degradation of hazy images varies with haze density and scene depth, and 2) clear regions provide crucial cues for restoring dense haze regions. However, progressively refining the obtained codes is nontrivial, because it is difficult to determine which codes should be retained and which should be replaced at each iteration. Another key contribution of our study is the Code-Critic, which captures interrelations among codes. The Code-Critic evaluates code correlations and assigns each code a mask score, where a higher score indicates that the code is more likely to be rejected. The codes with the highest mask scores are then resampled, which helps retain the more accurate codes and re-predict the difficult ones. Extensive experiments demonstrate the superiority of our method over state-of-the-art methods in real-world dehazing.
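
The selection rule described in the abstract (reject the codes with the highest mask scores, keep the rest) can be illustrated with a minimal sketch. The function name `rejection_mask` and the parameter `num_reject` are hypothetical and not taken from the paper, and the fixed top-k rule is an assumption about how "highest mask scores" is turned into a binary mask.

```python
from typing import List, Sequence


def rejection_mask(scores: Sequence[float], num_reject: int) -> List[int]:
    """Mark the `num_reject` codes with the highest critic scores as rejected.

    Returns a 0/1 list: 1 = rejected (to be resampled), 0 = retained.
    Hypothetical helper; the top-k rule is an assumption for illustration.
    """
    if not 0 <= num_reject <= len(scores):
        raise ValueError("num_reject must be between 0 and len(scores)")
    # Indices ordered by descending score; ties keep their original order.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    rejected = set(order[:num_reject])
    return [1 if i in rejected else 0 for i in range(len(scores))]


if __name__ == "__main__":
    # Four codes with illustrative critic scores; reject the two least trusted.
    print(rejection_mask([0.9, 0.1, 0.5, 0.7], num_reject=2))
```

In this toy example the codes at positions 0 and 3 carry the highest scores, so they are the ones marked for resampling.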

Method Overview


Overview of our IPC-Dehaze. In the training phase, we use fused tokens $Z_t = Z_l \odot M + Z_c \odot (1 - M)$ from the hazy and clean images, and predict the sequence codes $S$ by Code-Predictor. We also train Code-Critic to evaluate each code in set $S$ for potential rejection and resampling. In the inference phase, $Z_{t=0}$ is initially encoded as low-quality tokens $Z_l$. During the $t$-th iterative decoding step, the Code-Predictor takes $Z_t$ as input, predicting the sequence codes $S$ and the corresponding high-quality tokens $Z_c$. To retain the reliable codes and resample the others, the Code-Critic evaluates $S$ and produces a mask map $M$ by $p_\phi$. This mask map $M$ is then used to generate $Z_{t+1}$ through a Fusion process. Following $T$ iterations, $Z_T$ is output to reconstruct the clean image by a decoder. The SFT refers to the Spatial Feature Transform, which adjusts the feature within the encoder and decoder.
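
The inference loop in the caption can be sketched roughly as follows. This is a toy illustration, not the authors' implementation: tokens are plain floats in a 1-D list, `predictor` and `critic` are caller-supplied stand-ins for the Code-Predictor and Code-Critic, and all function names are hypothetical. Two further assumptions are made: the inference-time Fusion reuses the training formula (low-quality tokens where the mask is 1, predicted high-quality tokens where it is 0), and the number of rejected codes shrinks linearly to zero by the last iteration. The caption does not specify the schedule.

```python
from typing import Callable, List, Sequence, Tuple

Tokens = List[float]  # one scalar per token position, for illustration only


def fuse(z_l: Sequence[float], z_c: Sequence[float], mask: Sequence[int]) -> Tokens:
    """Element-wise fusion: Z = Z_l * M + Z_c * (1 - M)."""
    return [l * m + c * (1 - m) for l, c, m in zip(z_l, z_c, mask)]


def iterative_decode(
    z_l: Sequence[float],
    predictor: Callable[[Tokens], Tuple[List[int], Tokens]],
    critic: Callable[[List[int]], List[float]],
    num_iters: int,
) -> Tokens:
    """Refine tokens for `num_iters` steps (sketch under the stated assumptions)."""
    n = len(z_l)
    z_t = list(z_l)  # Z_{t=0} starts as the low-quality tokens Z_l
    for t in range(num_iters):
        codes, z_c = predictor(z_t)  # sequence codes S and high-quality tokens Z_c
        scores = critic(codes)       # higher score = more likely to be rejected
        # Assumed linear schedule: fewer rejections each step, none on the last.
        k = n * (num_iters - 1 - t) // num_iters
        order = sorted(range(n), key=lambda i: -scores[i])
        rejected = set(order[:k])
        mask = [1 if i in rejected else 0 for i in range(n)]
        z_t = fuse(z_l, z_c, mask)   # Z_{t+1}
    return z_t                       # Z_T, which a decoder would turn into an image


if __name__ == "__main__":
    # Toy stand-ins: the predictor always proposes token value 10.0,
    # and the critic returns fixed scores.
    toy_predictor = lambda z: ([0] * len(z), [10.0] * len(z))
    toy_critic = lambda codes: [0.9, 0.1, 0.5, 0.7]
    print(iterative_decode([1.0, 2.0, 3.0, 4.0], toy_predictor, toy_critic, 2))
```

With two iterations, the first step keeps the low-quality tokens at the two positions the toy critic trusts least, and the second step accepts every predicted token.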

Visual Comparison


Quantitative Comparison


BibTeX

@Article{fu2025iterative,
title={Iterative Predictor-Critic Code Decoding for Real-World Image Dehazing},
author={Fu, Jiayi and Liu, Siyu and Liu, Zikun and Guo, Chun-Le and Park, Hyunhee and Wu, Ruiqi and Wang, Guoqing and Li, Chongyi},
journal={arXiv preprint arXiv:2503.13147},
year={2025}}

Acknowledgements

We thank Ziheng Zhang and Xin Jin for the visual comparison component.