publications | Bingliang Zhang

2025

CVPR (Oral)
Improving diffusion inverse problem solving with decoupled noise annealing

Bingliang Zhang, Wenda Chu, Julius Berner, Chenlin Meng, Anima Anandkumar, and Yang Song

In Proceedings of the Computer Vision and Pattern Recognition Conference, 2025

Diffusion models have recently achieved success in solving Bayesian inverse problems with learned data priors. Current methods build on top of the diffusion sampling process, where each denoising step makes small modifications to samples from the previous step. However, this process struggles to correct errors from earlier sampling steps, leading to worse performance in complicated nonlinear inverse problems, such as phase retrieval. To address this challenge, we propose a new method called Decoupled Annealing Posterior Sampling (DAPS) that relies on a novel noise annealing process. Specifically, we decouple consecutive steps in a diffusion sampling trajectory, allowing them to vary considerably from one another while ensuring their time-marginals anneal to the true posterior as we reduce noise levels. This approach enables the exploration of a larger solution space, improving the success rate for accurate reconstructions. We demonstrate that DAPS significantly improves sample quality and stability across multiple image restoration tasks, particularly in complicated nonlinear inverse problems.
@inproceedings{zhang2025improving,
  title = {Improving diffusion inverse problem solving with decoupled noise annealing},
  author = {Zhang, Bingliang and Chu, Wenda and Berner, Julius and Meng, Chenlin and Anandkumar, Anima and Song, Yang},
  booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages = {20895--20905},
  year = {2025},
  tag1 = {CVPR (Oral)},
}

ICLR (Spotlight)
InverseBench: Benchmarking Plug-and-Play Diffusion Models for Scientific Inverse Problems

Hongkai Zheng, Wenda Chu, Bingliang Zhang, Zihui Wu, Austin Wang, Berthy Feng, Caifeng Zou, Yu Sun, and 3 more authors

In The Thirteenth International Conference on Learning Representations, 2025

Plug-and-play diffusion models have emerged as a promising research direction for solving inverse problems. However, current studies primarily focus on natural image restoration, leaving the performance of these algorithms in scientific inverse problems largely unexplored. To address this gap, we introduce InverseBench, a framework that evaluates diffusion models across five distinct scientific inverse problems. These problems present unique structural challenges that differ from existing benchmarks, arising from critical scientific applications such as optical tomography, medical imaging, black hole imaging, seismology, and fluid dynamics. With InverseBench, we benchmark 14 inverse problem algorithms that use plug-and-play diffusion models against strong, domain-specific baselines, offering valuable new insights into the strengths and weaknesses of existing algorithms.
@inproceedings{zhenginversebench,
  title = {InverseBench: Benchmarking Plug-and-Play Diffusion Models for Scientific Inverse Problems},
  author = {Zheng, Hongkai and Chu, Wenda and Zhang, Bingliang and Wu, Zihui and Wang, Austin and Feng, Berthy and Zou, Caifeng and Sun, Yu and Kovachki, Nikola Borislavov and Ross, Zachary E and others},
  booktitle = {The Thirteenth International Conference on Learning Representations},
  year = {2025},
  tag1 = {ICLR (Spotlight)},
}

2024

NeurIPS
Principled probabilistic imaging using diffusion models as plug-and-play priors

Zihui Wu, Yu Sun, Yifan Chen, Bingliang Zhang, Yisong Yue, and Katherine Bouman

Advances in Neural Information Processing Systems, 2024

Diffusion models (DMs) have recently shown outstanding capabilities in modeling complex image distributions, making them expressive image priors for solving Bayesian inverse problems. However, most existing DM-based methods rely on approximations in the generative process to be generic to different inverse problems, leading to inaccurate sample distributions that deviate from the target posterior defined within the Bayesian framework. To harness the generative power of DMs while avoiding such approximations, we propose a Markov chain Monte Carlo algorithm that performs posterior sampling for general inverse problems by reducing it to sampling the posterior of a Gaussian denoising problem. Crucially, we leverage a general DM formulation as a unified interface that allows for rigorously solving the denoising problem with a range of state-of-the-art DMs. We demonstrate the effectiveness of the proposed method on six inverse problems (three linear and three nonlinear), including a real-world black hole imaging problem. Experimental results indicate that our proposed method offers more accurate reconstructions and posterior estimation compared to existing DM-based imaging inverse methods.
@article{wu2024principled,
  title = {Principled probabilistic imaging using diffusion models as plug-and-play priors},
  author = {Wu, Zihui and Sun, Yu and Chen, Yifan and Zhang, Bingliang and Yue, Yisong and Bouman, Katherine},
  journal = {Advances in Neural Information Processing Systems},
  volume = {37},
  pages = {118389--118427},
  year = {2024},
  tag1 = {NeurIPS},
}

2023

CVPR
Multi-concept customization of text-to-image diffusion

Nupur Kumari, Bingliang Zhang, Richard Zhang, Eli Shechtman, and Jun-Yan Zhu

In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

While generative models produce high-quality images of concepts learned from a large-scale database, a user often wishes to synthesize instantiations of their own concepts (for example, their family, pets, or items). Can we teach a model to quickly acquire a new concept, given a few examples? Furthermore, can we compose multiple new concepts together? We propose Custom Diffusion, an efficient method for augmenting existing text-to-image models. We find that only optimizing a few parameters in the text-to-image conditioning mechanism is sufficiently powerful to represent new concepts while enabling fast tuning (~6 minutes). Additionally, we can jointly train for multiple concepts or combine multiple fine-tuned models into one via closed-form constrained optimization. Our fine-tuned model generates variations of multiple new concepts and seamlessly composes them with existing concepts in novel settings. Our method outperforms or performs on par with several baselines and concurrent works in both qualitative and quantitative evaluations while being memory and computationally efficient.
@inproceedings{kumari2023multi,
  title = {Multi-concept customization of text-to-image diffusion},
  author = {Kumari, Nupur and Zhang, Bingliang and Zhang, Richard and Shechtman, Eli and Zhu, Jun-Yan},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages = {1931--1941},
  year = {2023},
  tag1 = {CVPR},
}

ICCV
Ablating concepts in text-to-image diffusion models

Nupur Kumari, Bingliang Zhang, Sheng-Yu Wang, Eli Shechtman, Richard Zhang, and Jun-Yan Zhu

In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Large-scale text-to-image diffusion models can generate high-fidelity images with powerful compositional ability. However, these models are typically trained on an enormous amount of Internet data, often containing copyrighted material, licensed images, and personal photos. Furthermore, they have been found to replicate the style of various living artists or memorize exact training samples. How can we remove such copyrighted concepts or images without retraining the model from scratch? To achieve this goal, we propose an efficient method of ablating concepts in the pretrained model, i.e., preventing the generation of a target concept. Our algorithm learns to match the image distribution for a target style, instance, or text prompt we wish to ablate to the distribution corresponding to an anchor concept. This prevents the model from generating target concepts given its text condition. Extensive experiments show that our method can successfully prevent the generation of the ablated concept while preserving closely related concepts in the model.
@inproceedings{kumari2023ablating,
  title = {Ablating concepts in text-to-image diffusion models},
  author = {Kumari, Nupur and Zhang, Bingliang and Wang, Sheng-Yu and Shechtman, Eli and Zhang, Richard and Zhu, Jun-Yan},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages = {22691--22702},
  year = {2023},
  tag1 = {ICCV},
}

2021

ICLR
Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization

Zihan Zhou, Wei Fu, Bingliang Zhang, and Yi Wu

In International Conference on Learning Representations, 2021

We present Reward-Switching Policy Optimization (RSPO), a paradigm to discover diverse strategies in complex RL environments by iteratively finding novel policies that are both locally optimal and sufficiently different from existing ones. To encourage the learning policy to consistently converge towards a previously undiscovered local optimum, RSPO switches between extrinsic and intrinsic rewards via a trajectory-based novelty measurement during the optimization process. When a sampled trajectory is sufficiently distinct, RSPO performs standard policy optimization with extrinsic rewards. For trajectories with high likelihood under existing policies, RSPO utilizes an intrinsic diversity reward to promote exploration. Experiments show that RSPO is able to discover a wide spectrum of strategies in a variety of domains, ranging from single-agent particle-world tasks and MuJoCo continuous control to multi-agent stag-hunt games and StarCraft II challenges.
@inproceedings{zhou2021continuously,
  title = {Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization},
  author = {Zhou, Zihan and Fu, Wei and Zhang, Bingliang and Wu, Yi},
  booktitle = {International Conference on Learning Representations},
  year = {2021},
  tag1 = {ICLR},
}