Controllable Mind Visual Diffusion Model
DOI:
https://doi.org/10.1609/aaai.v38i7.28519
Keywords:
CV: Computational Photography, Image & Video Synthesis
Abstract
Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models. Diffusion-based methods have recently shown promise in analyzing functional magnetic resonance imaging (fMRI) data, including the reconstruction of high-quality images consistent with the original visual stimuli. Nonetheless, effectively harnessing the semantic and silhouette information extracted from brain signals remains a critical challenge. In this paper, we propose a novel approach, termed the Controllable Mind Visual Diffusion Model (CMVDM). Specifically, CMVDM first extracts semantic and silhouette information from fMRI data using attribute alignment and assistant networks. A control model is then introduced in conjunction with a residual block to fully exploit the extracted information for image synthesis, generating high-quality images that closely resemble the original visual stimuli in both semantic content and silhouette characteristics. Through extensive experiments, we demonstrate that CMVDM outperforms existing state-of-the-art methods both qualitatively and quantitatively. Our code is available at https://github.com/zengbohan0217/CMVDM.
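To make the conditioning scheme described in the abstract concrete, below is a minimal, hypothetical PyTorch sketch of the two signal paths: an fMRI encoder supplying a semantic embedding to a denoiser, and a control branch injecting silhouette guidance through an additive residual (zero-initialized so the control signal starts inert). All module names, layer choices, and tensor shapes here are illustrative assumptions, not the authors' actual implementation; consult the linked repository for the real architecture.

```python
# Hypothetical sketch of CMVDM-style dual conditioning: semantics via
# feature modulation, silhouette via a zero-initialized control residual.
import torch
import torch.nn as nn

class FMRIEncoder(nn.Module):
    """Maps a flattened fMRI recording to a semantic embedding (assumed sizes)."""
    def __init__(self, n_voxels=4500, dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_voxels, 1024), nn.GELU(), nn.Linear(1024, dim)
        )

    def forward(self, fmri):
        return self.net(fmri)

class ControlBranch(nn.Module):
    """Encodes a silhouette map into a residual feature for the denoiser."""
    def __init__(self, channels=64):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.SiLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Zero-initialized projection: at the start of training the control
        # branch contributes nothing, preserving the pretrained denoiser.
        self.zero_proj = nn.Conv2d(channels, channels, 1)
        nn.init.zeros_(self.zero_proj.weight)
        nn.init.zeros_(self.zero_proj.bias)

    def forward(self, silhouette):
        return self.zero_proj(self.encode(silhouette))

class Denoiser(nn.Module):
    """Toy noise predictor: semantic conditioning by channel-wise modulation,
    silhouette conditioning by an additive residual feature."""
    def __init__(self, channels=64, sem_dim=512):
        super().__init__()
        self.in_conv = nn.Conv2d(4, channels, 3, padding=1)
        self.sem_proj = nn.Linear(sem_dim, channels)
        self.out_conv = nn.Conv2d(channels, 4, 3, padding=1)

    def forward(self, z_t, sem, ctrl):
        h = self.in_conv(z_t)
        h = h + self.sem_proj(sem)[:, :, None, None]  # semantic conditioning
        h = h + ctrl                                  # silhouette residual
        return self.out_conv(h)

# Usage: one denoising step on a random latent with random conditioning.
fmri = torch.randn(2, 4500)
z_t = torch.randn(2, 4, 32, 32)
sil = torch.rand(2, 1, 32, 32)
enc, ctrl_net, denoiser = FMRIEncoder(), ControlBranch(), Denoiser()
eps_pred = denoiser(z_t, enc(fmri), ctrl_net(sil))
print(eps_pred.shape)  # torch.Size([2, 4, 32, 32])
```

The zero-initialized projection mirrors a common pattern in control-style adapters for diffusion models: the silhouette branch is trained to perturb the denoiser's features only where doing so improves reconstruction, which keeps the pretrained generative prior intact early in training.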
Published
2024-03-24
How to Cite
Zeng, B., Li, S., Liu, X., Gao, S., Jiang, X., Tang, X., Hu, Y., Liu, J., & Zhang, B. (2024). Controllable Mind Visual Diffusion Model. Proceedings of the AAAI Conference on Artificial Intelligence, 38(7), 6935-6943. https://doi.org/10.1609/aaai.v38i7.28519
Section
AAAI Technical Track on Computer Vision VI