CoIE: Chain-of-Instruct Editing for Multi-Attribute Face Manipulation

Zhang, Zhenduo; Zhang, Bo-Wen; Liu, Guang


arXiv:2312.07879 (cs)

[Submitted on 13 Dec 2023 (v1), last revised 20 Dec 2023 (this version, v2)]


Abstract: Current text-to-image editing models often struggle to manipulate multiple attributes smoothly with a single instruction. Inspired by the Chain-of-Thought prompting technique used in language models, we present Chain-of-Instruct Editing (CoIE), which enhances the capabilities of these models through step-by-step editing with a series of instructions. For face manipulation in particular, we leverage the contextual learning abilities of a pretrained Large Language Model (LLM), such as GPT-4, to generate a sequence of instructions from the original input using a purpose-designed 1-shot template. To further improve the precision of each editing step, we fine-tune the editing models on our self-constructed instruction-guided face editing dataset, Instruct-CelebA. Additionally, we incorporate a super-resolution module to mitigate degradation in editability and quality. Experimental results across various challenging cases confirm that chain-of-instruct editing significantly improves multi-attribute facial image manipulation. This is evident in higher editing success rates, measured by the CLIPSim and Coverage metrics, which improve by 17.86% and 85.45% respectively, and in greater controllability, indicated by the Preserve L1 and Quality metrics, which improve by 11.58% and 4.93% respectively.
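The step-by-step procedure described in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the paper: the wording of `ONE_SHOT_TEMPLATE`, the helper names, and the `llm`, `edit_step`, and `super_resolve` callables, which stand in for an LLM such as GPT-4, a fine-tuned editing model, and a super-resolution module. The abstract does not say where super-resolution is applied; the sketch applies it once at the end purely as an assumption.

```python
from typing import Any, Callable, List, Optional

# Hypothetical 1-shot template: one worked example, then the new request.
ONE_SHOT_TEMPLATE = (
    "Split the editing request into one instruction per attribute.\n"
    "Request: make her smile and give her blond hair\n"
    "Instructions:\n"
    "1. make her smile\n"
    "2. give her blond hair\n"
    "Request: {request}\n"
    "Instructions:\n"
)


def build_prompt(request: str) -> str:
    """Fill the 1-shot template with the user's multi-attribute request."""
    return ONE_SHOT_TEMPLATE.format(request=request)


def parse_instructions(llm_output: str) -> List[str]:
    """Turn a numbered list returned by the LLM into plain instructions."""
    steps = []
    for line in llm_output.splitlines():
        line = line.strip()
        if not line:
            continue
        head, sep, tail = line.partition(".")
        if sep and head.strip().isdigit():
            line = tail.strip()  # drop the leading "1." style numbering
        if line:
            steps.append(line)
    return steps


def chain_of_instruct_edit(
    image: Any,
    request: str,
    llm: Callable[[str], str],
    edit_step: Callable[[Any, str], Any],
    super_resolve: Optional[Callable[[Any], Any]] = None,
) -> Any:
    """Apply single-attribute edits one after another (assumed pipeline)."""
    instructions = parse_instructions(llm(build_prompt(request)))
    for instruction in instructions:
        image = edit_step(image, instruction)  # each step edits the previous result
    if super_resolve is not None:
        image = super_resolve(image)  # assumption: applied once, after the chain
    return image
```

With stub callables, for example an `llm` that returns a fixed numbered list and an `edit_step` that records each instruction, the function can be exercised without any model. This only shows the control flow of chained editing, not the quality of any real edit.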

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2312.07879 [cs.CV]
	(or arXiv:2312.07879v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2312.07879

Submission history

From: Zhenduo Zhang
[v1] Wed, 13 Dec 2023 03:48:45 UTC (13,334 KB)
[v2] Wed, 20 Dec 2023 08:53:40 UTC (13,334 KB)
