Negative Token Merging: Image-based Adversarial Feature Guidance

1University of Washington   2The Australian National University   3Allen Institute for AI

Overview. We introduce NegToMe, a training-free approach for performing adversarial guidance directly using images instead of text (negative prompt). Above, we show its application to two prominent use cases: 1) improving output diversity (e.g., visual, gender, racial) by guiding the features of each image away from the others (top); 2) reducing visual similarity to copyrighted characters by guiding output features away from copyrighted images (bottom). Further applications are shown below.

Interactive Gallery (Output Diversity)



Abstract

Text-based adversarial guidance using a negative prompt has emerged as a widely adopted approach for pushing output features away from undesired concepts. While useful, adversarial guidance using text alone can be insufficient to capture complex visual concepts and avoid undesired visual elements such as copyrighted characters. In this paper, we explore for the first time an alternate modality in this direction, performing adversarial guidance directly using visual features from a reference image or other images in a batch. In particular, we introduce negative token merging (NegToMe), a simple but effective training-free approach that performs adversarial guidance by selectively pushing apart matching semantic features (between the reference and the output generation) during the reverse diffusion process. When used w.r.t. other images in the same batch, NegToMe significantly increases output diversity (racial, gender, visual) without sacrificing output image quality. Similarly, when used w.r.t. a reference copyrighted asset, NegToMe helps reduce visual similarity to copyrighted content by 34.57%. NegToMe is simple to implement with just a few lines of code, incurs only marginally higher (<4%) inference time, and generalizes to diffusion architectures such as Flux that do not natively support a separate negative prompt.

TL;DR: We propose an alternative to traditional text-based (negative-prompt) adversarial guidance: directly using visual features from a reference image to guide the generation process.

How does NegToMe work?

Method Overview of NegToMe
Method Overview (above). (a) The core idea of NegToMe is to perform adversarial guidance directly using visual features from a reference image (or other images in the same batch). (b) NegToMe is simple and can be applied in any transformer block. (c) A simple three-step process is used to perform adversarial guidance with NegToMe.

Implementation (left). NegToMe can be incorporated in most diffusion models using just a few lines of code.
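As a rough illustration of the idea (not the authors' released code), the core token-merging step can be sketched as follows. The function name, token shapes, and the guidance scale `alpha` are hypothetical: each output token is matched to its most similar reference token by cosine similarity, then linearly extrapolated away from it.

```python
import numpy as np

# Hypothetical sketch of NegToMe's core step (not the authors' released code).
# Inside a transformer block, each output token is matched to its most similar
# reference token by cosine similarity, then pushed away from the match.

def neg_tome(x, x_ref, alpha=0.9):
    """x: (N, D) output image tokens; x_ref: (M, D) reference image tokens.

    `alpha` is a hypothetical guidance scale controlling how far matched
    output features are pushed away from the reference features.
    """
    # 1. cosine similarity between every output and reference token
    x_n = x / np.linalg.norm(x, axis=-1, keepdims=True)
    r_n = x_ref / np.linalg.norm(x_ref, axis=-1, keepdims=True)
    sim = x_n @ r_n.T                      # (N, M)
    # 2. best-matching reference token for each output token
    matched = x_ref[sim.argmax(axis=-1)]   # (N, D)
    # 3. push each output token away from its match
    return x + alpha * (x - matched)

rng = np.random.default_rng(0)
tokens = rng.standard_normal((16, 8))
ref_tokens = rng.standard_normal((4, 8))
guided = neg_tome(tokens, ref_tokens)
print(guided.shape)  # (16, 8)
```

In an actual diffusion pipeline this operation would be applied to the image tokens inside selected transformer blocks during the reverse diffusion process, with the reference tokens coming from the reference image (or the other images in the batch).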

Improving Output Diversity (Flux)

Improving Output Diversity (SDXL)

Copyright Mitigation

Copyright Mitigation. State-of-the-art diffusion models (SDXL, Flux) can generate copyrighted characters even if the input prompt does not explicitly mention the character name. Furthermore, copyright mitigation using a negative prompt (i.e., adding the character name to the negative prompt) is often insufficient. NegToMe reduces similarity to copyrighted characters more effectively by directly using visual features from a copyrighted retrieval database for adversarial guidance.
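The retrieval step can be sketched minimally as below. All names are hypothetical, and the random vectors stand in for precomputed image embeddings (e.g., from an image encoder such as CLIP): the nearest copyrighted asset is selected to serve as the NegToMe reference.

```python
import numpy as np

# Hypothetical sketch of the retrieval step: pick the database image whose
# (precomputed) embedding is most similar to the query embedding, then use
# that image as the NegToMe reference. Names and dimensions are illustrative.

def retrieve_reference(query_emb, db_embs):
    """Return the index of the most similar database entry (cosine similarity)."""
    db_n = db_embs / np.linalg.norm(db_embs, axis=-1, keepdims=True)
    q_n = query_emb / np.linalg.norm(query_emb)
    return int(np.argmax(db_n @ q_n))

rng = np.random.default_rng(0)
db = rng.standard_normal((5, 16))                # 5 stored embeddings
query = db[3] + 0.01 * rng.standard_normal(16)   # query close to entry 3
print(retrieve_reference(query, db))  # 3
```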

Improving Output Aesthetics and Details

Output Aesthetics. Simply using a blurry reference with NegToMe improves output aesthetics and details without any finetuning.
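As a rough illustration (assuming a grayscale image stored as a float array; the kernel size and a simple box blur are arbitrary stand-ins for any blur operation), the blurry reference could be produced as follows:

```python
import numpy as np

# Hypothetical sketch: produce a blurry copy of an image to use as the
# NegToMe reference, so generation is guided away from blurry features.

def box_blur(img, k=5):
    """img: (H, W) float array; returns the k x k box-blurred image."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))
blurry_ref = box_blur(image)
print(blurry_ref.shape)  # (32, 32)
```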

Cross-Domain Adversarial Guidance

NegToMe can also be used for adversarial guidance with cross-domain reference images (e.g., sketch-to-photo above).

Adversarial Guidance for Style

NegToMe can also be used for style guidance: excluding specific artistic elements while still maintaining the desired image content. For instance, in the above example the user wants a painting of a starry night without artistic elements of Van Gogh's style.

BibTeX

If you find our work useful in your research, please consider citing:
@article{singh2024negtome,
  title={Negative Token Merging: Image-based Adversarial Feature Guidance},
  author={Singh, Jaskirat and Li, Lindsey and Shi, Weijia and Krishna, Ranjay and Choi, Yejin and Wei, Pang and Cohen, Michael and Gould, Stephen and Zheng, Liang and Zettlemoyer, Luke},
  journal={arXiv preprint arXiv},
  year={2024}
}