DiscoverDaily Paper Cast

SpotEdit: Selective Region Editing in Diffusion Transformers

Update: 2025-12-31

Description

🤗 Upvotes: 27 | cs.CV, cs.AI



Authors:

Zhibin Qin, Zhenxiong Tan, Zeqing Wang, Songhua Liu, Xinchao Wang



Title:

SpotEdit: Selective Region Editing in Diffusion Transformers



Arxiv:

http://arxiv.org/abs/2512.22323v1



Abstract:

Diffusion Transformer models have significantly advanced image editing by encoding conditional images and integrating them into transformer layers. However, most edits involve modifying only small regions, while current methods uniformly process and denoise all tokens at every timestep, causing redundant computation and potentially degrading unchanged areas. This raises a fundamental question: Is it truly necessary to regenerate every region during editing? To address this, we propose SpotEdit, a training-free diffusion editing framework that selectively updates only the modified regions. SpotEdit comprises two key components: SpotSelector identifies stable regions via perceptual similarity and skips their computation by reusing conditional image features; SpotFusion adaptively blends these features with edited tokens through a dynamic fusion mechanism, preserving contextual coherence and editing quality. By reducing unnecessary computation and maintaining high fidelity in unmodified areas, SpotEdit achieves efficient and precise image editing.
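The two components described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the cosine-similarity stand-in for the paper's perceptual similarity, and the thresholds `sim_thresh` and `fusion_alpha` are all illustrative assumptions.

```python
import numpy as np

def spot_edit_step(cond_feats, edit_tokens, denoise_fn,
                   sim_thresh=0.9, fusion_alpha=0.7):
    """Hypothetical single denoising step with selective token updates.

    cond_feats:  (N, D) token features of the conditional (source) image
    edit_tokens: (N, D) current tokens being denoised
    denoise_fn:  callable that denoises a batch of tokens
    """
    # SpotSelector (sketch): tokens that closely match the conditional
    # features are treated as stable regions; cosine similarity stands in
    # for the paper's perceptual similarity measure.
    cos = np.sum(cond_feats * edit_tokens, axis=-1) / (
        np.linalg.norm(cond_feats, axis=-1)
        * np.linalg.norm(edit_tokens, axis=-1) + 1e-8
    )
    stable = cos > sim_thresh  # (N,) boolean mask of unchanged regions

    # Denoise only the modified (unstable) tokens, skipping the rest.
    out = edit_tokens.copy()
    out[~stable] = denoise_fn(edit_tokens[~stable])

    # SpotFusion (sketch): blend reused conditional features with the
    # current tokens in stable regions to preserve contextual coherence.
    out[stable] = (fusion_alpha * cond_feats[stable]
                   + (1 - fusion_alpha) * edit_tokens[stable])
    return out, stable
```

In a real pipeline the similarity would be computed in a perceptual feature space and the fusion weight would vary dynamically across timesteps; the fixed scalars here are placeholders for that behavior.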

Jingwen Liang, Gengyu Wang