Generative models have made significant progress in 2D image editing, delivering exceptional precision and realism. However, because they manipulate pixels directly, they often struggle to maintain consistency and preserve object identity. To address this limitation, we introduce a novel "2D-3D-2D" framework. Our approach first lifts 2D objects into a 3D representation, enabling edits within a physically plausible, rigidity-constrained 3D environment. The edited 3D objects are then reprojected and seamlessly inpainted back into the original 2D image. Unlike existing 2D editing methods such as DragGAN and DragDiffusion, which operate on pixels, our method manipulates objects directly in 3D space. Extensive experiments show that our framework surpasses prior methods in overall performance, delivering highly consistent edits while robustly preserving object identity.
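To make the "2D-3D-2D" loop concrete, the NumPy sketch below illustrates the geometry of one such round trip under a pinhole-camera assumption: masked pixels are lifted to a 3D point cloud, edited with a single rigid transform (the rigidity constraint), and reprojected to a new 2D mask. All function names and parameters here are hypothetical, not taken from the paper, and the learned components (the lifting backbone, the 3D representation, and the generative inpainting model) are omitted.

```python
# Minimal sketch of the 2D-3D-2D editing loop (hypothetical helper names;
# not the paper's implementation). Requires only NumPy.
import numpy as np

def lift_to_point_cloud(depth, mask, fx, fy, cx, cy):
    """Back-project masked pixels to 3D camera coordinates (pinhole model)."""
    v, u = np.nonzero(mask)                  # pixel rows/cols inside the object mask
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)       # (N, 3) points

def rigid_edit(points, R, t):
    """Rigidity-constrained edit: one rotation R and translation t for all points."""
    return points @ R.T + t

def reproject(points, fx, fy, cx, cy, shape):
    """Project edited 3D points back to a 2D mask (nearest-pixel splatting)."""
    mask = np.zeros(shape, dtype=bool)
    u = np.round(points[:, 0] * fx / points[:, 2] + cx).astype(int)
    v = np.round(points[:, 1] * fy / points[:, 2] + cy).astype(int)
    ok = (0 <= u) & (u < shape[1]) & (0 <= v) & (v < shape[0])
    mask[v[ok], u[ok]] = True
    return mask

# Toy usage: a flat object patch at depth 1, rotated 10 degrees about the y-axis
# and nudged sideways, then reprojected into the image plane.
H, W = 64, 64
depth = np.ones((H, W))
mask = np.zeros((H, W), dtype=bool)
mask[24:40, 24:40] = True
fx = fy = 60.0
cx, cy = W / 2, H / 2

pts = lift_to_point_cloud(depth, mask, fx, fy, cx, cy)
theta = np.deg2rad(10)
R = np.array([[np.cos(theta), 0, np.sin(theta)],
              [0, 1, 0],
              [-np.sin(theta), 0, np.cos(theta)]])
new_mask = reproject(rigid_edit(pts, R, t=np.array([0.1, 0, 0])), fx, fy, cx, cy, (H, W))
```

In the full framework, the region vacated by the moved object (`mask & ~new_mask` in this sketch) would be filled by a generative inpainting model, and the object's appearance would be rendered from the edited 3D representation rather than splatted as bare points.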
@article{xie2025spaceedit,
  title={2D Instance Editing in 3D Space},
  author={Xie, Yuhuan and Pan, Aoxuan and Lin, Ming-Xian and Huang, Wei and Huang, Yi-Hua and Qi, Xiaojuan},
  journal={arXiv preprint arXiv:2507.05819},
  year={2025}
}