Electronic data

  • TVCG-2024-01-0036.R1_Proof_hi

    Accepted author manuscript, 4.53 MB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License

Links

Text available via DOI:


Unpaired 3D Shape-to-Shape Translation via Gradient-Guided Triplane Diffusion

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Journal publication date: 28/01/2025
Journal: IEEE Transactions on Visualization and Computer Graphics
Number of pages: 13
Publication status: E-pub ahead of print
Early online date: 28/01/2025
Original language: English

Abstract

Unpaired shape-to-shape translation is the task of transforming the geometry and semantics of an input shape into a new shape domain without paired training data. Previous methods rely on GAN-based architectures, using adversarial training to map the source shape encoding into the target domain within a low-dimensional latent feature space. However, these methods struggle to generate diverse, high-quality results, as they often suffer from issues such as mode collapse. This limits generation diversity and makes it difficult to find an accurate latent code that adequately represents the input shape. In this paper, we achieve unpaired shape-to-shape translation via a triplane diffusion model: we factorize 3D objects into triplane representations and run a diffusion process on these representations to perform shape domain transformation. We observe that adding an appropriate amount of noise to an input object during the forward diffusion process smooths out domain-specific shape structures while preserving the overall structure. We then progressively remove the noise in the reverse diffusion process using an unconditional diffusion model trained on the target shape domain. This yields a denoised output that retains the structural characteristics of the source input while aligning with the distribution of the target shape domain. To further this end, we propose two gradient-based guidance mechanisms that steer the denoising process toward more faithful results. We conduct extensive experiments on different shape domains, and the results demonstrate that our method produces higher-fidelity, higher-quality shapes than current state-of-the-art baselines.
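The translation procedure described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: `eps_model` (a noise predictor assumed to be trained on the target domain), `guidance_grad` (a stand-in for the paper's two gradient-based guidance mechanisms), and the schedule constants are all placeholders chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_schedule(T=50, beta_min=1e-4, beta_max=0.02):
    """Linear variance schedule; abar[t] is the cumulative product of (1 - beta)."""
    betas = np.linspace(beta_min, beta_max, T)
    return betas, np.cumprod(1.0 - betas)

def shape_translate(x_src, eps_model, guidance_grad, t0_frac=0.5, lam=0.05, T=50):
    """Sketch of diffusion-based domain translation on a triplane array.

    Forward: noise the source triplane up to an intermediate step t0, which
    smooths domain-specific detail while keeping the overall structure.
    Reverse: denoise with eps_model (assumed trained on the target domain),
    nudging the predicted noise with a caller-supplied gradient term so the
    output stays structurally close to the source.
    """
    betas, abar = make_schedule(T)
    t0 = int(t0_frac * T)  # how far into the forward process to go
    # forward diffusion q(x_t0 | x_0): scale the source and add Gaussian noise
    x = np.sqrt(abar[t0]) * x_src \
        + np.sqrt(1.0 - abar[t0]) * rng.standard_normal(x_src.shape)
    # reverse diffusion with gradient-based guidance
    for t in range(t0, 0, -1):
        eps = eps_model(x, t)  # noise predicted by the target-domain model
        # classifier-guidance-style correction of the predicted noise
        eps = eps - lam * np.sqrt(1.0 - abar[t]) * guidance_grad(x, x_src)
        mean = (x - betas[t] / np.sqrt(1.0 - abar[t]) * eps) / np.sqrt(1.0 - betas[t])
        noise = rng.standard_normal(x.shape) if t > 1 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x
```

Choosing `t0_frac` trades off the two goals named in the abstract: a larger value erases more of the source's domain-specific structure, while a smaller value preserves more of the input at the cost of a weaker domain shift.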