6d-diff: A keypoint diffusion framework for 6d object pose estimation

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

6d-diff: A keypoint diffusion framework for 6d object pose estimation. / Xu, Li; Qu, Haoxuan; Cai, Yujun et al.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024. 2024.

Harvard

Xu, L, Qu, H, Cai, Y & Liu, J 2024, 6d-diff: A keypoint diffusion framework for 6d object pose estimation. in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024. https://doi.org/10.1109/CVPR52733.2024.00924

APA

Xu, L., Qu, H., Cai, Y., & Liu, J. (2024). 6d-diff: A keypoint diffusion framework for 6d object pose estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 https://doi.org/10.1109/CVPR52733.2024.00924

Vancouver

Xu L, Qu H, Cai Y, Liu J. 6d-diff: A keypoint diffusion framework for 6d object pose estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024. 2024 Epub 2024 Jun 16. doi: 10.1109/CVPR52733.2024.00924

Author

Xu, Li ; Qu, Haoxuan ; Cai, Yujun et al. / 6d-diff : A keypoint diffusion framework for 6d object pose estimation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024. 2024.

Bibtex

@inproceedings{606432586c5b46ae90561594920d40cb,
title = "6d-diff: A keypoint diffusion framework for 6d object pose estimation",
abstract = "Estimating the 6D object pose from a single RGB image often involves noise and indeterminacy due to challenges such as occlusions and cluttered backgrounds. Meanwhile, diffusion models have shown appealing performance in generating high-quality images from random noise with high indeterminacy through step-by-step denoising. Inspired by their denoising capability, we propose a novel diffusion-based framework (6D-Diff) to handle the noise and indeterminacy in object pose estimation for better performance. In our framework, to establish accurate 2D-3D correspondence, we formulate 2D keypoints detection as a reverse diffusion (denoising) process. To facilitate such a denoising process, we design a Mixture-of-Cauchy-based forward diffusion process and condition the reverse process on the object appearance features. Extensive experiments on the LM-O and YCB-V datasets demonstrate the effectiveness of our framework.",
author = "Li Xu and Haoxuan Qu and Yujun Cai and Jun Liu",
year = "2024",
month = sep,
day = "16",
doi = "10.1109/CVPR52733.2024.00924",
language = "English",
booktitle = "Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024",
}

RIS

TY - GEN

T1 - 6d-diff

T2 - A keypoint diffusion framework for 6d object pose estimation

AU - Xu, Li

AU - Qu, Haoxuan

AU - Cai, Yujun

AU - Liu, Jun

PY - 2024/9/16

Y1 - 2024/9/16

N2 - Estimating the 6D object pose from a single RGB image often involves noise and indeterminacy due to challenges such as occlusions and cluttered backgrounds. Meanwhile, diffusion models have shown appealing performance in generating high-quality images from random noise with high indeterminacy through step-by-step denoising. Inspired by their denoising capability, we propose a novel diffusion-based framework (6D-Diff) to handle the noise and indeterminacy in object pose estimation for better performance. In our framework, to establish accurate 2D-3D correspondence, we formulate 2D keypoints detection as a reverse diffusion (denoising) process. To facilitate such a denoising process, we design a Mixture-of-Cauchy-based forward diffusion process and condition the reverse process on the object appearance features. Extensive experiments on the LM-O and YCB-V datasets demonstrate the effectiveness of our framework.

AB - Estimating the 6D object pose from a single RGB image often involves noise and indeterminacy due to challenges such as occlusions and cluttered backgrounds. Meanwhile, diffusion models have shown appealing performance in generating high-quality images from random noise with high indeterminacy through step-by-step denoising. Inspired by their denoising capability, we propose a novel diffusion-based framework (6D-Diff) to handle the noise and indeterminacy in object pose estimation for better performance. In our framework, to establish accurate 2D-3D correspondence, we formulate 2D keypoints detection as a reverse diffusion (denoising) process. To facilitate such a denoising process, we design a Mixture-of-Cauchy-based forward diffusion process and condition the reverse process on the object appearance features. Extensive experiments on the LM-O and YCB-V datasets demonstrate the effectiveness of our framework.

U2 - 10.1109/CVPR52733.2024.00924

DO - 10.1109/CVPR52733.2024.00924

M3 - Conference contribution/Paper

BT - Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024

ER -