
Research Article
CAD-guided 6D pose estimation with deep learning in digital twin for industrial collaborative robot manipulation
Quang Huan Dong
The Thinh Pham
Khanh Nguyen
Chi-Cuong Tran
Hoang Huy Tran
Duy Tan Do
Khang Hoang Vinh Nguyen
Quang-Chien Nguyen
Year: 2025
AIRO
EAI
DOI: 10.4108/airo.9676
Abstract
6D pose estimation for the bin-picking task has attracted increasing attention from researchers. CAD model-based methods have been proposed and have demonstrated their effectiveness. However, most existing research relies on point cloud registration from RGB-D cameras, which is often not robust to noise and low-light conditions; the resulting degradation in point cloud quality significantly reduces accuracy. Moreover, correct object detection plays a vital role in scenes with multiple objects. Supervised deep learning has been applied to this task, but it typically requires a large amount of labeled data, and in industrial environments sample collection and model retraining are limited. To address these challenges, we introduce an approach that integrates the zero-shot YOLOE detector with the DEFOM-Stereo model. YOLOE detects and localizes objects without requiring object-specific training, while DEFOM-Stereo generates point clouds for CAD model-based pose estimation. Extensive experiments demonstrate that the proposed approach achieves high pose estimation accuracy, which is essential for grasp planning and manipulation tasks in robotics. Furthermore, the proposed approach is applied in a Unity3D-based digital twin, enabling an enhanced virtual representation of a physical pickup target with its estimated pose. The results therefore support more accurate and responsive digital twins for robotics and the development of smart manufacturing systems.
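
To make the detect-then-register idea in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation: it assumes an object mask and a depth map have already been produced (in the paper these would come from YOLOE and DEFOM-Stereo), back-projects the masked pixels into a point cloud, and registers a CAD model to it with Open3D ICP to obtain a 6D pose. All function names and parameter values here are illustrative assumptions.

# Illustrative sketch only (not the authors' implementation): a generic
# detect -> depth -> point cloud -> CAD registration pipeline. The detection
# and stereo steps are represented only by their assumed outputs (an object
# mask and a depth map); pose estimation uses Open3D point-to-plane ICP.
import numpy as np
import open3d as o3d


def mask_to_cloud(depth, mask, K):
    """Back-project the masked depth pixels into a camera-frame point cloud."""
    v, u = np.nonzero(mask)
    z = depth[v, u]
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    cloud = o3d.geometry.PointCloud()
    cloud.points = o3d.utility.Vector3dVector(np.stack([x, y, z], axis=1))
    return cloud


def estimate_pose(scene_cloud, cad_mesh_path, voxel=0.003, init=np.eye(4)):
    """Register a CAD model to the observed object cloud; returns a 4x4 pose."""
    cad = o3d.io.read_triangle_mesh(cad_mesh_path).sample_points_poisson_disk(5000)
    cad = cad.voxel_down_sample(voxel)
    scene = scene_cloud.voxel_down_sample(voxel)
    cad.estimate_normals()
    scene.estimate_normals()
    result = o3d.pipelines.registration.registration_icp(
        cad, scene, 5 * voxel, init,
        o3d.pipelines.registration.TransformationEstimationPointToPlane())
    return result.transformation  # homogeneous CAD-to-camera transform (R | t)

In a pipeline like the one described in the abstract, the resulting 4x4 transform would then be streamed to the Unity3D-based digital twin to update the virtual representation of the pickup target for grasp planning.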
Copyright © 2025 Quang Huan Dong, et al., licensed to EAI. This is an open access article distributed under the terms of the CC BY-NC-SA 4.0, which permits copying, redistributing, remixing, transformation, and building upon the material in any medium so long as the original work is properly cited.


