6-DoF object pose estimation from a single RGB image is a fundamental and long-standing problem in computer vision. Current leading approaches solve it by training deep networks either to regress both rotation and translation directly from the image, or to construct 2D-3D correspondences and solve them indirectly via PnP. We argue that rotation and translation should be treated differently because of their significant differences. In this work, we propose a novel 6-DoF pose estimation approach: the Coordinates-based Disentangled Pose Network (CDPN), which disentangles the pose, predicting rotation and translation separately, to achieve highly accurate and robust pose estimation. Our method is flexible, efficient, and highly accurate, and can deal with texture-less and occluded objects. Extensive experiments on the LINEMOD and Occlusion datasets demonstrate the superiority of our approach. Concretely, it significantly exceeds the state-of-the-art RGB-based methods on commonly used metrics.
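To make the disentanglement concrete, the sketch below illustrates one common way the translation component can be handled separately from rotation: predict the object's 2D center in the image together with its depth, then back-project through the camera intrinsics `K` to recover the 3D translation. This is a minimal illustration, not the paper's exact pipeline; the function name `estimate_translation` and the intrinsics values are assumptions for the example.

```python
import numpy as np

def estimate_translation(center_uv, tz, K):
    """Back-project a predicted 2D object center and depth tz into a
    3D translation t = (tx, ty, tz) using camera intrinsics K."""
    u, v = center_uv
    fx, fy = K[0, 0], K[1, 1]          # focal lengths
    px, py = K[0, 2], K[1, 2]          # principal point
    tx = (u - px) * tz / fx
    ty = (v - py) * tz / fy
    return np.array([tx, ty, tz])

# Illustrative pinhole intrinsics for a 640x480 camera.
K = np.array([[572.4,   0.0, 325.3],
              [  0.0, 573.6, 242.0],
              [  0.0,   0.0,   1.0]])

# Predicted image-space center (400, 200) at depth 0.8 m.
t = estimate_translation((400.0, 200.0), 0.8, K)

# Sanity check: projecting t through K recovers the 2D center.
proj = K @ t
proj = proj[:2] / proj[2]
```

Because translation is recovered from an image-space quantity plus a scalar depth, it is decoupled from the rotation estimate, which can instead be solved from dense 2D-3D correspondences.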