Quantitative comparison on Open3DHOI[4] in the wild

LEXIS-Flow sets a new state of the art (SotA) on in-the-wild Open3DHOI[4].

Rows A-B: Generative vs learning-based; object shape initialized with SAM3D.
Rows C-G: Refinement vs optimization-based; 'Expert' init for all (CameraHMR & SAM3D).

| Row | Method | $\mathrm{CD}_{\mathrm{hum}}$ ↓ | $\mathrm{CD}_{\mathrm{obj}}$ ↓ | Collision ↓ | Contact ↑ |
|-----|--------|------|-------|-------|-------|
| A | HDM[7] (w. scale align.) | 13.50 | 49.38 | 0.089 | 0.141 |
| B | LEXIS-Flow (Ours) | 8.85 | 35.01 | 0.060 | 0.211 |
| C | InteractVLM[6] | 7.20 | 38.20 | 0.054 | 0.372 |
| D | CameraHMR[1] + SAM3D[2] | 7.20 | 37.30 | 0.051 | 0.182 |
| E | HOI-Gaussian[5] | 7.28 | 32.02 | 0.061 | 0.151 |
| F | InteractVLM++[6] | 7.20 | 30.11 | 0.047 | 0.394 |
| G | LEXIS-Flow* (Ours) | 7.05 | 22.96 | 0.041 | 0.451 |
Ours best on all metrics
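
As a quick sanity check of the claim above, the following minimal sketch transcribes the table rows and picks the best method per metric, following the arrows (lower is better for the first three columns, higher is better for Contact). The names `ROWS`, `METRICS`, and `best_per_metric` are illustrative only and do not come from the LEXIS-Flow code.

```python
# Rows transcribed from the table: (CD_hum, CD_obj, Collision, Contact).
ROWS = {
    "HDM (w. scale align.)": (13.50, 49.38, 0.089, 0.141),
    "LEXIS-Flow (Ours)": (8.85, 35.01, 0.060, 0.211),
    "InteractVLM": (7.20, 38.20, 0.054, 0.372),
    "CameraHMR + SAM3D": (7.20, 37.30, 0.051, 0.182),
    "HOI-Gaussian": (7.28, 32.02, 0.061, 0.151),
    "InteractVLM++": (7.20, 30.11, 0.047, 0.394),
    "LEXIS-Flow* (Ours)": (7.05, 22.96, 0.041, 0.451),
}

# (metric name, higher_is_better), matching the arrows in the table header.
METRICS = (
    ("CD_hum", False),
    ("CD_obj", False),
    ("Collision", False),
    ("Contact", True),
)


def best_per_metric(rows):
    """Return {metric: method} with the best-scoring method for each metric."""
    best = {}
    for i, (name, higher_is_better) in enumerate(METRICS):
        pick = max if higher_is_better else min
        best[name] = pick(rows, key=lambda method: rows[method][i])
    return best


if __name__ == "__main__":
    for metric, method in best_per_metric(ROWS).items():
        print(f"{metric}: {method}")
```

Passing only the first two rows to `best_per_metric` checks the HDM comparison in the same way.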
[1] CameraHMR, Patel et al., 3DV'25
[2] SAM3D, SAM3D team, arXiv'25
[4] Open3DHOI, Wen et al., CVPR'25
[5] HOI-Gaussian, Wen et al., CVPR'25
[6] InteractVLM, Dwivedi et al., CVPR'25
[7] HDM, Xie et al., CVPR'24