Abstract: We address the problem of generating realistic 3D human-object interactions (HOIs) driven by textual prompts. To this end, we adopt a modular design and decompose the complex task into simpler ...