AFUN predictions across diverse scenes. Pick a scene below, then choose a language query to see the matching prediction. Points inside the predicted affordance mask are highlighted in red, and the predicted trajectory is colored from yellow (contact) to blue (end). Drag to orbit; scroll to zoom.
Without any robot-specific finetuning, AFUN predicts a precise functional mask and 3D motion that the robot uses to plan and execute manipulation in the real world. The same model generalizes across object categories, language instructions, and embodiments, suggesting a practical path toward open-world affordance models that unify functionality perception with executable action.
@article{wang_afun,
title = {{AFUN}: Towards an Affordance Foundation Model for
Functionality Understanding},
author = {Wang, Zhaoning and Zhong, Yi and Fu, Jiawei and
Christensen, Henrik I. and Gao, Jun},
note = {$^*$Equal contribution: Z.~Wang and Y.~Zhong.}
}