PhoneBuddy: Agentic Phone Use

Upload a phone screenshot and an instruction. The model predicts the next action (click, swipe, type, etc.) and visualizes it on the screenshot.
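
To make the predict-then-visualize step concrete, here is a minimal sketch of how a predicted action might be turned into an overlay on the screenshot. Everything in it is an assumption for illustration: the `Action` structure, the normalized `[0, 1]` coordinate convention, and the `to_pixels` and `describe_overlay` helpers are hypothetical and are not taken from the PhoneBuddy code or paper.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Action:
    """Hypothetical container for one predicted action (not the model's real output format)."""
    kind: str                      # "click", "swipe", or "type"
    start: Optional[Point] = None  # assumed normalized (x, y) in [0, 1]
    end: Optional[Point] = None    # only used for swipes
    text: Optional[str] = None     # only used for typing


def to_pixels(point: Point, width: int, height: int) -> Tuple[int, int]:
    """Scale a normalized point to pixel coordinates of the screenshot."""
    x, y = point
    return (round(x * width), round(y * height))


def describe_overlay(action: Action, width: int, height: int) -> dict:
    """Return a simple drawing primitive that a UI could render on the screenshot."""
    if action.kind == "click":
        return {"shape": "circle", "center": to_pixels(action.start, width, height)}
    if action.kind == "swipe":
        return {
            "shape": "arrow",
            "from": to_pixels(action.start, width, height),
            "to": to_pixels(action.end, width, height),
        }
    if action.kind == "type":
        return {"shape": "label", "text": action.text}
    raise ValueError(f"unsupported action kind: {action.kind}")


# Example: a click in the upper middle of a 1080x2400 screenshot.
click = Action(kind="click", start=(0.5, 0.25))
overlay = describe_overlay(click, width=1080, height=2400)
```

The returned dictionary is deliberately renderer-agnostic, so the same description could be drawn with any imaging library; the actual demo may represent and draw actions differently.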

Model: PhoneBuddy-4B-RealApp | Paper: arXiv:2606.23049 | Code: GitHub

Examples