Video-Grounded GUI Agent Benchmark — A benchmark for evaluating vision-language models on GUI automation tasks using video tutorial guidance. A full episode from VG-GUI-Bench ("How to save emails as ...