Let your LLM see — and interact with — what it's working on. Capture and compare any GUI (desktop apps, web pages, settings panels, or the full screen), then feed the results to describe-this-picture ...