Google (GOOGL) Launches New Gemini 2.5 Model to Help AI Agents Use Websites
TipRanks
Story by Vince Condarcuri
Tech giant Google (GOOGL) has launched the Gemini 2.5 Computer Use model, a version of its Gemini 2.5 Pro system designed to let AI agents interact directly with websites and apps. Available through the Gemini API in Google AI Studio and Vertex AI, the model can complete everyday digital tasks by clicking, typing, scrolling, and filling out forms the way a person would. This goes beyond standard APIs, allowing AI to work through visual interfaces. Although it’s mainly optimized for web browsers, it also shows strong potential for mobile apps.
Interestingly, the model works by receiving a user request, a screenshot of what’s on the screen, and a history of previous actions. Based on this input, it generates an action—like clicking a button or typing in a form—and may ask for confirmation before doing something sensitive, like making a purchase. Once an action is done, a new screenshot is sent back to the model to continue the process. This loop repeats until the task is finished.
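The loop described above can be sketched in a few lines of Python. This is an illustrative sketch only: the real Gemini API call is replaced by a stub (`fake_model`), and all function and field names here are assumptions, not the actual API.

```python
# Illustrative sketch of the observe-act loop described above.
# fake_model stands in for the Computer Use model, which in reality
# receives the user request, a screenshot, and the action history,
# and returns the next UI action. All names here are hypothetical.

SENSITIVE_ACTIONS = {"purchase", "submit_payment"}  # illustrative examples

def fake_model(request, screenshot, history):
    """Stub: returns the next action based on how many steps have run."""
    script = [
        {"type": "click", "target": "search_box"},
        {"type": "type", "target": "search_box", "text": request},
        {"type": "done"},  # model signals the task is finished
    ]
    return script[min(len(history), len(script) - 1)]

def run_task(request, confirm=lambda action: True):
    screenshot = "initial_screenshot"  # would be a real screen capture
    history = []
    while True:
        action = fake_model(request, screenshot, history)
        if action["type"] == "done":
            break  # task finished
        if action["type"] in SENSITIVE_ACTIONS and not confirm(action):
            break  # user declined to confirm a sensitive action
        # Here the action would be executed in the browser (click/type/scroll),
        # then a fresh screenshot is taken and fed back to the model.
        history.append(action)
        screenshot = f"screenshot_after_step_{len(history)}"
    return history

steps = run_task("gemini 2.5 computer use")
```

In the sketch, `run_task` performs two actions (a click and a typed query) before the stub model signals completion; a real implementation would capture actual screenshots and execute actions in a browser.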
To keep the system safe, Google built protections into the model, such as a safety check before every action and developer controls to block risky behaviors. Teams inside Google are already using it to speed up software testing, while external testers have applied it to personal assistants and workflow automation. For example, Google’s payments team used the model to fix over 60% of failed tests that previously took days to resolve. The model is now in public preview, and developers can try it out using demos, guides, and tools shared by Google.