OpenAI Joins the AI Agent Race with Operator Agent
After a lot of anticipation and many rumors, OpenAI finally released Operator, a new AI assistant that can independently browse and interact with the web. The company announced this research preview as part of its growing suite of AI tools, marking a shift from AI that simply processes information to AI that can take action online.
Operator uses a new system called the Computer-Using Agent (CUA) to see and interact with web pages. Unlike traditional AI assistants that need dedicated API integrations to work with websites, Operator understands what it sees on screen and interacts naturally through clicks, scrolls, and typing, much like a human would.
The system’s capabilities extend to everyday tasks that often consume users’ time. During testing, Operator showed it could handle a range of online activities, from booking travel and ordering groceries to filling out forms and creating online content. Users can also set preferences for specific websites, allowing the AI to follow those specifications when completing tasks.
OpenAI has made this initial release available to ChatGPT Pro users in the United States through operator.chatgpt.com. The company plans a gradual rollout, first expanding to Plus, Team, and Enterprise users before eventually integrating Operator’s capabilities directly into ChatGPT. However, availability in Europe will take longer due to regulatory considerations.
Early performance metrics show promising results, with Operator achieving an 87% success rate on real-world websites in the WebVoyager benchmark. The release positions OpenAI in direct competition with similar tools from Anthropic (Claude Computer Use) and Google (Project Mariner). The company has also secured partnerships with businesses including DoorDash, Instacart, and OpenTable to help ensure smooth integration with existing online services.
Industry Impact
The development of autonomous web-browsing AI agents continues to expand, with OpenAI’s Operator joining similar tools from Google (Project Mariner) and Anthropic (Claude Computer Use). Each system aims to enable AI to interact with websites directly, marking a shift from traditional AI assistants that rely on specific API integrations.
Several businesses have begun testing these autonomous agents. Companies including DoorDash, Instacart, OpenTable, Priceline, and StubHub are working to understand how such tools interact with their existing web services. The public sector has also started exploring applications, with the City of Stockton examining potential uses for improving access to civic services.
These early adopters are testing whether autonomous agents can effectively handle routine online tasks while maintaining security and user privacy standards.
Technical Breakdown
The CUA model combines visual processing capabilities with reinforcement learning to navigate web interfaces. The system processes screenshots of web pages, analyzes the visual information, and executes actions through virtual inputs like mouse clicks and keyboard entries.
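To make this perceive-and-act cycle concrete, here is a minimal sketch of what such a loop could look like. This is an assumption about how a CUA-style agent is wired, not OpenAI’s actual implementation: Playwright stands in for the hosted browser, and `propose_action` is a hypothetical placeholder for the vision model.

```python
# A minimal sketch of a screenshot -> analyze -> act loop (an assumption about
# how a CUA-style agent is wired, not OpenAI's actual implementation).
# Playwright stands in for the browser; propose_action() is a hypothetical
# placeholder for the vision model that maps pixels to the next action.
from dataclasses import dataclass
from playwright.sync_api import sync_playwright


@dataclass
class Action:
    kind: str          # "click", "type", "scroll", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


def propose_action(task: str, screenshot: bytes) -> Action:
    """Placeholder for the model that turns a screenshot into the next action."""
    raise NotImplementedError("swap in a real vision-language model here")


def run_task(task: str, start_url: str, max_steps: int = 30) -> None:
    with sync_playwright() as p:
        page = p.chromium.launch(headless=True).new_page()
        page.goto(start_url)
        for _ in range(max_steps):
            action = propose_action(task, page.screenshot())
            if action.kind == "done":
                break
            if action.kind == "click":
                page.mouse.click(action.x, action.y)
            elif action.kind == "type":
                page.keyboard.type(action.text)
            elif action.kind == "scroll":
                page.mouse.wheel(0, action.y)
```

The reinforcement learning lives inside the model itself; the loop above only illustrates the screenshot-in, action-out interface that the benchmarks below exercise.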
Performance testing reveals varying capabilities across different scenarios:
- The WebVoyager benchmark, which tests interactions with real-world websites, shows an 87% success rate. However, on WebArena, which evaluates agents on simulated websites, the success rate drops to 58.1%.
- More complex tasks, such as managing PDFs from emails in the OSWorld benchmark, achieve a 38.1% success rate.
The current implementation includes several safety mechanisms. A “takeover mode” activates for sensitive information entry, requiring direct user input for passwords or payment details. “Watch mode” enables user supervision on sensitive websites. A monitoring system tracks for unusual activity patterns and can pause operation if needed.
Privacy features allow users to opt out of data collection for model training. Users can also delete their browsing data and conversation history. The system includes protections against potentially harmful websites, with automated detection systems for suspicious activity.
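As an illustration of how the takeover idea can work in practice, here is a hedged sketch (my own simplification, not OpenAI’s code) of a gate that refuses to automate any step involving sensitive data and hands control back to the user instead.

```python
# Illustrative only: one way a "takeover mode" gate could wrap an action loop.
# The keyword list and function names are invented for this sketch.
from typing import Callable

SENSITIVE_KEYWORDS = ("password", "card number", "cvv", "one-time code")


def is_sensitive(action_description: str) -> bool:
    """Crude keyword check standing in for a real sensitive-field classifier."""
    description = action_description.lower()
    return any(keyword in description for keyword in SENSITIVE_KEYWORDS)


def execute_with_takeover(action_description: str, execute: Callable[[], None]) -> None:
    if is_sensitive(action_description):
        # Never enter secrets on the user's behalf; pause and let them do it.
        input(f"Takeover needed for {action_description!r}. "
              "Complete this step yourself, then press Enter to resume... ")
    else:
        execute()
```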
These early performance metrics and safety measures provide insight into both the current state of autonomous web agents and the technical challenges still facing their development.
Personal Experience
At the time of writing, I have only been using Operator for a couple of days, and my experience has been mixed. The tool lives up to its promise of operating autonomously, but it can be quite slow, sometimes taking several minutes to complete tasks that would require just a few mouse clicks from a human, and it comes with several limitations.
For example, when asked to find flights from San Francisco International to Tokyo in late September 2025, Operator completed the task in about 10 minutes. That is acceptable, but noticeably slower than what a human could achieve. Additionally, it defaulted to Tokyo Haneda instead of Narita, which offers more flight options.
On the bright side, Operator runs in the background, allowing me to focus on other work. However, this could also tempt users into extreme, unproductive multitasking or lead them to forget time-sensitive tasks.
One natural limitation of the tool is its inability to access password-protected pages. This is understandable, but it poses a drawback in many online shopping and eCommerce scenarios, such as searching on Amazon or eBay.
I will add new personal comments as I continue to use Operator, so check this page again to see what a $200-a-month tool might offer typical users.
Market Implications
The increased availability of autonomous web agents points to significant changes in how businesses and consumers will interact online. Companies will need to consider how their web services accommodate both human and AI users, which will naturally expand the definition of UX and influence how UI elements are designed and implemented.
Integration with existing services requires careful consideration of current web standards and security protocols. Many websites rely on CAPTCHAs and other security measures that can block AI agents. This has prompted discussions about developing new standards for authenticating automated interactions while maintaining security.
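One idea that comes up in these discussions is letting agents identify themselves, for example through a request header, so sites can route them to a safer flow instead of blocking them outright. No such standard exists today; the sketch below, including the header tokens, is purely hypothetical.

```python
# Hypothetical sketch: a site routes self-identified agents to a flow with
# extra confirmation instead of serving them a CAPTCHA. The marker strings
# are invented; there is no agreed-upon standard for this yet.
from flask import Flask, jsonify, request

app = Flask(__name__)

AGENT_MARKERS = ("operator", "ai-agent")  # invented self-identification tokens


@app.route("/checkout")
def checkout():
    user_agent = request.headers.get("User-Agent", "").lower()
    if any(marker in user_agent for marker in AGENT_MARKERS):
        # Declared agents get a flow that requires explicit user confirmation.
        return jsonify({"flow": "agent", "requires_user_confirmation": True})
    return jsonify({"flow": "human"})
```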
The geographic rollout of these tools follows a staged approach. Initial availability in the United States serves as a testing ground, with planned expansion to other regions. European deployment faces additional considerations due to regional data protection regulations and privacy requirements.
For developers and businesses interested in building their own applications, API access to these technologies is in development. This could enable the creation of specialized tools for specific industries or use cases, though timeline details remain undefined.
The broader AI industry implications extend beyond web browsing. The development of AI systems like Operator that can interact with existing interfaces, rather than requiring specialized integrations, suggests a shift in how AI tools will be deployed across various applications.
However, significant technical and practical challenges remain. These include improving reliability across different types of websites, maintaining security standards, and ensuring appropriate human oversight of AI actions. How OpenAI and the wider industry address these challenges will shape the trajectory of autonomous AI development.
I will post more examples of how to use Operator, along with its pros and cons, so check back regularly. Have an interesting task you’d like to see Operator tackle? Drop a comment below, and I’ll test and highlight it in future editions.
Keep a lookout for the next edition of AI Uncovered!
Follow our social channels for more AI-related content: LinkedIn; Twitter (X); Bluesky; Threads; and Instagram.