OpenAI CEO Sam Altman started this year by saying in a blog post that 2025 would be a big year for AI agents, tools that can automate tasks and take actions on your behalf.
Now we are seeing OpenAI's first real attempt.
OpenAI announced Thursday the launch of a research preview of Operator, a general-purpose AI agent that can take control of a web browser and independently perform certain actions. Operator is coming first to U.S. users on ChatGPT’s $200 Pro subscription plan. OpenAI says it plans to roll the feature out to more users in the Plus, Team, and Enterprise tiers.
“[Operator] will be [in] other countries soon,” OpenAI CEO Sam Altman said during a livestream Thursday. “Europe, unfortunately, will take some time.”
This initial research preview is available via operator.chatgpt.com, but OpenAI says it plans to integrate Operator into all of its ChatGPT clients soon.
According to OpenAI, Operator promises to automate tasks such as booking travel accommodations, making restaurant reservations, and shopping online. Users can choose from several task categories within the Operator interface, including shopping, delivery, dining, and travel, each of which enables different types of automation.
When ChatGPT users activate Operator, a small window will appear showing a dedicated web browser that the agent uses to complete tasks, along with explanations of the specific actions the agent is performing. Users can still take control of their screen while Operator is working, as Operator uses its own dedicated browser.
OpenAI says Operator is powered by a computer-using agent, or CUA, model that combines the vision capabilities of the company’s GPT-4o model with the reasoning capabilities of OpenAI’s more advanced models. CUA is trained to interact with the front end of websites, meaning it doesn’t need to use developer-facing APIs to tap into different services.
In other words, CUA can use buttons, navigate menus, and fill out forms on a web page just like a human would.
OpenAI says it is working with companies like DoorDash, eBay, Instacart, Priceline, StubHub, and Uber to ensure that Operator complies with these businesses’ terms of service agreements.
“The CUA model is trained to ask the user for confirmation before finalizing tasks with external side effects, for example before placing an order, sending an email, etc., so that the user can double-check the work of the model before it becomes permanent,” OpenAI writes in materials provided to TechCrunch. “[It] has already proven useful in a variety of cases, and we aim to extend that reliability to a broader range of tasks.”
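The confirmation behavior OpenAI describes can be pictured with a minimal Python sketch. This is purely illustrative and is not OpenAI's implementation: the `Action` type, the `SIDE_EFFECT_KINDS` set, and the `run_action` function are all hypothetical names, and the only idea taken from the quote above is that actions with external side effects wait for the user's approval.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical action kinds treated as having external side effects.
SIDE_EFFECT_KINDS = {"place_order", "send_email"}


@dataclass
class Action:
    kind: str          # e.g. "click", "place_order"
    description: str   # human-readable summary shown to the user


def run_action(
    action: Action,
    execute: Callable[[Action], None],
    confirm: Callable[[Action], bool],
) -> bool:
    """Execute `action`, asking `confirm` first when it has side effects.

    Returns True if the action was executed, False if the user declined.
    """
    if action.kind in SIDE_EFFECT_KINDS and not confirm(action):
        return False
    execute(action)
    return True


if __name__ == "__main__":
    executed = []
    always_decline = lambda action: False

    # A harmless navigation step runs without asking.
    run_action(Action("click", "open the menu page"), executed.append, always_decline)
    # An order is blocked because the user declined to confirm it.
    run_action(Action("place_order", "order two pizzas"), executed.append, always_decline)

    print([a.description for a in executed])
```

In this sketch the gate sits between the model's proposed step and its execution, so a declined order never reaches the website while ordinary browsing steps proceed uninterrupted.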
But OpenAI warns that CUA isn’t perfect. The company says it “[does not] expect [the] CUA to work reliably in all scenarios at this time.”
“Currently, Operator cannot reliably handle many complex or specialized tasks,” OpenAI adds in a support document, “such as creating detailed presentations, managing complex calendar systems, or interacting with highly customized or non-standard web interfaces.”
Out of an abundance of caution, OpenAI also requires supervision for some tasks, such as banking transactions, that CUA and Operator could mostly perform on their own. Users will have to take over to enter credit card information, for example. OpenAI says Operator does not collect or screenshot any data.
“On particularly sensitive websites, such as email, Operator requires active user supervision, ensuring that users can directly spot and resolve any errors the model may make,” OpenAI says in its support materials.
This limits Operator’s usefulness, to be sure, but it also ensures that the agent doesn’t hallucinate and, say, spend your mortgage payment on accent chairs. Google has taken a similar approach with its Project Mariner AI agent, which doesn’t input information like credit card numbers.
Limitations
Operator has some notable limitations.
There are rate limits, both daily and task-dependent. OpenAI says that Operator can perform multiple tasks at the same time, but that there are “dynamic limitations” in this regard. There is also an overall usage limit that resets daily.
In this release phase, Operator will also refuse outright to perform certain tasks for security reasons, such as sending emails (despite CUA being capable of doing so) and deleting calendar events. OpenAI says this will change in the future, but has not provided an ETA.
Operator may also get “stuck” if it encounters a particularly complex interface, a password field, or a CAPTCHA check. It will ask the user to take over when this happens, OpenAI says.
An agentic future
OpenAI has been quite slow to develop an AI agent compared to rivals (see: agents from Rabbit, Google and Anthropic), which may have something to do with security risks related to the technology.
When an AI system can perform actions on the web, it opens the door to much more dangerous use cases by malicious actors. You could automate AI agents to orchestrate phishing scams or DDoS attacks, or get them to grab concert tickets before anyone else does. Especially for a widely used tool like ChatGPT, it is important that OpenAI takes measures to prevent these types of exploits.
OpenAI appears to believe that Operator is safe enough to release in its current form, at least as a research preview.
“Operator uses tools that try to limit the model’s susceptibility to malicious requests, hidden instructions and phishing attempts,” OpenAI explains on its website. “A monitoring system suspends execution if suspicious activity is detected, while automated, human-reviewed processes continually update protection measures.”
Operator is OpenAI’s boldest attempt at creating an AI agent. Last week, OpenAI released Tasks, giving ChatGPT simple automation features like the ability to set reminders and schedule tasks to run at a set time each day.
Tasks gave ChatGPT users some familiar, but necessary, features to make ChatGPT as convenient to use as Siri or Alexa. Operator, however, showcases capabilities that the previous generation of virtual assistants could never offer.
AI agents have been touted as the next big thing in artificial intelligence after ChatGPT: a new technology that will change the way people use the Internet and their PCs. Instead of simply providing and processing information, agents can, in theory, take actions and actually do things.
With the release of the first concrete version of OpenAI’s agents, it will soon become clear how realistic this vision is.