OpenAI has launched ChatGPT agent, a new AI-powered tool designed to carry out a variety of tasks on behalf of users. Unlike standard ChatGPT, which simply answers questions, ChatGPT agent goes a step further by taking actions. It can schedule appointments, create presentations, run code, and connect to external apps such as Gmail and GitHub. The tool is available to subscribers of OpenAI’s Pro, Plus, and Team plans, and can be activated by selecting “agent mode” in the ChatGPT menu.
The new agent combines features from previous OpenAI tools, such as Operator, which can navigate websites, and Deep Research, which synthesises information from multiple sources. ChatGPT agent is designed to interact with users using natural language, making it easier for people to complete tasks like planning meals or conducting competitive analysis. With the ability to access tools like terminals and APIs, the agent can execute more complex actions, such as writing code or gathering detailed information from various websites.
One of the standout features of ChatGPT agent is its performance on AI benchmarks. It scored 41.6% on Humanity’s Last Exam, a challenging test spanning many subjects, which is roughly double the scores of previous models such as o3 and o4-mini. On the FrontierMath benchmark, which tests math skills, ChatGPT agent scored 27.4%, more than four times o4-mini’s 6.3%. These results suggest that ChatGPT agent is considerably more capable than earlier models on these tests.
OpenAI has designed ChatGPT agent with safety in mind, since its ability to act autonomously introduces new risks. The company has implemented additional safeguards, including a real-time monitor that checks whether user prompts relate to potentially dangerous topics, such as biological threats. If a prompt raises concerns, the agent’s response is evaluated further. To reduce the risk of misuse, OpenAI has also disabled the memory feature for this tool; memory normally allows ChatGPT to remember past conversations. The aim is to prevent bad actors from using the tool to extract sensitive information.
While ChatGPT agent shows promise in automating tasks and solving complex problems, OpenAI acknowledges that making AI agents work in real-world settings has often proved challenging. However, the company claims that this new model is significantly more capable, setting a high bar for future AI agents. Users will be able to judge for themselves how well ChatGPT agent performs, especially on more complex tasks and in real-world scenarios.
The launch of ChatGPT agent represents a significant step forward in OpenAI’s efforts to turn ChatGPT into a more autonomous, task-performing product. As AI technology advances, tools like ChatGPT agent could revolutionise how people interact with machines, making daily tasks and complex workflows easier to manage. Whether it lives up to its promises remains to be seen, but the potential for AI to take on more responsibilities is undoubtedly exciting.
Despite the tool’s potential, OpenAI’s approach also highlights the challenge of balancing innovation with safety. The company has been cautious about the risks posed by AI models that can act autonomously, and is putting strict safety measures in place to ensure that the technology is used responsibly. AI will continue to evolve, but how it is developed and controlled will be key to shaping its future impact.
ChatGPT agent is a significant advancement in AI technology, offering a glimpse of what AI agents could look like in the future. With its ability to execute tasks, access external tools, and perform complex actions, it marks a bold step towards a more interactive and capable AI system. As OpenAI continues to refine and develop this tool, it could change the way users interact with AI in their daily lives.