MRBG VLOG: ChatGPT gains agentic capability for complex research

OpenAI Unveils Deep Research: A Breakthrough in AI-Powered Research Capabilities

OpenAI has introduced a groundbreaking agentic capability called Deep Research, designed to enable ChatGPT to conduct complex, multi-step research tasks online. This new feature reportedly accomplishes in minutes what might take human researchers hours or even days.

According to OpenAI, Deep Research represents a significant milestone in the company’s ongoing pursuit of artificial general intelligence (AGI).

“The ability to synthesise knowledge is a prerequisite for creating new knowledge,” OpenAI states. “For this reason, deep research marks a significant step toward our broader goal of developing AGI.”

A New Era of AI-Assisted Research

Deep Research empowers ChatGPT to autonomously find, analyse, and synthesise information from hundreds of online sources. With just a user prompt, the tool can generate comprehensive reports comparable to those produced by research analysts.

Built on a variant of OpenAI’s upcoming “o3” model, Deep Research aims to eliminate the time-consuming, labour-intensive process of information gathering. Whether for competitive industry analysis, policy reviews, or highly specific product recommendations, the tool is intended to deliver precise, well-documented results.

Every output includes full citations and transparent documentation, so users can verify the findings. OpenAI says Deep Research excels at uncovering niche or non-intuitive insights, making it valuable across industries like finance, science, policymaking, and engineering. The company also sees it as useful for everyday users, such as shoppers seeking personalised recommendations.

One example from OpenAI CEO Sam Altman illustrates the tool’s effectiveness:

“I am in Japan right now and looking for an old NSX. I spent hours searching unsuccessfully for the perfect one. I was about to give up, and Deep Research just... found it.”

Seamless Integration with ChatGPT

Deep Research is integrated directly into the ChatGPT interface. Users select the “Deep Research” option in the message composer and enter their query. They can also upload supporting files or spreadsheets to provide additional context.

Once initiated, the AI embarks on a rigorous multi-step process that may take 5–30 minutes to complete. A sidebar provides real-time updates on actions taken and sources consulted, allowing users to continue with other tasks until the final report is ready.

Reports are presented within the chat, offering detailed, well-documented insights. In the coming weeks, OpenAI plans to enhance these reports with embedded images, data visualisations, and graphs for improved clarity and context.

Unlike GPT-4o, which specialises in real-time, multimodal conversations, Deep Research prioritises in-depth analysis and rigorous citation. This positions it as a tool for those requiring research-grade insights rather than quick summaries.

Built for Real-World Challenges

Deep Research was trained on real-world browsing and reasoning tasks. OpenAI used reinforcement learning to teach the model to plan and execute multi-step research autonomously, adjusting its approach as new information emerges.

The tool can:

Browse user-uploaded files
Generate and iterate on graphs using Python
Embed media such as generated images and web pages into responses
Cite exact sentences or passages from sources

This extensive training has resulted in an AI capable of tackling complex, real-world problems.
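To make the idea of multi-step research with adaptive refinement concrete, here is a toy sketch in Python. It is not OpenAI’s implementation, which has not been published: the corpus, the `search` and `research` functions, and the follow-up rule are all hypothetical and exist only for illustration. The sketch searches a tiny in-memory set of sources, quotes the exact matching sentence from each, and queues a new query when a finding mentions a trigger term.

```python
from dataclasses import dataclass

# Hypothetical in-memory "web" (source name -> text). Illustrative data only.
CORPUS = {
    "source-a": "Solar capacity grew quickly because of cheaper panels. Policy support also helped.",
    "source-b": "Cheaper panels followed from larger factories. Wind grew more slowly.",
}


@dataclass
class Finding:
    source: str    # which source the quote came from
    sentence: str  # exact sentence quoted from that source


def search(query: str) -> list[Finding]:
    """Return every sentence containing the query term (case-insensitive)."""
    hits = []
    for source, text in CORPUS.items():
        for sentence in text.split(". "):
            if query.lower() in sentence.lower():
                hits.append(Finding(source, sentence.rstrip(".")))
    return hits


def research(start_query: str, follow_ups: dict[str, str], max_steps: int = 5) -> list[Finding]:
    """Run queries one at a time; a finding that mentions a trigger term
    adds the matching follow-up query to the queue (the 'refinement' step)."""
    queue = [start_query]
    seen_queries = set()
    findings = []
    while queue and len(seen_queries) < max_steps:
        query = queue.pop(0)
        if query in seen_queries:
            continue
        seen_queries.add(query)
        for finding in search(query):
            if finding not in findings:
                findings.append(finding)
            for trigger, next_query in follow_ups.items():
                if trigger in finding.sentence.lower() and next_query not in seen_queries:
                    queue.append(next_query)
    return findings


if __name__ == "__main__":
    for f in research("solar", {"cheaper panels": "factories"}):
        print(f"[{f.source}] {f.sentence}")
```

The real system learns what to look up next rather than following a fixed trigger table. The sketch only shows the general shape: search, read, cite, and revise the plan.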

To evaluate its capabilities, OpenAI tested Deep Research against a rigorous expert-level benchmark known as “Humanity’s Last Exam.” Comprising over 3,000 questions spanning disciplines from rocket science to linguistics, the benchmark assesses an AI’s ability to solve multifaceted problems.

Deep Research delivered record-breaking results, achieving an accuracy of 26.6%, far surpassing other models:

GPT-4o: 3.3%
Grok-2: 3.8%
Claude 3.5 Sonnet: 4.3%
OpenAI o1: 9.1%
DeepSeek-R1: 9.4%
Deep Research: 26.6% (with browsing + Python tools)
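To put the gap in perspective, the short Python sketch below compares the Deep Research score with the next-best entry in the list above. The helper name `compare_to_runner_up` is made up for this example; the numbers are the ones quoted in this post.

```python
# Humanity's Last Exam accuracy (%) as listed in this post.
scores = {
    "GPT-4o": 3.3,
    "Grok-2": 3.8,
    "Claude 3.5 Sonnet": 4.3,
    "OpenAI o1": 9.1,
    "DeepSeek-R1": 9.4,
    "Deep Research": 26.6,
}


def compare_to_runner_up(scores, leader):
    """Return (runner_up_name, point_gap, ratio) for the leader versus the best other entry."""
    others = {name: s for name, s in scores.items() if name != leader}
    runner_up = max(others, key=others.get)
    gap = scores[leader] - others[runner_up]
    ratio = scores[leader] / others[runner_up]
    return runner_up, round(gap, 1), round(ratio, 2)


runner_up, gap, ratio = compare_to_runner_up(scores, "Deep Research")
print(f"{runner_up}: +{gap} points, {ratio}x")  # DeepSeek-R1: +17.2 points, 2.83x
```

Keep in mind that the Deep Research figure was obtained with browsing and Python tools, so this is not a strictly like-for-like comparison.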

Additionally, Deep Research set a new state of the art on the GAIA benchmark, which evaluates reasoning, multi-modal fluency, and tool-use proficiency, with a top score of 72.57%.

Challenges and Limitations

Despite its impressive capabilities, Deep Research has clear limitations. OpenAI acknowledges that the system can occasionally hallucinate (produce incorrect or misleading information), struggle to distinguish authoritative sources from speculative content, and show overconfidence in uncertain findings.

Users may also experience minor formatting errors in reports and citations, as well as occasional delays in task initiation. However, OpenAI expects these issues to improve through iterative updates and increased usage.

Gradual Rollout and Future Enhancements

OpenAI is rolling out Deep Research gradually, starting with Pro users, who will receive up to 100 queries per month. Plus and Team tiers will follow, with Enterprise access arriving thereafter.

Currently, residents of the UK, Switzerland, and the European Economic Area do not have access to the feature, though OpenAI is working on expanding availability to these regions.

In the coming weeks, Deep Research will be integrated into ChatGPT’s mobile and desktop platforms. Looking ahead, OpenAI plans to connect the tool to subscription-based or proprietary data sources, further enhancing its reliability and personalisation.

Additionally, OpenAI envisions integrating Deep Research with Operator, its agent that can take real-world actions. This future enhancement could enable ChatGPT to handle tasks that require both in-depth online research and real-world execution.

Wednesday, February 12, 2025
