Google has officially launched a "reimagined" version of its advanced AI research agent, Gemini Deep Research. Powered by its highly acclaimed state-of-the-art foundation model, Gemini 3 Pro, this new tool not only generates research reports but also allows developers to embed Google's sophisticated research capabilities directly into their own applications.
This embedding functionality is made possible through Google's new Interactions API, designed to give developers enhanced control in the burgeoning era of agentic AI. The Gemini Deep Research tool is engineered to synthesize vast amounts of information and manage extensive context within prompts. Google reports that customers are already leveraging it for a range of critical tasks, from due diligence to complex drug toxicity safety research.
Looking ahead, Google plans to integrate this deep research agent into several of its core services, including Google Search, Google Finance, the Gemini App, and its popular NotebookLM. This move represents a strategic step towards a future where AI agents, rather than humans, increasingly handle information retrieval and complex queries.
Combating AI Hallucinations with Gemini 3 Pro
A significant advantage of Deep Research stems from its foundation on Gemini 3 Pro, which Google describes as its "most factual" model. It has been specifically trained to minimize AI hallucinations—instances where large language models generate incorrect or fabricated information. This is a crucial concern for long-running, deep reasoning agentic tasks, where numerous autonomous decisions are made over extended periods. Even a single hallucinated choice can invalidate an entire output, making accuracy paramount.
To substantiate its progress claims, Google has introduced a new benchmark called DeepSearchQA, which is designed to test AI agents on complex, multi-step information-seeking tasks. This benchmark has been open-sourced by Google. Additionally, Deep Research was evaluated against independent benchmarks such as Humanity's Last Exam, known for its impossibly niche general knowledge tasks, and BrowserComp, which assesses browser-based agentic performance.
Benchmark Showdown and OpenAI's Simultaneous Launch
As anticipated, Google's new agent outperformed competitors on its proprietary DeepSearchQA and Humanity's Last Exam. However, OpenAI's ChatGPT 5 Pro emerged as a surprisingly close second across the board, even slightly surpassing Google on BrowserComp.
The competitive landscape, however, shifted dramatically almost immediately. On the very same day Google unveiled its Deep Research agent, OpenAI launched its highly anticipated GPT 5.2, codenamed "Garlic." OpenAI claims its newest model significantly outperforms rivals, particularly Google, across a suite of typical benchmarks, including its own homegrown evaluations.
The synchronized timing of these major AI announcements is particularly noteworthy. With the global tech community eagerly awaiting the release of "Garlic," Google strategically chose the same day to unveil its own significant advancements in AI research, intensifying the ongoing competition in the rapidly evolving artificial intelligence sector.








