OpenAI Releases 'Deep Research', Sam Altman Says It Can Do "Single-digit Percentage" Of All Economically Valuable Tasks

OpenAI had seen its thunder taken away by DeepSeek, but it looks to be hitting back with a ‘deep’ release of its own: Deep Research.

OpenAI has launched a new product named Deep Research, which aims to perform complex research and reasoning tasks. Built on top of OpenAI’s o3 model, Deep Research lets users send in their queries, and the agent combines web browsing with deep thinking to come up with very detailed answers. Sam Altman said that Deep Research can do a “single-digit percentage of all economically valuable tasks in the world”.

“Today we launch deep research, our next agent,” Altman posted on X. “This is like a superpower; experts on demand! It can go use the internet, do complex research and reasoning, and give you back a report. It is really good, and can do tasks that would take hours/days and cost hundreds of dollars,” he added.

“My very approximate vibe is that it can do a single-digit percentage of all economically valuable tasks in the world, which is a wild milestone,” he claimed.

Deep Research is live in OpenAI’s Pro tier ($200/month), which gives users access to 100 queries per month. Altman said that users on the Plus tier ($20/month) will get about 10 queries per month, and that a small number of queries will be available in OpenAI’s free tier as well.

In Deep Research’s demo, OpenAI’s researchers asked it to find target markets for their apps and to help buy skis in Japan. The model first comes up with clarifying questions, and then spends quite a bit of time producing its output. But that output is quite comprehensive: Deep Research browses the internet, uses its own data, and can come up with 10,000-word answers with tables and research.

Deep Research broke records on some benchmarks as well. On Humanity’s Last Exam, which consists of 3,000 multiple-choice and short-answer questions on 100 subjects ranging from linguistics to rocket science, Deep Research scored 25.1%, which was the highest for an AI model. In comparison, GPT-4o had scored 3.3%, Grok 2 had scored 3.8%, Claude 3.5 Sonnet had scored 4.3%, Gemini Thinking had scored 6.2% and DeepSeek-R1 had scored 9.2%. OpenAI’s own o3-mini models, released just days ago, had scored between 10.5% and 13%.

This feels like a pretty important release from OpenAI. This is the first time users will be able to access OpenAI’s o3 model, even if indirectly. Many had speculated that o3 was AGI when its benchmark results were released last month. Deep Research also combines o3 with the ability to browse the web in real time, which could make it especially powerful: it could come in handy for research, coming up with new ideas, and deeply understanding new subjects. The release is also crucial because it shows OpenAI’s capabilities beyond o1, which has already been equaled by Chinese company DeepSeek. And with Sam Altman claiming that the model can do a single-digit percentage of all economically valuable tasks, it could end up being quite consequential in the ongoing AI revolution.