OpenAI Ships o3 and o4-mini — Reasoning Models Get Full Tool Capabilities
o3 and o4-mini launch with web search, code execution, and file analysis. Codex CLI goes open source. Deep research expands to all paid tiers.
Maya Johnson
OpenAI released o3 and o4-mini on April 16, 2025, its first reasoning models with full tool capabilities. Previous reasoning models (o1, o3-mini) could only think; these can also search the web, execute code, and analyze files, according to OpenAI's changelog.
Why This Matters
The gap between reasoning models and general-purpose models has been that reasoning models could think deeply but couldn't act. o3 changes that. It can reason through a complex problem, search the web for current data, write and execute code to verify its answer, and read uploaded files — all within a single response.
This effectively merges the o-series reasoning capability with GPT-4o's tool use, creating a model that's both smarter and more capable than either line was independently.
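As a rough illustration of the reason-then-act pattern described above, the Python sketch below runs a loop in which a model either asks for a tool or returns a final answer. Everything in it is hypothetical: the Step type, the three stub tools, and the scripted_model function are stand-ins invented for this example, not OpenAI's API or implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Step:
    """One model turn: either a tool request or a final answer."""
    tool: Optional[str] = None    # name of the tool to call, if any
    argument: str = ""            # input handed to that tool
    answer: Optional[str] = None  # set once the model is finished


# Hypothetical stub tools. A real system would call a search backend,
# a sandboxed interpreter, and a file parser instead.
def web_search(query: str) -> str:
    return f"results for '{query}'"


def run_code(source: str) -> str:
    # Toy "interpreter": only adds integers written as "a+b+c".
    return str(sum(int(part) for part in source.split("+")))


def read_file(name: str) -> str:
    return f"contents of {name}"


TOOLS: Dict[str, Callable[[str], str]] = {
    "web_search": web_search,
    "run_code": run_code,
    "read_file": read_file,
}


def scripted_model(history: List[str]) -> Step:
    """Stand-in for a reasoning model: follows a fixed plan, then answers."""
    plan = [
        Step(tool="web_search", argument="example query"),
        Step(tool="run_code", argument="2+3"),
        Step(tool="read_file", argument="notes.txt"),
    ]
    if len(history) < len(plan):
        return plan[len(history)]
    return Step(answer=f"finished after {len(history)} tool calls")


def run(model: Callable[[List[str]], Step],
        tools: Dict[str, Callable[[str], str]],
        max_steps: int = 5) -> Tuple[str, List[str]]:
    """Alternate between model turns and tool calls within one response."""
    history: List[str] = []
    for _ in range(max_steps):
        step = model(history)
        if step.answer is not None:
            return step.answer, history
        result = tools[step.tool](step.argument)
        # Tool output is fed back so the next turn can reason over it.
        history.append(f"{step.tool} -> {result}")
    raise RuntimeError("model did not finish within max_steps")


answer, history = run(scripted_model, TOOLS)
print(answer)
for line in history:
    print(line)
```

The point of the sketch is the control flow: tool results are appended to the history the model sees, so each later step can build on what earlier searches, code runs, or file reads returned.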
Codex CLI Goes Open Source
Alongside the model launch, OpenAI released Codex CLI as an open-source tool — an agent-style coding assistant that runs in local terminal environments. This positioned it directly against Anthropic's Claude Code, which was also terminal-native.
A dedicated gpt-5-codex model launched in September 2025 for the Codex CLI, showing OpenAI's commitment to the product line.
Deep Research Expansion
Deep research, an AI-powered research mode that conducts multi-step investigations, expanded to all paid tiers and, with a smaller quota, to the free tier: Plus, Team, Enterprise, and Edu users got 25 queries/month, Pro users 250/month, and Free users 5/month. Previously limited to Pro subscribers, the feature became one of the most widely accessible advanced AI research tools.
The Reasoning Race
At the time of launch, o3 set new records on math (AIME) and science (GPQA) benchmarks. Google had released Gemini 2.5 Pro with thinking mode in March. Anthropic had Claude Opus 4 and Sonnet 4 coming in May. The reasoning race was accelerating, with each company shipping models that could both think and act.
o3-pro followed on June 10 with increased compute for harder reasoning problems, and o3 pricing was reduced the same day.
Our Take
o3 with tool use is the template for what all AI models will eventually look like: deep reasoning combined with the ability to access external information and execute code. The open-source Codex CLI was a smart competitive move, beating Claude Code on openness. But the reasoning models' real test is whether the accuracy gains from chain-of-thought translate to measurably better outcomes in real workflows, not just benchmark scores.
FAQ
What is o3? o3 is OpenAI's reasoning model released April 16, 2025. It combines deep chain-of-thought reasoning with tool capabilities including web search, code execution, and file analysis.
What's the difference between o3 and GPT-5? o3 is a reasoning-first model that excels at complex problem-solving tasks. GPT-5, released in August 2025, is a general-purpose model with optional reasoning that unified both approaches into a single architecture.
What is Codex CLI? Codex CLI is OpenAI's open-source, terminal-based coding agent. It lets developers use AI to read, write, and execute code directly from the command line, similar to Anthropic's Claude Code.