GPT-5.2-Codex Benchmark Results
OpenAI released GPT-5.2-Codex on January 14, 2026, five weeks after the base GPT-5.2 model. It targets agentic coding: multi-step sessions where the model plans, writes code, runs tests, and iterates on failures.
The model scores 56.4% on SWE-Bench Pro, up 0.8 percentage points from base GPT-5.2's 55.6%, and 64.0% on Terminal-Bench 2.0, up 1.8 points from 62.2%. Both benchmarks test real-world coding tasks rather than isolated code generation.
GPT-5.2-Codex vs GPT-5.2 vs Claude Opus 4.6
| Benchmark | GPT-5.2-Codex | GPT-5.2 | Claude Opus 4.6 |
|---|---|---|---|
| SWE-Bench Pro | 56.4% | 55.6% | Not listed |
| Terminal-Bench 2.0 | 64.0% | 62.2% | Ranked #1 (score not listed) |
| Context Window (input) | 400K | 128K | 200K (1M beta) |
| Output Tokens | 128K | 128K | 128K |
GPT-5.2-Codex competes on cost and context rather than on top benchmark rank. Claude Opus 4.6 leads Terminal-Bench 2.0 and Humanity's Last Exam, while GPT-5.2-Codex offers a 400K-token input window against Claude Opus 4.6's standard 200K (1M in beta).
Key Features for Developers
Context Compaction
Like Claude Opus 4.6's compaction feature, GPT-5.2-Codex compresses earlier context while preserving task state. This enables multi-hour coding sessions where the model tracks the full project even as the conversation exceeds the context window.
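The idea behind compaction can be illustrated with a minimal sketch. This is not how GPT-5.2-Codex implements it; the function names (`count_tokens`, `summarize`, `compact`), the word-based token count, and the truncating summarizer are all assumptions made for illustration. A real system would use the model's tokenizer and ask the model itself to write the summary.

```python
def count_tokens(text):
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def summarize(messages):
    # Hypothetical placeholder: keep the first three words of each message.
    # A real agent would generate a summary that preserves task state.
    first_words = [" ".join(m.split()[:3]) for m in messages]
    return "SUMMARY: " + " | ".join(first_words)


def compact(history, budget, keep_recent=2):
    """Fold the oldest messages into one summary when history exceeds budget.

    The most recent `keep_recent` messages are kept verbatim so the current
    step of the task is not lost.
    """
    total = sum(count_tokens(m) for m in history)
    if total <= budget or len(history) <= keep_recent:
        return list(history)
    split = len(history) - keep_recent
    old, recent = history[:split], history[split:]
    return [summarize(old)] + recent


history = [
    "plan the refactor of module a",
    "edit file one and run tests",
    "tests failed on import error",
    "fix import and rerun tests",
]
compacted = compact(history, budget=10)
```

Here the two oldest messages collapse into a single summary entry while the two most recent ones survive unchanged, which is the trade-off compaction makes: older detail is lost, and the state needed to continue the task is kept.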
Long-Horizon Task Completion
The model is optimized for tasks spanning many steps: large refactors, codebase migrations, and multi-file feature implementations. When an approach fails, GPT-5.2-Codex adjusts and retries rather than restarting the task.
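The adjust-and-retry behavior described above can be sketched as a loop that carries failure feedback forward instead of starting from scratch. This is a hypothetical illustration only: `run_task`, `attempt_fn`, and `check_fn` are invented names, and nothing here reflects the actual Codex agent loop.

```python
def run_task(attempt_fn, check_fn, max_attempts=3):
    """Try a task repeatedly, feeding earlier failures into later attempts.

    attempt_fn(feedback) returns a candidate result; feedback is the list of
    failure messages collected so far.
    check_fn(candidate) returns (ok, message), for example from a test run.
    Returns (candidate, attempts_used), or (None, max_attempts) on failure.
    """
    feedback = []
    for attempt in range(1, max_attempts + 1):
        candidate = attempt_fn(feedback)
        ok, message = check_fn(candidate)
        if ok:
            return candidate, attempt
        # Keep the failure so the next attempt can adjust rather than restart.
        feedback.append(message)
    return None, max_attempts


# Toy stand-ins: the "fix" improves by one for each failure it has seen,
# and the "tests" pass once the candidate reaches 2.
def toy_attempt(feedback):
    return len(feedback)


def toy_check(candidate):
    return candidate >= 2, f"candidate {candidate} is too low"


result, attempts = run_task(toy_attempt, toy_check)
```

The design point is that `feedback` accumulates across iterations, so each attempt starts from what earlier ones learned.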
Built-In Vulnerability Detection
GPT-5.2-Codex includes vulnerability detection during code generation. Teams needing deeper scanning can use dedicated tools like Claude Code Security, which offers multi-stage verification with false positive filtering.
Windows Environment Support
OpenAI improved GPT-5.2-Codex's Windows development performance, addressing the Unix-centric optimization of earlier models.
GPT-5.2-Codex Pricing
| Tier | Cost per Million Tokens |
|---|---|
| Input | $1.75 |
| Output | $14.00 |
| Cached Input | $0.175 (90% discount) |
GPT-5.2-Codex is available across all Codex surfaces for paid ChatGPT users and as a standalone API model.
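To see what these rates mean for a single request, a small estimator can help. The sketch below uses only the per-million-token prices from the table above; the function name `estimate_cost` and the example token counts are made up for illustration, and it assumes cached tokens are a subset of the input tokens billed at the cached rate.

```python
# USD per million tokens, copied from the pricing table.
INPUT_RATE = 1.75
OUTPUT_RATE = 14.00
CACHED_INPUT_RATE = 0.175


def estimate_cost(input_tokens, output_tokens, cached_input_tokens=0):
    """Estimate the USD cost of one request.

    cached_input_tokens is the portion of input_tokens assumed to be
    billed at the cached rate instead of the full input rate.
    """
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    uncached = input_tokens - cached_input_tokens
    total = (
        uncached * INPUT_RATE
        + cached_input_tokens * CACHED_INPUT_RATE
        + output_tokens * OUTPUT_RATE
    ) / 1_000_000
    return round(total, 6)


# Hypothetical agentic turn: 200K input tokens, 150K of them cached,
# and 20K output tokens.
with_cache = estimate_cost(200_000, 20_000, cached_input_tokens=150_000)
without_cache = estimate_cost(200_000, 20_000)
```

For this made-up workload the estimate comes to about $0.39 with caching and $0.63 without, which shows why cache hits matter in long sessions that resend most of their context each turn.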
What GPT-5.2-Codex Means for Agentic Coding
The release reflects an industry-wide shift from code completion to sustained coding agents. OpenAI's Codex, Anthropic's Claude Code, and GitHub Agentic Workflows all target multi-step engineering tasks with minimal human intervention.
Original source
https://openai.com/index/introducing-gpt-5-2-codex/
