OpenAI Launches Codex Security to Find Vulnerabilities Before Hackers Do

OpenAI has launched Codex Security, an AI-powered application security agent now available in research preview for ChatGPT Pro, Enterprise, Business, and Edu customers. The tool uses OpenAI's frontier models to scan codebases, build threat models, identify vulnerabilities, and propose fixes. Usage is free for the first month.
What Codex Security Actually Does
Many AI security tools flood developers with low-confidence findings and false positives. Codex Security takes a different approach: it builds deep context about your specific project before scanning, creating an editable threat model that captures what your system does, what it trusts, and where it is most exposed.
The process has three steps. First, it analyzes your repository to understand its security-relevant structure and generates a project-specific threat model. Second, it searches for vulnerabilities, categorizes findings by real-world impact, and pressure-tests them in sandboxed environments. Third, it proposes fixes that align with your system's architecture, minimizing the risk of regressions.
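To make the idea of impact-based triage concrete, here is a minimal Python sketch of the first two steps: a threat model that records what a system trusts and where it is exposed, and a triage pass that ranks findings against it. Every name here (ThreatModel, Finding, rate, triage) and the rating rule itself are hypothetical illustrations. OpenAI has not published Codex Security's internals, and this is not its API.

```python
from dataclasses import dataclass, field


@dataclass
class ThreatModel:
    # Hypothetical fields mirroring "what the system does, what it trusts,
    # and where it is most exposed".
    purpose: str
    trusted_inputs: set[str] = field(default_factory=set)
    exposed_surfaces: set[str] = field(default_factory=set)


@dataclass
class Finding:
    title: str
    surface: str      # part of the system the finding touches
    validated: bool   # assumed flag: did the issue reproduce in a sandbox?
    severity: str = "unrated"


def rate(finding: Finding, model: ThreatModel) -> str:
    """Toy rule: severity depends on validation and on exposure in the model."""
    if not finding.validated:
        return "low"
    if finding.surface in model.exposed_surfaces:
        return "critical"
    return "medium"


def triage(findings: list[Finding], model: ThreatModel) -> list[Finding]:
    """Rate every finding, then sort so the most severe come first."""
    for f in findings:
        f.severity = rate(f, model)
    order = {"critical": 0, "medium": 1, "low": 2}
    return sorted(findings, key=lambda f: order[f.severity])


# Example usage with made-up data.
model = ThreatModel(
    purpose="multi-tenant webhook service",
    trusted_inputs={"internal_job"},
    exposed_surfaces={"public_api"},
)
ranked = triage(
    [
        Finding("Verbose error message", "admin_cli", validated=False),
        Finding("SSRF in webhook fetcher", "public_api", validated=True),
        Finding("Weak hash in batch job", "internal_job", validated=True),
    ],
    model,
)
for f in ranked:
    print(f.severity, "-", f.title)
```

The point of the sketch is the ordering: the same finding can rank differently depending on what the threat model says is exposed, which is why editing the model changes the results.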
The Numbers So Far
In its beta phase, Codex Security scanned more than 1.2 million commits across external repositories, identifying 792 critical findings and 10,561 high-severity findings. Critical issues appeared in under 0.1% of scanned commits. Since the initial rollout, the tool has cut noise by 84%, reduced findings with over-reported severity by more than 90%, and lowered false-positive rates by more than 50%.
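The "under 0.1%" figure can be sanity-checked from the counts above. The short Python calculation below treats 1.2 million as the commit count (the article says "over", so the true rates are slightly lower) and assumes at most one finding per commit, which the article does not state.

```python
# Figures quoted in the article; 1.2 million is a lower bound on commits scanned.
commits_scanned = 1_200_000
critical_findings = 792
high_findings = 10_561

# Findings per commit, assuming at most one finding per commit (an assumption).
critical_rate = critical_findings / commits_scanned
high_rate = high_findings / commits_scanned

print(f"critical: {critical_rate:.3%}")  # critical: 0.066%
print(f"high:     {high_rate:.3%}")      # high:     0.880%
```

At roughly 0.066%, the critical-finding rate is consistent with the reported "under 0.1%", and high-severity findings work out to a little under 1% of commits.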
In early internal deployments at OpenAI, it surfaced a real server-side request forgery (SSRF) vulnerability, a critical cross-tenant authentication flaw, and multiple other issues that the security team patched within hours.
Who's Using It
Early beta customers include NETGEAR, vLLM, and Raptive. The tool was known internally as "Aardvark" before being rebranded as Codex Security for the public launch.
Why This Matters
As AI coding agents accelerate software development, security review is becoming a critical bottleneck. Developers are shipping code faster than security teams can review it. Codex Security aims to close that gap by automating vulnerability discovery and triage, letting human security teams focus on the findings that actually matter.
The tool also learns from feedback: when you adjust the criticality of a finding, it refines its threat model and improves precision on subsequent runs.
The Bottom Line
OpenAI entering the application security space makes strategic sense: if AI is accelerating code production, it should also accelerate security review. The 84% noise reduction and the drop of more than 50% in false positives are meaningful metrics that address the most common complaint about existing security scanning tools. The real test will be whether it can catch the subtle, context-dependent vulnerabilities that are the hardest to find and the most dangerous to miss. Free for a month is a smart play: once security teams integrate it into their workflow, switching costs make it sticky.