GPT-5.6 Can Spot Security Holes But Can't Exploit Them Solo
OpenAI's new GPT-5.6 model family found vulnerabilities in testing but couldn't pull off full autonomous attacks on hardened systems.
OpenAI just dropped GPT-5.6, and with it some interesting security findings. The company's internal testing revealed that its new models could identify vulnerabilities in targets but failed to execute autonomous, end-to-end attacks against hardened systems.
The GPT-5.6 release is a three-model family. Sol sits at the top as the flagship. Terra offers strong capability at a lower price point. Luna rounds things out as the fastest and most cost-efficient option.
The security disclosure matters. It suggests these models are capable enough to spot weaknesses in systems, which is useful for defensive security work, but cannot yet chain together full attack sequences without human intervention. That's a meaningful distinction as AI safety debates intensify around frontier model capabilities.
By publishing these findings, OpenAI signals a continued commitment to transparency about what its most powerful models can and can't do.