Anthropic's Mythos Preview Crushes Coding Benchmarks, Launches Cybersecurity Push

Anthropic's unreleased Mythos Preview model hits 93.9% on SWE-bench Verified, obliterating Opus 4.6's 80.8% score.

Anthropic just posted some jaw-dropping benchmark numbers. Its unreleased Mythos Preview model scored 93.9% on SWE-bench Verified — a massive leap over Opus 4.6's 80.8%. On the tougher SWE-bench Pro, Mythos hit 77.8% compared to Opus 4.6's 53.4%. That's nearly a 25-point gap.

The new model debuted alongside Project Glasswing, a broad cybersecurity initiative pairing frontier AI with security applications. Details remain thin, but combining a model this capable with defensive cyber tooling signals Anthropic is betting big on enterprise security use cases.

The SWE-bench results position Mythos as a serious contender for best-in-class coding AI. Whether those benchmark gains translate to real-world reliability is the billion-dollar question — but the raw numbers are hard to argue with.