The specifics:
In addition to setting new benchmarks for tool use, reasoning, and problem-solving, Opus 4.5 is the first model to surpass 80% on the SWE-Bench Verified coding benchmark.
According to Anthropic, the model is the "most robustly aligned model" in terms of safety, matching or surpassing Google's Gemini 3 in a number of benchmarks.
Anthropic positioned the flagship model as a central coordinator for multi-agent systems by designing Opus to coordinate groups of smaller Haiku models.
Opus 4.5 costs 66% less than Opus 4.1 and delivers significant efficiency improvements over Anthropic's previous models.
Anthropic also released enhancements such as expanded access to Claude for Chrome and Excel, unlimited conversation lengths, and Claude Code on desktop.
Arriving just days after GPT 5.1 Pro and Gemini 3 joined the market, Opus 4.5 lands in a busy week for frontier AI, signaling the next advance in the race among frontier models. For Anthropic, which has frequently been criticized for Claude's cost relative to the market, the price drop is also a significant step.