Supported by Fastmail
Sponsor: Fastmail

Fast, private email hosting for you or your business. Try Fastmail free for up to 30 days.

Anthropic Releases Claude Opus 4.8

Anthropic:

We’re upgrading Claude Opus to a new version: Claude Opus 4.8. It builds on Opus 4.7 with improvements across benchmarks, and is a more effective collaborator. It’s available today for the same price.

Opus 4.8 launches alongside several new features. Users on claude.ai now have control over the amount of effort Claude puts into a task. Claude Code has a new “dynamic workflows” feature that allows it to tackle very large-scale problems. And fast mode for Opus 4.8—where the model can work at 2.5× the speed—is now three times cheaper than it was for previous models.

I almost didn’t link up this announcement, but then I saw this:

One of the most prominent improvements in Opus 4.8 is its honesty. We train all our models to be honest—for instance, to avoid making claims that they can’t support. But a general problem with AI models is that they sometimes jump to conclusions, confidently claiming to have made progress in their work despite the evidence being thin. Early testers report that Opus 4.8 is more likely to flag uncertainties about its work and less likely to make unsupported claims. This is borne out in our evaluations, which show that Opus 4.8 is around four times less likely than its predecessor to allow flaws in code it has written to pass unremarked.

Earlier this week I was using Claude Sonnet 4.6 to assist with a technical investigation, and it made a massive assumption from the jump before spitting out nearly 700 words of completely wrong analysis. I don’t know that Claude Opus 4.8 would have avoided jumping to conclusions and taking a massively wrong path, but it’s fascinating that this “confidence” gap is being called out.

⚙︎

Subscribe to JAG’s Workshop to get new posts by email, and follow JAG’s Workshop using RSS, Mastodon, Bluesky, or LinkedIn . You can also support the site with a one-time tip of any amount.