New Claude 4 AI model refactored code for 7 hours straight

On Thursday, Anthropic launched Claude Opus 4 and Claude Sonnet 4, marking the company’s return to larger model releases after primarily focusing on mid-range Sonnet variants since June of last year. The new models represent what the company calls its most capable coding models yet, with Opus 4 designed for complex, long-running tasks that can operate autonomously for hours.

Alex Albert, Anthropic’s head of Claude Relations, told Ars Technica that the company chose to revive the Opus line because of growing demand for agentic AI applications. “Across all the companies out there that are building things, there’s a really large wave of these agentic applications springing up, and a very high demand and premium being placed on intelligence,” Albert said. “I think Opus is going to fit that groove perfectly.”

Before we go further, a brief refresher on Claude’s three AI model “size” names (introduced in March 2024) might be warranted. Haiku, Sonnet, and Opus offer a tradeoff between price (in the API), speed, and capability.

Haiku models are the smallest, least expensive to run, and least capable in terms of what you might call “context depth” (considering conceptual relationships in the prompt) and encoded knowledge. Owing to their small parameter count, Haiku models retain fewer concrete facts and thus tend to confabulate more frequently (plausibly answering questions based on a lack of data) than larger models, but they are much faster at basic tasks. Sonnet is traditionally a mid-range model that strikes a balance between cost and capability, and Opus models have always been the largest and slowest to run. However, Opus models process context more deeply and are hypothetically better suited to running deep logical tasks.

A screenshot of the Claude web interface with Opus 4 and Sonnet 4 options shown. Credit: Anthropic

There is no Claude 4 Haiku just yet, but the new Sonnet and Opus models can reportedly handle tasks that previous versions could not. In our interview with Albert, he described testing scenarios where Opus 4 worked coherently for up to 24 hours on tasks like playing Pokémon, while coding refactoring tasks in Claude Code ran for seven hours without interruption. Earlier Claude models typically lasted only one to two hours before losing coherence, Albert said, meaning that the models could only produce useful self-referencing outputs for that long before beginning to output too many errors.

