Anthropic’s AI update can use a computer on its own!

lilxyzz

Travis

Posted on October 23, 2024

Anthropic’s AI update can use a computer on its own!

Anthropic’s New ‘Computer Use’ Feature for Claude AI Now Available for Developers: Boost Your Automation Capabilities

Anthropic has unveiled a cutting-edge feature in its latest Claude 3.5 Sonnet AI model, currently in public beta, that enables the AI to interact with a computer much like a human. This "computer use" feature allows Claude to move a cursor, click buttons, and type text simply by "looking" at the screen. Now available via the API, developers can use this capability to automate tasks on computers, as demonstrated on a Mac in the video below.

While competitors like Microsoft’s Copilot Vision, OpenAI’s desktop app for ChatGPT, and Google’s Gemini app on Android phones offer AI tools that can interpret your computer’s screen, none have yet released full-scale tools capable of performing tasks directly for users like this. Rabbit promised similar capabilities with its R1 model, though it has yet to deliver.

That said, Anthropic emphasizes that the "computer use" feature remains experimental and may be "cumbersome and error-prone." The company is releasing it early for developer feedback, with expectations for rapid refinement.

According to Anthropic:

There are many actions that people routinely do with computers (dragging, zooming, and so on) that Claude can’t yet attempt. The 'flipbook' nature of Claude’s view of the screen—taking screenshots and piecing them together, rather than observing a more granular video stream—means that it can miss short-lived actions or notifications.

Additionally, Claude has been programmed to avoid social media engagement, with “measures to monitor when Claude is asked to engage in election-related activity, as well as systems for nudging Claude away from activities like generating and posting content on social media, registering web domains, or interacting with government websites.”

Key Enhancements (Based on Multiple User Reports):

  • Significantly faster response generation

  • Enhanced reasoning with self-correction ("let me rethink this...")

  • Improved code generation and debugging

  • Analytical depth now closer to Claude Opus/o1-mini

  • More direct responses with less apologetic tone

  • New explicit warnings about potential hallucinations on obscure topics


Beyond this, the Claude 3.5 Sonnet model brings significant performance improvements, offered to customers at the same price and speed as its predecessor:

The updated Claude 3.5 Sonnet shows wide-ranging improvements on industry benchmarks, with particularly strong gains in agentic coding and tool use tasks. On coding, it improves performance on SWE-bench Verified from 33.4% to 49.0%, scoring higher than all publicly available models—including reasoning models like OpenAI o1-preview and specialized systems designed for agentic coding. It also improves performance on TAU-bench, an agentic tool use task, from 62.6% to 69.2% in the retail domain, and from 36.0% to 46.0% in the more challenging airline domain.

Anthropic’s AI Update

Image: From Anthropic

Important Notes:

  • Changes seem limited to the web interface; API users have reported no noticeable differences.

  • Some users have observed a reduction in context window sizes for free accounts.

  • There’s no official confirmation from Anthropic on these changes.

  • Certain IDE integrations (e.g., Cursor) are experiencing bugs.

  • User experiences vary between accounts.

These updates reflect Anthropic’s ongoing efforts to enhance Claude’s functionality, both in terms of AI-driven task automation and its coding and reasoning capabilities.

Before you go please consider supporting by giving a Hart, Share, or Follow!

💖 💪 🙅 🚩
lilxyzz
Travis

Posted on October 23, 2024

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related