[AI News] Anthropic Accelerates, ChatGPT’s New Search, Wharton’s AI Adoption Report, LLM Benchmarking, and Elon on AI Robots
From Text Generation to Task Execution: Anthropic’s Ambitious Advances with Claude 3.5
In the large language model landscape, two major players dominate the conversation: OpenAI and Anthropic. While OpenAI has held the spotlight with its expansive set of GPT models, and especially with the release of the o1 models, Anthropic has recently stepped up with a series of notable innovations.
One of the most impactful releases has been an update to Claude 3.5 Sonnet, Anthropic's flagship model. Claude 3.5 Sonnet has quickly become a strong contender, excelling not only in writing quality but also in code generation and agentic workflows. Its versatility marks a significant milestone in natural language processing, providing developers with sophisticated tools for generating code as well as supporting creative writing. This model offers a compelling alternative to OpenAI's offerings for those who need balanced strengths across different applications.
But Anthropic didn’t stop there. Alongside the updated Claude 3.5 Sonnet, they introduced an experimental feature, called computer use, that lets their models control a computer using mouse and keyboard commands. This level of interactivity pushes the boundaries of what large language models can do and opens the door to a wide range of automation and agentic possibilities.
Additionally, Anthropic has introduced a new feature that allows Claude to run the code it generates. The new “Analysis Tool” enables Claude to write and execute JavaScript code, similar to ChatGPT’s code interpreter, which uses Python. This tool is currently in preview, so users should expect some rough edges.
These recent innovations from Anthropic are not merely iterative updates—they represent a vision for what large language models can evolve into: multifaceted agents that extend beyond generating text into interactive functionality.
My only qualm with Anthropic is around naming. Not sure why they didn't go with 3.6 rather than updating Sonnet 3.5. 🤷
AI News: Quick Hits
OpenAI Unveils ChatGPT Search: OpenAI’s new ChatGPT Search feature integrates real-time web browsing, offering users web-sourced images, links, and tailored follow-up prompts.
Agentforce by Salesforce: Agentforce is now generally available, allowing businesses to build and deploy AI agents that can autonomously take action across any business function.
Google Q3 Earnings and AI Impact: In Q3 earnings, Sundar Pichai announced that more than a quarter of new code at Google is now generated by AI and then reviewed by engineers. Google Cloud also reported 35% year-over-year revenue growth, primarily driven by generative AI and infrastructure.
OpenAI’s Custom AI Chip Plans: OpenAI is developing its first custom semiconductor chip in collaboration with Broadcom and TSMC and is expanding its chip supplier base to include AMD. This development is part of a strategy to manage high compute costs.
Wharton Report on AI Adoption: A new Wharton report studying 800+ senior business leaders shows AI adoption is maturing but selective. While overall usage is up dramatically, companies are getting strategic—focusing on specific high-value tasks like data analysis, idea generation, and contract drafting rather than trying to transform everything at once.
Coinbase Launches Based Agent: Coinbase launched ‘Based Agent,’ a tool allowing users to create AI-powered crypto trading bots with on-chain capabilities in under three minutes using OpenAI and Replit integration.
Disney’s AI Initiative: Disney is reportedly preparing to unveil a major AI initiative focused on post-production and VFX workflows, marking the content giant’s first major embrace of the tech.
OpenAI’s System Two Thinking: At TED AI, OpenAI’s Noam Brown discussed “System Two Thinking,” an approach in which a model spends more time on deliberate reasoning rather than relying on sheer scale of data and compute. The model he described reportedly achieved 83% accuracy on complex tasks, and he argued that this kind of reasoning could reshape sectors like healthcare.
Apple’s Apple Intelligence Suite: Apple introduced its AI suite across iPhone, iPad, and Mac. The suite includes enhanced Siri capabilities, language processing, and photo search improvements, with a focus on privacy through on-device processing.
Meta’s Search Engine Plans: Meta is reportedly developing its own search infrastructure to reduce its dependence on Google, backed by new web-crawling tech, in response to increasing demand for real-time information across its platforms.
Elon Musk’s Prediction on Humanoid Robots: Elon Musk predicted that by 2040, there will be at least 10 billion humanoid robots priced between $20K and $25K.
GitHub’s Expanding AI Strategy: GitHub’s AI strategy now supports Anthropic’s Claude and Google’s Gemini models alongside GPT for Copilot. Additionally, GitHub has launched ‘Spark,’ a tool allowing users to create micro-apps using natural language.
Elon Musk’s xAI Fundraising: Musk’s xAI is reportedly seeking a new funding round that would value the startup at $40 billion, up from $24 billion following its raise in May.
Osmo Gives Computers a Sense of Smell with AI: Osmo announced a breakthrough in ‘scent teleportation,’ demonstrating AI’s ability to analyze, digitize, and reproduce the smell of a plum. The company says its proprietary system uses the world’s largest scent database. It plans to hold public demos and may release a limited-edition fragrance.
Content I Enjoyed
Check out this interview with the maintainers of the popular LLM Chatbot Arena. This is one of the best benchmarks we have for evaluating LLMs. Unlike standard benchmarks, which score models against fixed test sets, the Chatbot Arena has human evaluators compare the outputs of two LLMs side by side and vote for the better one. What’s cool is that all of this started as a way to solve their own problem of evaluating their custom LLMs.
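For the curious, here is a rough sketch of how head-to-head human votes can be turned into a leaderboard. It assumes a simple Elo-style update, which is my illustration and not necessarily how the Arena maintainers compute their rankings. The model names, starting rating, and K-factor are all made up.

```python
from collections import defaultdict


def update_elo(ratings, winner, loser, k=32.0):
    """Apply one Elo update for a single human vote (winner beat loser)."""
    # Probability the winner was expected to win, per the standard Elo curve.
    expected_win = 1.0 / (1.0 + 10 ** ((ratings[loser] - ratings[winner]) / 400.0))
    # The less expected the win, the more points change hands.
    delta = k * (1.0 - expected_win)
    ratings[winner] += delta
    ratings[loser] -= delta


def rank_models(votes, initial=1000.0):
    """Replay (winner, loser) votes in order and return models sorted by rating."""
    ratings = defaultdict(lambda: initial)  # every model starts at the same rating
    for winner, loser in votes:
        update_elo(ratings, winner, loser)
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


# Hypothetical votes: each tuple is (preferred model, other model).
votes = [
    ("model_a", "model_b"),
    ("model_a", "model_c"),
    ("model_b", "model_c"),
]
leaderboard = rank_models(votes)
for name, rating in leaderboard:
    print(f"{name}: {rating:.1f}")
```

The nice property of this kind of scheme is that nobody has to write a test set: rankings emerge from many small human preferences, and an upset win over a higher-rated model moves the ratings more than an expected one.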
AI Art & Humor
Happy Halloween in the age of GenAI!
Source: @andr3_ai
That’s all from me. Thank you, and I’ll chat with you next week!
-Manny