[AI News] OpenAI’s Latest Launches, Google Gemini 2.0 Takes Center Stage, and Compact Model Innovations from Microsoft & Cohere

Dec 15, 2024

What a week! OpenAI’s 12 Days of Christmas is in full swing, dropping a new feature every day leading up to the holiday break. This week’s announcements didn’t disappoint.

Sora: OpenAI’s long-anticipated video generation model is finally out.
Canvas: This editable document feature for writing and coding is a game-changer. Every demo I’ve done leaves people amazed.
Voice Mode + Vision: ChatGPT's Advanced Voice Mode has recently been upgraded with vision capabilities.
ChatGPT Projects: Finally, a way to organize chats by topic—a top user request that seamlessly integrates into ChatGPT.

Meanwhile, Google made waves of its own:

Quantum Breakthrough: Google has unveiled Willow, a quantum computing chip that completed a benchmark computation in under five minutes that would take one of today's fastest classical supercomputers an estimated 10 septillion (10^25) years.
Gemini 2.0: Google debuted its new LLM series with Gemini 2.0 Flash, a model that's twice as fast as Gemini 1.5 Pro. It offers advanced multimodal capabilities, including image and audio generation, plus real-time interaction through a new Multimodal Live API. (More on this below.)

With one week left in OpenAI’s 12 Days of Christmas, speculation about GPT-4.5 and other big updates is at an all-time high. It’s a fun time in AI—innovations are coming fast, and the competition is fierce. Can’t wait to see what’s next!

Now on to the rest of the newsletter!

-Manny

[GEMINI 2.0]

Google Unveils Gemini 2.0 Flash: Faster AI Model Brings Code Agents and Native Multimodal Features to Developers

The Recap: Google has announced Gemini 2.0 Flash Experimental, an upgraded AI model that builds on Gemini 1.5's success. The new version offers enhanced performance, new output capabilities, and developer-focused features like coding agents, marking a significant advancement in Google's AI offerings for developers.

Highlights:

Gemini 2.0 Flash is twice as fast as 1.5 Pro while achieving better performance across text, code, video, and spatial understanding benchmarks.
New output capabilities include multilingual text-to-speech with eight high-quality voices and native image generation with conversational editing.
The model features native tool use, including Google Search integration and code execution capabilities.
A new Multimodal Live API enables real-time applications with audio and video streaming inputs.
Jules, an experimental AI-powered code agent, is being launched to handle Python and JavaScript coding tasks through GitHub.
Colab is integrating new data science agent capabilities powered by Gemini 2.0 for automated notebook creation.
The technology will be integrated into developer platforms like Android Studio, Chrome DevTools, and Firebase.

Key Takeaway: Google's Gemini 2.0 Flash represents a significant step forward in making AI more accessible and powerful for developers. The addition of coding agents and improved multimodal capabilities suggests a future where AI becomes an increasingly integral part of the development workflow. While the technology shows promise in accelerating development and analysis tasks, its successful implementation will depend on developer adoption and real-world performance. The experimental phase will be crucial in determining how these tools reshape the development landscape. → Read the full announcement.

[AI AGENTS]

2024 State of AI Agents Survey Reveals Developer Preferences and Industry Trends

The Recap: Langbase's comprehensive survey of over 3,400 respondents from 100+ countries provides insights into AI agent development and deployment trends. With significant participation from C-level executives (46%), the survey reveals strong OpenAI dominance while highlighting emerging competitors and key challenges in AI implementation.

Highlights:

OpenAI leads LLM provider usage at 76%, followed by Google (59%) and Anthropic (47%), with Meta's Llama, Mistral, and Cohere holding smaller market shares.
Different providers show distinct strengths: OpenAI excels in tech and marketing, Google in health and translation, and Anthropic in technical tasks.
Developers strongly prefer flexible, composable primitives (76%) over prebuilt solutions (24%) for AI pipeline orchestration.
Top deployment challenges include scaling complexity, data privacy concerns, and lack of robust monitoring tools.
Multi-agent retrieval-augmented generation (RAG) capabilities emerge as the most critical infrastructure need.
The majority of developers are using AI for both experimentation and production purposes.
AI agent version control ranks as the most important feature when choosing development platforms.
Primary use cases concentrate in software development, marketing, IT operations, and text summarization.

Key Takeaway: The AI agent landscape is rapidly evolving with a clear preference for customizable, flexible solutions over rigid frameworks. While OpenAI maintains market leadership, the diversity of provider strengths suggests a maturing ecosystem. As companies move from experimentation to production, addressing deployment challenges and infrastructure needs will be crucial for widespread adoption in 2025 and beyond. → Read the full report here.

News Round Up 📰

Cohere Launches Command R7B: Fast and Efficient AI Model

Cohere has announced Command R7B, the smallest model in its R series, designed for building AI applications on standard GPUs and edge devices. The new model prioritizes speed, efficiency, and quality while maintaining strong performance. As part of Cohere's product lineup that includes Command, Embed, and Rerank capabilities, R7B aims to make advanced AI more accessible and practical for deployment on commodity hardware.

Anthropic Launches AI Usage Analysis Platform

Anthropic has introduced Clio (Claude insights and observations), a platform that analyzes millions of conversations with AI assistants while preserving user privacy. The system uses AI to identify usage patterns and potential risks without human review of raw conversations. Testing on 1 million Claude.ai conversations revealed key insights about real-world AI use, including high adoption of coding tasks and variations in usage across languages. The platform has already helped improve safety systems by detecting coordinated misuse attempts and monitoring AI behavior during critical events like elections.

Microsoft Launches Phi-4, New Small Language Model Excelling at Math

Microsoft has introduced Phi-4, a 14-billion-parameter small language model (SLM) that specializes in complex mathematical reasoning while maintaining strong general language capabilities. Available on Azure AI Foundry under a Microsoft Research License Agreement, Phi-4 outperforms larger models, including Gemini 1.5 Pro, on math competition problems. The model showcases advances in efficient AI development through high-quality synthetic datasets, curated organic data, and post-training innovations, while incorporating robust safety features including content filters and real-time monitoring capabilities.

AI Industry Faces Looming Training Data Shortage by 2028

AI researchers warn that large language models are rapidly approaching the limits of available internet text data for training, with projections showing demand will match total supply by 2028. While tech giants like OpenAI and Anthropic seek workarounds through synthetic data generation and private partnerships, the shortage could force a shift from large general-purpose models to smaller specialized ones. The crisis is compounded by increasing data access restrictions from publishers and pending copyright lawsuits, leading researchers to explore alternatives like multimodal training, proprietary datasets, and more efficient training methods.

Quick Hits ⚡

Harvard’s Dataset Aims to Democratize AI Development: Harvard, supported by Microsoft and OpenAI, releases a public-domain dataset to assist smaller AI developers and promote ethical training resources.
Apple Unveils Image Playground, Genmoji, & ChatGPT Integration: iOS 18.2 introduces new AI tools for image creation and personalized emojis, with an emphasis on on-device privacy.
BRICS Alliance to Rival US in AI: Russia partners with BRICS nations to challenge U.S. dominance, emphasizing AI regulation and market expansion.
Palantir's AI Warp Speed Revamps U.S. Manufacturing: Warp Speed program accelerates production with AI tools and high-profile industry partnerships.
Apple & Broadcom Team Up to Challenge Nvidia: The two are developing Baltra, a new AI server chip, to challenge Nvidia's dominance of the AI chip market by 2026.
AI Mammograms Boost Breast Cancer Detection by 21%: AI aids radiologists in detecting breast cancer, but faces cost and insurance hurdles.
Trump Names David Sacks AI & Crypto Czar: Ex-PayPal COO to lead AI and crypto policy under the Trump administration.
Pentagon Launches AI Office for Military: New AI Rapid Capabilities Cell receives $100 million to accelerate AI military system integration.
China Investigates Nvidia for Anti-Monopoly Violations: Probe examines Nvidia's market influence, highlighting global scrutiny of its business practices and dominance in the GPU market.
Cognition Labs Launches Devin AI Developer Assistant: Devin streamlines development workflows with Slack, GitHub, and IDE integrations for tasks like bug fixes, PR creation, and code refactoring.
Replit Launches Upgraded AI Development Suite: Introducing 'Assistant' and other tools, focusing on project improvements, flexible outputs, and integrated infrastructure, available through a subscription model.
Yelp Unveils New AI Features: LLM-powered Review Insights, AI-optimized ads, and upgraded chatbots to better connect users with services.
AI Voices Misused in Russian Disinformation Campaign: “Operation Undercut” exploits ElevenLabs’ AI voices to spread fake news targeting European support for Ukraine.

Content I Liked 👀

Marc Benioff’s decision to pivot Salesforce toward AI agents demonstrates bold, visionary leadership. By leveraging Agentforce to boost engineering productivity by 30% and reduce support roles with digital labor, he’s positioning Salesforce at the forefront of enterprise AI.

His rapid shift to focus entirely on Agentforce ahead of Dreamforce underscores the importance of adaptability and innovation. Benioff’s belief in maintaining a “beginner’s mind” and embracing transformative trends is a masterclass in leading through disruption. It's an inspiring move that showcases how AI can drive both efficiency and value in business. → Watch the full interview here.

AI Fun 🎅

Have a video chat with Santa. Turn on your camera and mic and give it a try. Available in 30 languages.

That’s all from me. See you next week!