Anthropic's new Claude vs OpenAI's GPT-4.5

March 3, 2025

what to know for now

🧠 Anthropic’s new Claude improves reasoning. Claude 3.7 Sonnet introduces "hybrid reasoning," with particularly strong performance in coding, math, and finance. Alongside the model, Anthropic launched a research preview of Claude Code, an AI coding assistant capable of editing files, running tests, and pushing code to GitHub. Read more

🤖 GPT-4.5 prioritizes “vibes” over performance. OpenAI’s GPT-4.5 improves conversational flow, reduces hallucinations, and expands world knowledge. While many users claim the model “feels” more intelligent, GPT-4.5 falls short of newer foundation models on benchmarks. Pro users have immediate access, with wider availability coming next week. Read more

🗣 Amazon launches AI-powered Alexa+. Amazon introduced Alexa+, an AI-enhanced version of its voice assistant, designed for more natural conversations and expanded functionality, including home management, reservations, and product ordering. The service, free for Prime members and $19.99/month otherwise, will roll out in phases, prioritizing select Echo devices. Read more

🩺 Microsoft launches AI clinical assistant. Dragon Copilot integrates voice dictation, ambient listening, and generative AI to streamline documentation and automate tasks for clinicians. The tool unifies Microsoft’s healthcare AI products, allowing doctors to dictate notes, generate referral letters, and access medical records and external sources like the CDC. Read more

🧪 AI Research of the Week

LLM-Microscope
From AIRI, others
Jake’s Take: This paper introduces LLM-Microscope, a toolkit designed to analyze how Large Language Models encode and store contextual information. The study challenges the assumption that stopwords and punctuation are insignificant, showing that these elements play a critical role in maintaining context. Removing such "filler" tokens significantly degrades model performance on tasks requiring long-context understanding.
This work is a reminder that modern LLM behavior is still a bit of a black box. Treating "trivial" tokens (like periods and other punctuation) as disposable is a mistake that can lessen the effectiveness of prompts.

what to know for later

🧠 Siri's AI evolution delayed. Apple planned a major AI-driven Siri upgrade, but internal sources say a truly conversational Siri may not arrive until iOS 20 in 2027. The upcoming iOS 18.5 update will include an LLM-powered Siri, but it will operate separately from the current version. Apple faces challenges in merging these models, securing AI training hardware, and retaining talent amid leadership struggles and competition. Read more

🗣 AI voice too real for comfort. Sesame’s new AI assistant is designed to mimic human conversation, and it does so with eerie accuracy. One journalist reported that the AI’s voice unexpectedly resembled an old friend, making interactions unsettlingly personal. The model raises ongoing ethical concerns about emotional manipulation and deepfake potential. Read more

💰 DeepSeek claims 545% AI profit margin. The company reported a theoretical 545% margin over a 24-hour inference run, with revenue estimates reaching $562,027 against $87,072 in operational costs. This was based on processing 608 billion input tokens and 168 billion output tokens using its V3 and R1 models. However, actual revenue may be lower due to discounted pricing and free services. Read more
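
For anyone checking the math, here is a quick sketch of how the 545% figure falls out of the two dollar amounts above. It assumes the "margin" is profit divided by cost (a markup over cost) rather than profit divided by revenue; the function and variable names are ours for illustration, not DeepSeek's.

```python
# Figures quoted in the item above (theoretical, over one 24-hour period).
revenue_usd = 562_027
cost_usd = 87_072


def profit_over_cost_pct(revenue: float, cost: float) -> float:
    """Profit as a percentage of cost (assumed definition of the 545% figure)."""
    return (revenue - cost) / cost * 100


def profit_over_revenue_pct(revenue: float, cost: float) -> float:
    """Profit as a percentage of revenue (the more common 'margin' definition)."""
    return (revenue - cost) / revenue * 100


over_cost = profit_over_cost_pct(revenue_usd, cost_usd)
over_revenue = profit_over_revenue_pct(revenue_usd, cost_usd)

print(f"Profit over cost:    {over_cost:.0f}%")     # about 545%
print(f"Profit over revenue: {over_revenue:.1f}%")  # about 84.5%
```

Under that assumption the headline number is reproduced, while the same figures give roughly 84.5% when measured against revenue, which is worth keeping in mind when comparing it to conventional profit margins.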

📱 Honor commits $10B to AI devices. Chinese smartphone maker Honor will invest $10 billion in AI development over five years, expanding beyond smartphones into AI-powered PCs, tablets, and wearables. The move aligns with its preparations for a public listing and comes amid intense AI investment in China, particularly around DeepSeek’s low-cost language models. Read more