The AI industry lurches forward with GPT-5.5, GPT Image 2, and DeepSeek V4; plus, an Anthropic Mythos leak

🤔 “Should I even bother trying out DeepSeek, or stick with ChatGPT?” Share Handy AI with your coworkers and friends to help them understand the crazy world of modern artificial intelligence (and save you some time).
what to know for now
🧠 GPT-5.5 takes the intelligence crown and the hallucination one too. OpenAI shipped GPT-5.5 to ChatGPT and Codex with default, Thinking, and Pro variants, a 400K context window inside Codex, and roughly 40% better token efficiency. The API followed on April 24 at $5 in / $30 out per million tokens, with GPT-5.5 Pro at $30 / $180. Artificial Analysis ranked it #1 by 3 points on the Intelligence Index, but it also clocks an 86% hallucination rate against Claude Opus 4.7’s 36%. Confident, smart, and willing to lie to your face.
🛡️ A group accessed Anthropic’s Mythos through a contractor. A third-party contractor used legitimate vendor credentials to breach the protected environment around Mythos, Anthropic’s restricted cybersecurity model, and opened access to a small group of colleagues who proceeded to use it to build websites. Anthropic told TechCrunch it’s investigating but maintains its own systems weren’t impacted. Read more
🐳 DeepSeek V4 lands and resets the open-weight ceiling again. DeepSeek dropped a preview of V4 on April 24: V4-Pro at 1.6T total / 49B active parameters, V4-Flash at 284B / 13B, both with a 1M context window and dual Thinking / Non-Thinking modes. It runs marginally behind GPT-5.4 and Gemini 3.1 Pro, costs almost nothing, and the weights are already on Hugging Face with native Huawei chip support.
🇨🇳 The White House is calling Chinese distillation “industrial-scale.” OSTP issued a memo titled “Adversarial Distillation of American AI Models” (NSTM-4), accusing DeepSeek, Moonshot, and MiniMax of running 24,000 fraudulent accounts to extract roughly 16 million interactions from Claude. The State Department followed on April 25 with a global directive warning allies. Distillation has been an open secret for two years. Read more
🖼️ GPT Image 2 ships the best text-rendering image model and the best disinformation engine in the same release. OpenAI shipped Images 2.0 on April 21 with 2K native resolution, 19-out-of-20 legible text on the first try, and big multilingual jumps in Japanese, Korean, Chinese, Hindi, and Bengali. It claimed the #1 spot across every Image Arena category within 12 hours, by the largest margin ever recorded on the leaderboard. It’s excellent. That’s the problem. Read more
💸 Big Tech is now spending $226,000 a day lobbying Congress. Issue One’s Q1 2026 analysis pegs combined lobbying spend from Alphabet, Meta, Microsoft, Nvidia, Anthropic, OpenAI, and four others at $20 million in 90 days, with Meta alone burning $7.1 million ($80K/day). Anthropic quadrupled its lobbying year over year to $1.56 million. OpenAI nearly doubled to $1.02 million. The 307 lobbyists they collectively employed in Q1 outnumber every state’s congressional delegation except California’s. Read more
🤖 ChatGPT Codex update + workspace agents. Codex got a major upgrade alongside GPT-5.5: GPT-5.5 inside the editor with a 400K context, multi-agent v2 with sub-agents addressed at paths like /root/agent_a, and structured inter-agent messaging. OpenAI also rolled out ChatGPT workspace agents, which handle long-running workflows across tools for teams (free through May 6). Read more
🔬 Google launched two fully autonomous Deep Research agents. Google shipped Deep Research and Deep Research Max on April 21, both running on Gemini 3.1 Pro through the Gemini API. They fuse open web data with private enterprise data in a single call, generate native charts and infographics, and pull in arbitrary third-party sources via MCP. Gemini’s research mode was already the best one out there. Read more
🎨 Claude Design turned prompts into design files, and Figma’s stock fell 7%. Anthropic launched Claude Design on April 17, an Anthropic Labs product powered by Opus 4.7 that produces editable visual work, prototypes, and pitch decks from a conversation. It reads your codebase and design files during onboarding to build a design system, then exports to Canva, PDF, PPTX, or standalone HTML. Anthropic is betting the way to hijack design tooling isn’t to clone Figma. It’s to skip it. Read more
🎭 Claude Live Artifacts make Cowork dashboards stay alive. Anthropic shipped Live Artifacts inside the Cowork orchestration layer: persistent, data-connected dashboards and trackers that refresh from their source connectors every time you open them. A request that used to need a data engineer and a sprint can now start as a prompt and end as a dashboard you reuse for months. Read more
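For anyone pricing out the GPT-5.5 API, here is a rough back-of-the-envelope sketch in Python using the per-million-token prices quoted in the GPT-5.5 item above. The model labels ("gpt-5.5", "gpt-5.5-pro") and the function name are made up for illustration and are not official API identifiers, and the token counts are invented. Treat it as arithmetic, not billing advice.

```python
# Prices quoted above, in USD per million tokens: (input, output).
# The dictionary keys are hypothetical labels, not official model IDs.
PRICES = {
    "gpt-5.5": (5.00, 30.00),
    "gpt-5.5-pro": (30.00, 180.00),
}


def request_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of one request at the quoted list prices."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Example: a long prompt (200K tokens in) with a short answer (10K tokens out).
for name in PRICES:
    print(f"{name}: ${request_cost(name, 200_000, 10_000):.2f}")
```

At these prices, output tokens cost six times as much as input tokens on both tiers, so long answers move the bill faster than long prompts.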
🧪 AI Research of the Week
Spatial Metaphors for LLM Memory: A Critical Analysis of the MemPalace Architecture
OpenHub Research
Jake’s Take: MemPalace is an open-source long-term memory system for AI agents that organizes a chat history the way medieval monks organized facts (literally): as a memory palace, with people and projects as wings, topics as rooms, and conversation snippets as drawers. It blew up over the past two weeks (47,000 GitHub stars in 14 days) and posts 96.6% Recall@5 on the LongMemEval benchmark while needing zero LLM calls to write to memory. This paper is the first independent audit.
The authors replicate the benchmarks, decompose the system, and conclude that most of MemPalace’s recall win comes from storing conversations verbatim and pairing them with a stock embedding model, not from the spatial metaphor itself.
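To make "verbatim storage plus an embedding model" and "Recall@5" concrete, here is a minimal Python sketch. It is not MemPalace's code or the paper's evaluation harness. The `embed` function is a toy bag-of-words stand-in for a real embedding model, the data is invented, and Recall@k is taken here to mean the share of questions whose relevant snippet shows up in the top k results, which is an assumption about how the benchmark scores it.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, snippets, k=5):
    """Return the indices of the k stored snippets most similar to the query."""
    q = embed(query)
    ranked = sorted(
        range(len(snippets)),
        key=lambda i: cosine(q, embed(snippets[i])),
        reverse=True,
    )
    return ranked[:k]


def recall_at_k(queries, snippets, k=5):
    """Share of (query, gold_index) pairs whose gold snippet is in the top k."""
    hits = sum(1 for query, gold in queries if gold in top_k(query, snippets, k))
    return hits / len(queries)


# Snippets are stored verbatim: no summarizing, no LLM call on write.
snippets = [
    "we picked postgres for the billing service",
    "alice prefers morning standups",
    "the launch moved to june",
]
queries = [
    ("which database did we pick for billing", 0),
    ("when is the launch", 2),
]
print(recall_at_k(queries, snippets, k=1))
```

Notice there are no wings, rooms, or drawers anywhere in the sketch. That is the audit's point: a flat list of verbatim snippets plus similarity search already does most of the retrieval work.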
what to know for later
🚀 SpaceX bought a $10B collaboration with Cursor and an option to acquire it for $60B. SpaceX announced that it’s paying Cursor $10 billion to develop coding and knowledge-work AI on the Colossus supercomputer, with a $60 billion buyout option that triggers after SpaceX’s summer IPO. The offer preempted Cursor’s in-progress $2 billion fundraise, and Microsoft was reportedly looking at the same target. Read more