New super-powered Claude model "Mythos" leaks as OpenAI pares back


🤔 “How do I explain to my mom that AI is involved in global warfare policy?” Share Handy AI with your family and friends to help them understand the crazy world of modern artificial intelligence (and save you some time).


last week’s top stories

🔐 Anthropic’s misconfigured CMS exposes Claude Mythos, its most powerful model yet. Security researchers Roy Paz (LayerX) and Alexandre Pauwels (Cambridge) found ~3,000 Anthropic internal files publicly accessible on March 27, including draft blog posts for Claude Mythos (codename: Capybara), a new tier above Opus that Anthropic confirmed scores dramatically higher on coding, reasoning, and cybersecurity. The company’s own documents describe it as “far ahead of any other AI model in cyber capabilities.” Read more

💀 OpenAI shuts down Sora and cancels a $1 billion Disney licensing deal. On March 24, OpenAI announced it was discontinuing the Sora app, Sora.com, and the Sora API, citing unsustainable inference costs estimated at $15M per day against $2.1M in lifetime in-app revenue. Downloads had fallen 66% from their November 2025 peak. Disney, which had committed $1B to a Sora licensing partnership in December, learned of the shutdown less than an hour before the public announcement.

🔞 OpenAI shelves ChatGPT Adult Mode over dataset and legal risk. OpenAI indefinitely paused its planned adult content mode for ChatGPT in early March, with the Financial Times reporting the delay stems from challenges around sexual datasets and eliminating illegal content from training pipelines. The company had teased Adult Mode in late 2025 alongside the Sora launch as part of a broader push toward consumer social products. Sora is now discontinued and Adult Mode is on hold. Read more

⏱️ Anthropic tightens Claude peak-hour sessions; weekly limits stay unchanged. On March 26, engineer Thariq Shihipar confirmed 5-hour session windows burn faster during weekday peak hours (5–11am PT), affecting roughly 7% of Pro and Max subscribers. The adjustment follows a 30%-plus surge in web traffic and a 295% spike in ChatGPT uninstalls. Read more

🦙 Meta’s Llama successor Avocado is reportedly pivoting to a closed-source commercial model. Internal Meta documents and CNBC reporting indicate the next flagship LLM, codenamed Avocado, will ship as a closed-source model under Meta Superintelligence Labs led by Chief AI Officer Alexandr Wang, abandoning the open-weight tradition that defined the Llama series. The pivot follows Llama 4’s benchmark manipulation scandal and concerns that open weights accelerated Chinese lab capabilities. Meta spent three years evangelizing open AI, and that position is now being quietly retired. Read more

🏗️ Meta raises El Paso AI data center commitment from $1.5B to $10B. On March 26, Meta boosted its El Paso, Texas campus commitment more than sixfold, targeting 1 gigawatt of capacity by 2028 with liquid cooling and 5,000 megawatts of clean energy contracted in Texas. The facility is its third in the state and its 29th globally, arriving as Big Tech collectively prepares to spend $630B on AI infrastructure this year. Read more

📖 Wikipedia editors vote 44-2 to ban LLM-generated article content. On March 20, English Wikipedia’s editorial community formally prohibited using large language models to generate or rewrite article content, tightening earlier, vaguer guidance that only barred generating entirely new articles. Two narrow exceptions remain: AI-assisted copyediting of an editor’s own writing (with human review) and draft translation (with fluency verification). Read more

🎮 ARC-AGI-3 launches the week Jensen Huang declares AGI achieved. The ARC Prize Foundation released ARC-AGI-3 on March 25 at Y Combinator HQ, a benchmark of 135 handcrafted turn-based game environments requiring on-the-fly learning with no instructions or stated goals. Humans solve 100% of environments. Frontier models topped out at 0.37% (Gemini 3.1 Pro), 0.26% (GPT-5.4), and 0.25% (Claude Opus 4.6). This dropped the same week Nvidia CEO Jensen Huang told Lex Fridman “I think we’ve achieved AGI.” Read more


🧪 AI Research of the Week

Detection Is Cheap, Routing Is Learned: Why Refusal-Based Alignment Evaluation Fails
From Gregory N. Frank, Independent Researcher

Gregory N. Frank claims that the way researchers test whether AI models are "safe" is largely useless. The standard approach is to ask a model something it shouldn't answer, and check if it refuses. Frank’s paper argues that refusal is just one of several ways a model can suppress information, and modern models increasingly don't bother with it.

Frank studied nine Chinese-origin AI models across political censorship tasks and found every model could detect sensitive topics perfectly, as expected. The important part happens after this detection, when the model decides what to do with what it found. Some models give you a flat refusal. Others quietly steer the answer in a safe direction without ever telling you they're doing it (same question, no refusal, different output).

Frank’s ablation experiments, which involve surgically removing specific directions from a model's internal activations, showed that in most models you can toggle censorship on and off like a switch without touching the model's actual knowledge. The routing layer between "I know this is sensitive" and "here's what I say about it" is where alignment actually lives, and it seems as though no current safety benchmarks measure it at all.
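For readers curious what "removing a direction" means in practice, here is a minimal sketch of directional ablation on a single activation vector. This is not Frank's code: the function names, the toy 4-dimensional activation, and the "censorship direction" are all made up for illustration, and real experiments would apply the same projection to a model's hidden states at specific layers.

```python
import math

def unit(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in v]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def ablate_direction(activation, direction):
    """Subtract the component of `activation` that lies along `direction`.

    Everything orthogonal to the direction is left untouched, which is why
    this kind of edit can change behavior without erasing other information.
    """
    r = unit(direction)
    coeff = dot(activation, r)  # how strongly the activation points along r
    return [h - coeff * ri for h, ri in zip(activation, r)]

# Hypothetical toy values, chosen only to make the arithmetic easy to follow.
activation = [2.0, -1.0, 0.5, 3.0]
censor_direction = [1.0, 0.0, 0.0, 1.0]

ablated = ablate_direction(activation, censor_direction)
print(ablated)
```

After the projection, the ablated vector has no component along the chosen direction, while the second and third coordinates (which the direction never touched) are unchanged. That is the "switch" intuition in miniature: one direction is zeroed out and the rest of the representation stays put.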


and then, even more news…

🔥 Dario Amodei escalates war with OpenAI, calling Brockman donation “evil” and Pentagon deal “safety theater.” In a leaked 1,600-word Slack memo from late February and subsequent internal messages, Anthropic CEO Dario Amodei called Greg Brockman’s $25M MAGA Inc donation “evil,” described OpenAI’s Pentagon contract as “20% real and 80% safety theater,” and compared OpenAI to “tobacco companies selling products they know are harmful.” Read more
