OpenAI and Google prioritize reasoning, Trump's AI cabinet takes shape

December 23, 2024

what to know for now

🧠 OpenAI previews o3 models. The o3 series includes o3 and o3-mini, designed for advanced reasoning in coding, mathematics, and NLP. Early access is open for public safety testing, with o3-mini slated for public release in January 2025. Read more:

Why OpenAI's new o3 model is a big deal, by Jake Handy (December 20, 2024)

🚢 Latest “Shipmas” updates. OpenAI introduced enhanced developer tools, including Realtime API improvements and a new fine-tuning method. ChatGPT is now reachable by phone (1-800-CHATGPT) and integrates with multiple applications. Read more

💻 GitHub launches Copilot Free. Integrated into VS Code, Copilot Free provides 2,000 code completions and 50 chat messages each month for users signed into GitHub accounts. Users can choose between Anthropic’s Claude 3.5 Sonnet or OpenAI’s GPT-4o models to assist with coding questions, code explanations, bug detection, and multi-file edits. Read more

🔍 Google enhances Gemini AI reasoning. The Gemini 2.0 Flash Thinking model, available via AI Studio, tackles complex tasks in math, physics, and programming. It competes with OpenAI's reasoning models but requires significant computational resources. Read more

🧪 AI Research of the Week

Alignment faking in large language models
From Anthropic
Jake’s Take: Anthropic demonstrates that LLMs (like their own Claude 3 Opus) can engage in "alignment faking," selectively complying with training objectives during training to avoid behavior modifications that would persist outside of training (!). The authors discuss how alignment faking emerges from models' reasoning about their situation and goals, presenting challenges for ensuring robust AI alignment.
This research underscores a fairly urgent need for more robust safeguards and interpretability in AI alignment methodologies, as current practices might fail to detect strategic rule-following or hidden goals in advanced models.

what to know for later

🧠 Krishnan appointed AI advisor. Sriram Krishnan joins the White House Office of Science and Technology Policy to shape and coordinate AI policy across government. He will collaborate with David Sacks on AI initiatives. Read more

🏛️ Sacks' czar position modified. David Sacks' role as AI and Crypto Czar has been redefined as an advisory position under Michael Kratsios, following conflict-of-interest concerns. He continues to hold significant influence within the incoming Trump administration. Read more

🕵️‍♂️ FBI, DEA AI integration faces scrutiny. A DOJ audit highlights ethical dilemmas and regulatory gaps in the FBI's and DEA's use of AI technologies, including facial recognition. A lack of transparency and inadequate oversight mechanisms raise significant privacy and civil rights issues. Read more

💻 AI generates malware variants. Palo Alto Networks' Unit 42 used large language models to iteratively transform JavaScript malware through techniques like variable renaming and code obfuscation, creating over 10,000 variants. The rewritten samples evade detection in 88% of cases and bypass platforms like VirusTotal. Read more