Episode | Date |
---|---|
“The case for countermeasures to memetic spread of misaligned values” by Alex Mallen | May 28, 2025 |
“AIs at the current capability level may be important for future safety work” by Ryan Greenblatt | May 12, 2025 |
“Misalignment and Strategic Underperformance: An Analysis of Sandbagging and Exploration Hacking” by Julian Stastny, Buck Shlegeris | May 08, 2025 |
“Training-time schemers vs behavioral schemers” by Alex Mallen | May 06, 2025 |
“What’s going on with AI progress and trends? (As of 5/2025)” by Ryan Greenblatt | May 03, 2025 |
“How can we solve diffuse threats like research sabotage with AI control?” by Vivek Hebbar | Apr 30, 2025 |
“7+ tractable directions in AI control” by Ryan Greenblatt | Apr 29, 2025 |
“Clarifying AI R&D threat models” by Josh Clymer | Apr 25, 2025 |
“How training-gamers might function (and win)” by Vivek Hebbar | Apr 24, 2025 |
“Handling schemers if shutdown is not an option” by Buck Shlegeris | Apr 18, 2025 |
“Ctrl-Z: Controlling AI Agents via Resampling” by Buck Shlegeris | Apr 16, 2025 |
“To be legible, evidence of misalignment probably has to be behavioral” by Ryan Greenblatt | Apr 15, 2025 |
“Why do misalignment risks increase as AIs get more capable?” by Ryan Greenblatt | Apr 11, 2025 |
“An overview of areas of control work” by Ryan Greenblatt | Apr 09, 2025 |
“An overview of control measures” by Ryan Greenblatt | Apr 06, 2025 |
“Buck on the 80,000 Hours podcast” by Buck Shlegeris | Apr 05, 2025 |
“Notes on countermeasures for exploration hacking (aka sandbagging)” by Ryan Greenblatt | Apr 04, 2025 |
“Notes on handling non-concentrated failures with AI control: high level methods and different regimes” by Ryan Greenblatt | Apr 03, 2025 |
“Prioritizing threats for AI control” by Ryan Greenblatt | Mar 19, 2025 |
“How might we safely pass the buck to AI?” by Josh Clymer | Feb 19, 2025 |
“Takeaways from sketching a control safety case” by Josh Clymer | Jan 30, 2025 |
“Planning for Extreme AI Risks” by Josh Clymer | Jan 29, 2025 |
“Ten people on the inside” by Buck Shlegeris | Jan 28, 2025 |
“When does capability elicitation bound risk?” by Josh Clymer | Jan 22, 2025 |
“How will we update about scheming?” by Ryan Greenblatt | Jan 19, 2025 |
“Thoughts on the conservative assumptions in AI control” by Buck Shlegeris | Jan 17, 2025 |
“Extending control evaluations to non-scheming threats” by Josh Clymer | Jan 13, 2025 |
“Measuring whether AIs can statelessly strategize to subvert security measures” by Buck Shlegeris, Alex Mallen | Dec 20, 2024 |
“Alignment Faking in Large Language Models” by Ryan Greenblatt, Buck Shlegeris | Dec 18, 2024 |
“Why imperfect adversarial robustness doesn’t doom AI control” by Buck Shlegeris | Nov 18, 2024 |
“Win/continue/lose scenarios and execute/replace/audit protocols” by Buck Shlegeris | Nov 15, 2024 |
“Behavioral red-teaming is unlikely to produce clear, strong evidence that models aren’t scheming” by Buck Shlegeris | Oct 10, 2024 |
“A basic systems architecture for AI agents that do autonomous research” by Buck Shlegeris | Sep 26, 2024 |
“How to prevent collusion when using untrusted models to monitor each other” by Buck Shlegeris | Sep 25, 2024 |
“Would catching your AIs trying to escape convince AI developers to slow down or undeploy?” by Buck Shlegeris | Aug 26, 2024 |
“Fields that I reference when thinking about AI takeover prevention” by Buck Shlegeris | Aug 13, 2024 |
“Getting 50% (SoTA) on ARC-AGI with GPT-4o” by Ryan Greenblatt | Jun 17, 2024 |