The Nonlinear Library: Alignment Forum Top Posts Podcast Republic

The Nonlinear Library: Alignment Forum Top Posts

By The Nonlinear Fund

Listen to a podcast, please open Podcast Republic app. Available on Google Play Store and Apple App Store.

Image by The Nonlinear Fund

Category: Education

Open in Apple Podcasts

Open RSS feed

Open Website

Rate for this podcast

Subscribers: 0
Reviews: 0
Episodes: 242

Description

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.

Episode	Date
Discussion with Eliezer Yudkowsky on AGI interventions by Rob Bensinger, Eliezer Yudkowsky Read the full episode description	Dec 10, 2021
What failure looks like by Paul Christiano Read the full episode description	Dec 10, 2021
The Parable of Predict-O-Matic by Abram Demski Read the full episode description	Dec 10, 2021
What 2026 looks like by Daniel Kokotajlo Read the full episode description	Dec 10, 2021
Are we in an AI overhang? by Andy Jones Read the full episode description	Dec 10, 2021
DeepMind: Generally capable agents emerge from open-ended play by Daniel Kokotajlo Read the full episode description	Dec 10, 2021
Alignment Research Field Guide by Abram Demski Read the full episode description	Dec 10, 2021
Hiring engineers and researchers to help align GPT-3 by Paul Christiano Read the full episode description	Dec 10, 2021
2018 AI Alignment Literature Review and Charity Comparison by Larks Read the full episode description	Dec 10, 2021
Another (outer) alignment failure story by Paul Christiano Read the full episode description	Dec 10, 2021
Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More by Ben Pace Read the full episode description	Dec 10, 2021
Some AI research areas and their relevance to existential safety by Andrew Critch Read the full episode description	Dec 10, 2021
Announcing the Alignment Research Center by Paul Christiano Read the full episode description	Dec 10, 2021
The Rocket Alignment Problem by Eliezer Yudkowsky Read the full episode description	Dec 10, 2021
The case for aligning narrowly superhuman models by Ajeya Cotra Read the full episode description	Dec 10, 2021
Realism about rationality by Richard Ngo Read the full episode description	Dec 10, 2021
Birds, Brains, Planes, and AI: Against Appeals to the Complexity/Mysteriousness/Efficiency of the Brain by Daniel Kokotajlo Read the full episode description	Dec 10, 2021
Goodhart Taxonomy by Scott Garrabrant Read the full episode description	Dec 10, 2021
The ground of optimization by Alex Flint Read the full episode description	Dec 10, 2021
An overview of 11 proposals for building safe advanced AI by Evan Hubinger Read the full episode description	Dec 10, 2021
Chris Olah’s views on AGI safety by Evan Hubinger Read the full episode description	Dec 10, 2021
Draft report on AI timelines by Ajeya Cotra Read the full episode description	Dec 10, 2021
An Untrollable Mathematician Illustrated by Abram Demski Read the full episode description	Dec 10, 2021
Radical Probabilism by Abram Demski Read the full episode description	Dec 10, 2021
What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs) by Andrew Critch Read the full episode description	Dec 10, 2021
Utility Maximization = Description Length Minimization by johnswentworth Read the full episode description	Dec 10, 2021
Risks from Learned Optimization: Introduction by Evan Hubinger, Chris van Merwijk, Vladimir Mikulik, Joar Skalse, Scott Garrabrant Read the full episode description	Dec 10, 2021
Matt Botvinick on the spontaneous emergence of learning algorithms by Adam Scholl Read the full episode description	Dec 10, 2021
the scaling "inconsistency": openAI’s new insight by nostalgebraist Read the full episode description	Dec 10, 2021
Introduction to Cartesian Frames by Scott Garrabrant Read the full episode description	Dec 10, 2021
My research methodology by Paul Christiano Read the full episode description	Dec 10, 2021
Fun with +12 OOMs of Compute by Daniel Kokotajlo Read the full episode description	Dec 10, 2021
Seeking Power is Often Convergently Instrumental in MDPs by Paul Christiano Read the full episode description	Dec 10, 2021
The Solomonoff Prior is Malign by Mark Xu Read the full episode description	Dec 10, 2021
2020 AI Alignment Literature Review and Charity Comparison by Larks Read the full episode description	Dec 10, 2021
Inner Alignment: Explain like I'm 12 Edition by Rafael Harth Read the full episode description	Dec 10, 2021
Evolution of Modularity by johnswentworth Read the full episode description	Dec 10, 2021
MIRI comments on Cotra's "Case for Aligning Narrowly Superhuman Models" by Rob Bensinger Read the full episode description	Dec 10, 2021
EfficientZero: human ALE sample-efficiency w/MuZero+self-supervised by gwern Read the full episode description	Dec 10, 2021
Understanding “Deep Double Descent” by Evan Hubinger Read the full episode description	Dec 10, 2021
Can you control the past? by Joe Carlsmith Read the full episode description	Dec 10, 2021
Developmental Stages of GPTs by orthonormal Read the full episode description	Dec 10, 2021
My computational framework for the brain by Steve Byrnes Read the full episode description	Dec 10, 2021
Redwood Research’s current project by Buck Shlegeris Read the full episode description	Dec 10, 2021
2019 AI Alignment Literature Review and Charity Comparison by Larks Read the full episode description	Dec 10, 2021
Testing The Natural Abstraction Hypothesis: Project Intro by johnswentworth Read the full episode description	Dec 10, 2021
The theory-practice gap by Buck Shlegeris by Buck Shlegeris Read the full episode description	Dec 10, 2021
Selection vs Control by Abram Demski Read the full episode description	Dec 10, 2021
Why Subagents? by johnswentworth Read the full episode description	Dec 10, 2021
Possible takeaways from the coronavirus pandemic for slow AI takeoff by Vika Read the full episode description	Dec 10, 2021
Embedded Agency (full-text version) by Scott Garrabrant, Abram Demski Read the full episode description	Dec 10, 2021
Cortés, Pizarro, and Afonso as Precedents for Takeover by Daniel Kokotajlo Read the full episode description	Dec 10, 2021
Opinions on Interpretable Machine Learning and 70 Summaries of Recent Papers by lifelonglearner, Peter Hase Read the full episode description	Dec 10, 2021
Disentangling arguments for the importance of AI safety by Richard Ngo Read the full episode description	Dec 10, 2021
A Semitechnical Introductory Dialogue on Solomonoff Induction by Eliezer Yudkowsky Read the full episode description	Dec 10, 2021
Thoughts on Human Models by Ramana Kumar, Scott Garrabrant Read the full episode description	Dec 06, 2021
AI Alignment 2018-19 Review by Rohin Shah Read the full episode description	Dec 06, 2021
Paul's research agenda FAQ by Alex Zhu Read the full episode description	Dec 06, 2021
Forecasting Thread: AI TimelinesQ by Amanda Ngo, Daniel Kokotajlo, Ben Pace Read the full episode description	Dec 06, 2021
An Orthodox Case Against Utility Functions by Abram Demski Read the full episode description	Dec 06, 2021
Saving Time by Scott Garrabrant Read the full episode description	Dec 06, 2021
Beyond Astronomical Waste by Wei Dai Read the full episode description	Dec 06, 2021
interpreting GPT: the logit lens by nostalgebraist Read the full episode description	Dec 06, 2021
Full-time AGI Safety! by Steve Byrnes Read the full episode description	Dec 06, 2021
AMA: Paul Christiano, alignment researcher by Paul Christiano Read the full episode description	Dec 06, 2021
Inner Alignment in Salt-Starved Rats by Steve Byrnes Read the full episode description	Dec 06, 2021
Against GDP as a metric for timelines and takeoff speeds by Daniel Kokotajlo Read the full episode description	Dec 06, 2021
Soft takeoff can still lead to decisive strategic advantage by Daniel Kokotajlo Read the full episode description	Dec 06, 2021
My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda by Chi Nguyen Read the full episode description	Dec 06, 2021
An Intuitive Guide to Garrabrant Induction by Mark Xu Read the full episode description	Dec 06, 2021
Prisoners' Dilemma with Costs to Modeling by Scott Garrabrant Read the full episode description	Dec 06, 2021
How much chess engine progress is about adapting to bigger computers? by Paul Christiano Read the full episode description	Dec 06, 2021
Debate update: Obfuscated arguments problem by Beth Barnes Read the full episode description	Dec 06, 2021
The Fusion Power Generator Scenario by johnswentworth Read the full episode description	Dec 05, 2021
The Alignment Problem: Machine Learning and Human ValuesRobust Delegation by Rohin Shah Read the full episode description	Dec 05, 2021
The Commitment Races problem by Daniel Kokotajlo Read the full episode description	Dec 05, 2021
What I’ll be doing at MIRI by Evan Hubinger Read the full episode description	Dec 05, 2021
Problem relaxation as a tactic by Alex Turner Read the full episode description	Dec 05, 2021
Zero Sum is a misnomer by Abram Demski Read the full episode description	Dec 05, 2021
Challenges to Christiano’s capability amplification proposal by Eliezer Yudkowsky Read the full episode description	Dec 05, 2021
AI Safety Success Stories by Wei Dai Read the full episode description	Dec 05, 2021
My current framework for thinking about AGI timelines by Alex Zhu Read the full episode description	Dec 05, 2021
Book review: "A Thousand Brains" by Jeff Hawkins, Steve Byrnes Read the full episode description	Dec 05, 2021
The date of AI Takeover is not the day the AI takes over by Daniel Kokotajlo Read the full episode description	Dec 05, 2021
How do we prepare for final crunch time?Q by Eli Tyre Read the full episode description	Dec 05, 2021
Alignment By Default by johnswentworth Read the full episode description	Dec 05, 2021
Fixing The Good Regulator Theorem by johnswentworth Read the full episode description	Dec 05, 2021
Call for research on evaluating alignment (funding + advice available) by Beth Barnes Read the full episode description	Dec 05, 2021
Can you get AGI from a Transformer? by Steve Byrnes Read the full episode description	Dec 05, 2021
Measuring hardware overhang by hippke Read the full episode description	Dec 05, 2021
Welcome & FAQ! by Ruben Bloom, Oliver Habryka Read the full episode description	Dec 05, 2021
Robustness to Scale by Scott Garrabrant Read the full episode description	Dec 05, 2021
What can the principal-agent literature tell us about AI risk? Read the full episode description	Dec 05, 2021
Our take on CHAI’s research agenda in under 1500 words by Alex Flint Read the full episode description	Dec 05, 2021
Alignment As A Bottleneck To Usefulness Of GPT-3 by johnswentworth Read the full episode description	Dec 05, 2021
AGI safety from first principles: Introduction by Richard Ngo Read the full episode description	Dec 05, 2021
Less Realistic Tales of Doom by Mark Xu Read the full episode description	Dec 05, 2021
AI and Compute trend isn't predictive of what is happening by alexlyzhov Read the full episode description	Dec 05, 2021
Towards a New Impact Measure by Alex Turner Read the full episode description	Dec 05, 2021
Utility ≠ Reward by Vladimir Mikulik Read the full episode description	Dec 05, 2021
Knowledge Neurons in Pretrained Transformers by Evan Hubinger Read the full episode description	Dec 05, 2021
Comprehensive AI Services as General Intelligence by Rohin Shah Read the full episode description	Dec 05, 2021
List of resolved confusions about IDA by Wei Dai Read the full episode description	Dec 05, 2021
Announcement: AI alignment prize round 3 winners and next round by Vladimir Slepnev Read the full episode description	Dec 05, 2021
Frequent arguments about alignment by John Schulman Read the full episode description	Dec 05, 2021
Apply to the ML for Alignment Bootcamp (MLAB) in Berkeley [Jan 3 - Jan 22] by Oliver Habryka, Buck Shlegeris Read the full episode description	Dec 05, 2021
Alignment Newsletter One Year Retrospective by Rohin Shah Read the full episode description	Dec 05, 2021
Formal Inner Alignment, Prospectus by Abram Demski Read the full episode description	Dec 05, 2021
Writeup: Progress on AI Safety via Debate by Beth Barnes, Paul Christiano Read the full episode description	Dec 05, 2021
Clarifying inner alignment terminology by Evan Hubinger Read the full episode description	Dec 05, 2021
A Critique of Functional Decision Theory by wdmacaskill Read the full episode description	Dec 05, 2021
Experimentally evaluating whether honesty generalizes by Paul Christiano Read the full episode description	Dec 05, 2021
History of the Development of Logical Induction by Scott Garrabrant Read the full episode description	Dec 05, 2021
Optimization Amplifies by Scott Garrabrant Read the full episode description	Dec 05, 2021
Introducing the AI Alignment Forum (FAQ) by Oliver Habryka, Ben Pace, Raymond Arnold, Jim Babcock Read the full episode description	Dec 05, 2021
Ought: why it matters and ways to help by Paul Christiano Read the full episode description	Dec 05, 2021
Coherence arguments imply a force for goal-directed behavior by KatjaGrace Read the full episode description	Dec 05, 2021
Request for proposals for projects in AI alignment that work with deep learning systems by abergal, Nick_Beckstead Read the full episode description	Dec 05, 2021
A very crude deception eval is already passed by Beth Barnes Read the full episode description	Dec 05, 2021
Comments on Carlsmith's “Is power-seeking AI an existential risk?” by Nate Soares Read the full episode description	Dec 05, 2021
Counterfactual Mugging Poker Game by Scott Garrabrant Read the full episode description	Dec 05, 2021
Embedded Curiosities by Scott Garrabrant, Abram Demski Read the full episode description	Dec 05, 2021
Collection of GPT-3 results by Kaj Sotala Read the full episode description	Dec 05, 2021
The alignment problem in different capability regimes by Buck Shlegeris Read the full episode description	Dec 05, 2021
Thinking About Filtered Evidence Is (Very!) Hard by Abram Demski Read the full episode description	Dec 04, 2021
Demons in Imperfect Search Read the full episode description	Dec 04, 2021
Homogeneity vs. heterogeneity in AI takeoff scenarios by Evan Hubinger Read the full episode description	Dec 04, 2021
Homogeneity vs. heterogeneity in AI takeoff scenarios by Evan Hubinger Read the full episode description	Dec 04, 2021
Extrapolating GPT-N performance Read the full episode description	Dec 04, 2021
Agency in Conway’s Game of Life by Alex Flint Read the full episode description	Dec 04, 2021
Coherence arguments do not imply goal-directed behavior Read the full episode description	Dec 04, 2021
Search versus design by Alex Flint Read the full episode description	Dec 04, 2021
Clarifying “What failure looks like” by Sam Clarke Read the full episode description	Dec 04, 2021
Reward Is Not Enough by Steve Byrnes Read the full episode description	Dec 04, 2021
Toward a New Technical Explanation of Technical Explanation by Abram Demski Read the full episode description	Dec 04, 2021
Gradient hacking by Evan Hubinger Read the full episode description	Dec 04, 2021
Inaccessible information by Paul Christiano Read the full episode description	Dec 04, 2021
Zoom In: An Introduction to Circuits by Evan Hubinger Read the full episode description	Dec 04, 2021
Plausible cases for HRAD work, and locating the crux in the realism about rationality debate by Issa Rice Read the full episode description	Dec 04, 2021
Introduction To The Infra-Bayesianism Sequence by Diffractor Read the full episode description	Dec 04, 2021
Jitters No Evidence of Stupidity in RL Read the full episode description	Dec 04, 2021
Testing The Natural Abstraction Hypothesis: Project Update by johnswentworth Read the full episode description	Dec 04, 2021
Sources of intuitions and data on AGI by Scott Garrabrant Read the full episode description	Dec 04, 2021
What counts as defection? by Alex Turner Read the full episode description	Dec 04, 2021
Competition: Amplify Rohin’s Prediction on AGI researchers & Safety Concerns by Andreas Stuhlmüller Read the full episode description	Dec 04, 2021
Modelling Transformative AI Risks (MTAIR) Project: Introduction by David Manheim, Aryeh Englander Read the full episode description	Dec 04, 2021
How DeepMind's Generally Capable Agents Were Trained by 1a3orn Read the full episode description	Dec 04, 2021
Open question: are minimal circuits daemon-free? by Paul Christiano Read the full episode description	Dec 04, 2021
In Logical Time, All Games are Iterated Games by Abram Demski Read the full episode description	Dec 04, 2021
Gradations of Inner Alignment Obstacles by Abram Demski Read the full episode description	Dec 04, 2021
Truthful AI: Developing and governing AI that does not lie by Owain Evans, Owen Cotton-Barratt, Lukas Finnveden Read the full episode description	Dec 04, 2021
Learning the prior by Paul Christiano Read the full episode description	Dec 04, 2021
Thoughts on the Alignment Implications of Scaling Language Models by leogao Read the full episode description	Dec 04, 2021
Bayesian Probability is for things that are Space-like Separated from You by Scott Garrabrant Read the full episode description	Dec 04, 2021
The Inner Alignment Problem by Evan Hubinger, Chris van Merwijk, Vladimir Mikulik, Joar Skalse, Scott Garrabrant Read the full episode description	Dec 04, 2021
Clarifying some key hypotheses in AI alignment by Ben Cottier, Rohin Shah Read the full episode description	Dec 04, 2021
Reflections on Larks’ 2020 AI alignment literature review by Alex Flint Read the full episode description	Dec 04, 2021
Intermittent Distillations #4: Semiconductors, Economics, Intelligence, and Technological Progress by Mark Xu Read the full episode description	Dec 04, 2021
Humans Are Embedded Agents Too by johnswentworth Read the full episode description	Dec 04, 2021
Reply to Paul Christiano on Inaccessible Information by Alex Flint Read the full episode description	Dec 04, 2021
The Main Sources of AI Risk? by Daniel Kokotajlo, Wei Dai Read the full episode description	Dec 04, 2021
And the AI would have got away with it too, if... by Stuart Armstrong Read the full episode description	Dec 04, 2021
Inner alignment in the brain by Steve Byrnes Read the full episode description	Dec 04, 2021
Review of Soft Takeoff Can Still Lead to DSA by Daniel Kokotajlo Read the full episode description	Dec 04, 2021
Two Neglected Problems in Human-AI Safety by Wei Dai Read the full episode description	Dec 04, 2021
Announcement: AI alignment prize round 4 winners by Vladimir Slepnev Read the full episode description	Dec 04, 2021
The Credit Assignment Problem by Abram Demski Read the full episode description	Dec 04, 2021
Recent Progress in the Theory of Neural Networks by interstice Read the full episode description	Dec 04, 2021
Tessellating Hills: a toy model for demons in imperfect search by DaemonicSigil Read the full episode description	Dec 04, 2021
What Failure Looks Like: Distilling the Discussion by Ben Pace Read the full episode description	Dec 04, 2021
Commentary on AGI Safety from First Principles by Richard Ngo Read the full episode description	Dec 04, 2021
Imitative Generalisation (AKA 'Learning the Prior') by Beth Barnes Read the full episode description	Dec 04, 2021
AI Safety Papers: An App for the TAI Safety Database by Ozzie Gooen Read the full episode description	Dec 04, 2021
On Solving Problems Before They Appear: The Weird Epistemologies of Alignment by Adam Shimi Read the full episode description	Dec 04, 2021
Comments on OpenPhil's Interpretability RFP by Paul Christiano Read the full episode description	Dec 04, 2021
Why I'm excited about Redwood Research's current project by Paul Christiano Read the full episode description	Dec 04, 2021
Worrying about the Vase: Whitelisting by Alex Turner Read the full episode description	Dec 04, 2021
Troll Bridge by Abram Demski Read the full episode description	Dec 04, 2021
Misconceptions about continuous takeoff by Matthew Barnett Read the full episode description	Dec 04, 2021
Updates and additions to Embedded Agency by Rob Bensinger, Abram Demski Read the full episode description	Dec 04, 2021
Selection Theorems: A Program For Understanding Agents by johnswentworth Read the full episode description	Dec 04, 2021
My take on Vanessa Kosoy's take on AGI safety by Steve Byrnes Read the full episode description	Dec 04, 2021
How special are human brains among animal brains? by Alex Zhu Read the full episode description	Dec 04, 2021
How uniform is the neocortex? by Alex Zhu Read the full episode description	Dec 04, 2021
How "honest" is GPT-3?Q by Abram Demski Read the full episode description	Dec 04, 2021
Bottle Caps Aren't Optimisers by DanielFilan Read the full episode description	Dec 04, 2021
Deceptive Alignment by Evan Hubinger, Chris van Merwijk, Vladimir Mikulik, Joar Skalse, Scott Garrabrant Read the full episode description	Dec 04, 2021
Problems in AI Alignment that philosophers could potentially contribute to by Wei Dai Read the full episode description	Dec 04, 2021
Public Static: What is Abstraction? by johnswentworth Read the full episode description	Dec 03, 2021
Updating the Lottery Ticket Hypothesis by johnswentworth Read the full episode description	Dec 03, 2021
Siren worlds and the perils of over-optimised search by Stuart Armstrong Read the full episode description	Dec 03, 2021
Arguments about fast takeoff by Paul Christiano Read the full episode description	Dec 03, 2021
Alignment Newsletter #13: 07/02/18 by Rohin Shah Read the full episode description	Dec 03, 2021
Risks from Learned Optimization: Conclusion and Related Work by Evan Hubinger, Chris van Merwijk, Vladimir Mikulik, Joar Skalse, Scott Garrabrant Read the full episode description	Dec 03, 2021
2-D Robustness by Vladimir Mikulik Read the full episode description	Dec 03, 2021
Learning Normativity: A Research Agenda by Abram Demski Read the full episode description	Dec 03, 2021
To what extent is GPT-3 capable of reasoning? by Alex Turner Read the full episode description	Dec 03, 2021
Environmental Structure Can Cause Instrumental Convergence by Alex Turner Read the full episode description	Dec 03, 2021
[Book Review] "The Alignment Problem" by Brian Christian,Lsusr Read the full episode description	Dec 03, 2021
Comment on decision theory by Rob Bensinger Read the full episode description	Dec 03, 2021
The strategy-stealing assumption by Paul Christiano Read the full episode description	Dec 03, 2021
A simple environment for showing mesa misalignment by Matthew Barnett Read the full episode description	Dec 03, 2021
The two-layer model of human values, and problems with synthesizing preferences by Kaj Sotala Read the full episode description	Dec 03, 2021
The Goldbach conjecture is probably correct; so was Fermat's last theorem by Stuart Armstrong Read the full episode description	Dec 03, 2021
Eight claims about multi-agent AGI safety by Richard Ngo Read the full episode description	Dec 03, 2021
Why I'm excited about Debate by Richard Ngo Read the full episode description	Dec 03, 2021
Rogue AGI Embodies Valuable Intellectual Property by Mark Xu, CarlShulman Read the full episode description	Dec 03, 2021
Information At A Distance Is Mediated By Deterministic Constraints by johnswentworth Read the full episode description	Dec 03, 2021
Topological Fixed Point Exercises by Scott Garrabrant, Sam Eisenstat Read the full episode description	Dec 03, 2021
A Gym Gridworld Environment for the Treacherous Turn by Michaël Trazzi Read the full episode description	Dec 03, 2021
How does Gradient Descent Interact with Goodhart?Q by Scott Garrabrant Read the full episode description	Dec 03, 2021
Markets are Universal for Logical Induction by johnswentworth. Read the full episode description	Dec 03, 2021
TAI Safety Bibliographic Database by C. Jess Riedel. Read the full episode description	Dec 03, 2021
Announcing AlignmentForum.org Beta by Raymond Arnold. Read the full episode description	Dec 03, 2021
Classifying specification problems as variants of Goodhart's Law by Vika. Read the full episode description	Dec 03, 2021
Does SGD Produce Deceptive Alignment? by Mark Xu. Read the full episode description	Dec 03, 2021
Low-stakes alignment by Paul Christiano. Read the full episode description	Dec 03, 2021
Preface to the sequence on value learning by Rohin Shah. Read the full episode description	Dec 03, 2021
AGI safety from first principles: Superintelligence by Richard Ngo. Read the full episode description	Dec 03, 2021
FAQ: Advice for AI Alignment Researchers by Rohin Shah. Read the full episode description	Dec 03, 2021
AI takeoff story: a continuation of progress by other means by Edouard Harris. Read the full episode description	Dec 03, 2021
Optimization Concepts in the Game of Life by Vika, Ramana Kumar Read the full episode description	Dec 03, 2021
A general model of safety-oriented AI development by Wei Dai Read the full episode description	Dec 03, 2021
Why we need a theory of human values by Stuart Armstrong Read the full episode description	Dec 03, 2021
But exactly how complex and fragile? by KatjaGrace Read the full episode description	Dec 03, 2021
Continuing the takeoffs debate by Richard Ngo Read the full episode description	Dec 03, 2021
Progress on Causal Influence Diagrams by Tom Everitt Read the full episode description	Dec 03, 2021
When does rationality-as-search have nontrivial implications? by nostalgebraist Read the full episode description	Dec 03, 2021
Three AI Safety Related Ideas by Wei Dai Read the full episode description	Dec 03, 2021
Comments on CAIS by Richard Ngo Read the full episode description	Dec 03, 2021
Pavlov Generalizes by Abram Demski Read the full episode description	Dec 03, 2021
Preparing for "The Talk" with AI projects by Daniel Kokotajlo Read the full episode description	Dec 03, 2021
Using GPT-N to Solve Interpretability of Neural Networks: A Research Agenda by elriggs, Gurkenglas Read the full episode description	Dec 03, 2021
Announcing the Vitalik Buterin Fellowships in AI Existential Safety! by DanielFilan Read the full episode description	Dec 03, 2021
Three ways that "Sufficiently optimized agents appear coherent" can be false by Wei Dai Read the full episode description	Dec 03, 2021
Research Agenda v0.9: Synthesising a human's preferences into a utility function by Stuart Armstrong Read the full episode description	Dec 03, 2021
Conditions for Mesa-Optimization by Evan Hubinger, Chris van Merwijk, Vladimir Mikulik, Joar Skalse, Scott Garrabrant Read the full episode description	Dec 03, 2021
Six AI Risk/Strategy Ideas by Wei Dai Read the full episode description	Nov 30, 2021
Better priors as a safety problem by Paul Christiano Read the full episode description	Nov 30, 2021
Non-Obstruction: A Simple Concept Motivating Corrigibility by Alex Turner Read the full episode description	Nov 30, 2021
Three reasons to expect long AI timelines by Matthew Barnett Read the full episode description	Nov 30, 2021
Decoupling deliberation from competition by Paul Christiano Read the full episode description	Nov 30, 2021