A biweekly podcast where hosts Nathan Labenz and Erik Torenberg interview the builders on the edge of AI and explore the dramatic shift it will unlock in the coming years.
In this episode of The Cognitive Revolution, Nathan explores METR's groundbreaking RE-Bench evaluation framework with Neev Parikh. We dive deep into how this new benchmark assesses AI systems' ability to perform real machine learning research tasks, from optimizing GPU kernels to fine-tuning language models. Join us for a fascinating discussion about the current capabilities of AI models like Claude 3.5 and GPT-4, and what their performance tells us about the trajectory of artificial intelligence development.
Check out METR's work:
blog post: https://metr.org/blog/2024-11-22-evaluating-r-d-capabilities-of-llms/
paper: https://metr.org/AI_R_D_Evaluation_Report.pdf
jobs: https://hiring.metr.org/
The Cognitive Revolution Ask Me Anything and Listener Survey: https://docs.google.com/forms/d/1aYv2XLID7RqGxj2_Y4_6x9mo_aqXcGCeLw1EQhy4IpY/edit
Help shape our show by taking our quick listener survey at https://bit.ly/TurpentinePulse
SPONSORS:
GiveWell: GiveWell has spent over 17 years researching global health and philanthropy to identify the highest-impact giving opportunities. Over 125,000 donors have contributed more than $2 billion, saving over 200,000 lives through evidence-backed recommendations. First-time donors can have their contributions matched up to $100 before year-end. Visit https://GiveWell.org, select podcast, and enter Cognitive Revolution at checkout to make a difference today.
SelectQuote: Finding the right life insurance shouldn't be another task you put off. SelectQuote compares top-rated policies to get you the best coverage at the right price. Even in our AI-driven world, protecting your family's future remains essential. Get your personalized quote at https://selectquote.com/cognitive
Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive
Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today.
CHAPTERS:
(00:00:00) Teaser
(00:01:04) About the Episode
(00:05:14) Introducing METR
(00:07:36) Specialization of AI Risk
(00:09:52) AI R&D vs. Autonomy
(00:12:41) Benchmark Design Choices
(00:16:04) Benchmark Design Principles (Part 1)
(00:18:54) Sponsors: GiveWell | SelectQuote
(00:21:44) Benchmark Design Principles (Part 2)
(00:22:35) AI vs. Human Evaluation
(00:26:55) Optimizing Runtimes
(00:36:02) Sponsors: Oracle Cloud Infrastructure (OCI) | Weights & Biases RAG++
(00:38:20) AI Myopia
(00:43:37) Optimizing Loss
(00:47:59) Optimizing Win Rate
(00:50:24) Best of K Analysis
(01:02:26) Best of K Limitations
(01:09:04) Agent Interaction Modalities
(01:12:34) Analyzing Benchmark Results
(01:17:16) Model Performance Differences
(01:22:49) Elicitation and Scaffolding
(01:27:08) Context Window & Best of K
(01:35:17) Reward Hacking & Bad Behavior
(01:43:47) Future Directions & Hiring
(01:46:20) Outro
SOCIAL LINKS:
Website: https://www.cognitiverevolution.ai
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
Nathan discusses groundbreaking AI and biology research with Stanford Professor James Zou from the Chan Zuckerberg Initiative. In this episode of The Cognitive Revolution, we explore two remarkable papers: the virtual lab framework that created novel COVID treatments with minimal human oversight, and InterPLM's discovery of new protein motifs through mechanistic interpretability. Join us for a fascinating discussion about how AI is revolutionizing biological research and drug discovery.
Got questions about AI? Submit them for our upcoming AMA episode + take our quick listener survey to help us serve you better - https://docs.google.com/forms/d/e/1FAIpQLSefHvs1-1g5xeqM7wSirQkzTtK-1fgW_OjyHPH9DvmbVAjEzA/viewform
SPONSORS:
SelectQuote: Finding the right life insurance shouldn't be another task you put off. SelectQuote compares top-rated policies to get you the best coverage at the right price. Even in our AI-driven world, protecting your family's future remains essential. Get your personalized quote at https://selectquote.com/cognitive
Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive
80,000 Hours: 80,000 Hours is dedicated to helping you find a fulfilling career that makes a difference. With nearly a decade of research, they offer in-depth material on AI risks, AI policy, and AI safety research. Explore their articles, career reviews, and a podcast featuring experts like Anthropic CEO Dario Amodei. Everything is free, including their Career Guide. Visit https://80000hours.org/cognitiverevolution to start making a meaningful impact today.
GiveWell: GiveWell has spent over 17 years researching global health and philanthropy to identify the highest-impact giving opportunities. Over 125,000 donors have contributed more than $2 billion, saving over 200,000 lives through evidence-backed recommendations. First-time donors can have their contributions matched up to $100 before year-end. Visit https://GiveWell.org, select podcast, and enter Cognitive Revolution at checkout to make a difference today.
CHAPTERS:
(00:00:00) Teaser
(00:00:35) About the Episode
(00:04:30) Virtual Lab
(00:08:09) AI Designs Nanobodies
(00:14:43) Novel AI Pipeline
(00:20:31) Human-AI Interaction (Part 1)
(00:20:33) Sponsors: SelectQuote | Oracle Cloud Infrastructure (OCI)
(00:23:22) Human-AI Interaction (Part 2)
(00:32:31) Sponsors: 80,000 Hours | GiveWell
(00:35:10) Project Cost & Time
(00:41:04) Future of AI in Bio
(00:45:46) InterPLM: Intro
(00:50:30) AI Found New Concepts
(00:55:02) Discovering New Motifs
(00:57:14) Limitations & Future
(01:01:32) Outro
SOCIAL LINKS:
Website: https://www.cognitiverevolution.ai
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://www.linkedin.com/in/nathanlabenz/
Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast
Nathan welcomes back computational biochemist Amelie Schreiber for a fascinating update on AI's revolutionary impact in biology. In this episode of The Cognitive Revolution, we explore recent breakthroughs including AlphaFold3, ESM3, and new diffusion models transforming protein engineering and drug discovery. Join us for an insightful discussion about how AI is reshaping our understanding of molecular biology and making complex protein engineering tasks more accessible than ever before.
Help shape our show by taking our quick listener survey at https://bit.ly/TurpentinePulse
SPONSORS:
Shopify: Shopify is the world's leading e-commerce platform, offering a market-leading checkout system and exclusive AI apps like Quikly. Nobody does selling better than Shopify. Get a $1 per month trial at https://shopify.com/cognitive
SelectQuote: Finding the right life insurance shouldn't be another task you put off. SelectQuote compares top-rated policies to get you the best coverage at the right price. Even in our AI-driven world, protecting your family's future remains essential. Get your personalized quote at https://selectquote.com/cognitive
Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive
Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today.
CHAPTERS:
(00:00:00) Teaser
(00:00:46) About the Episode
(00:04:30) AI for Biology
(00:07:14) David Baker's Impact
(00:11:49) AlphaFold 3 & ESM3
(00:16:40) Protein Interaction Prediction (Part 1)
(00:16:44) Sponsors: Shopify | SelectQuote
(00:19:18) Protein Interaction Prediction (Part 2)
(00:31:12) MSAs & Embeddings (Part 1)
(00:32:32) Sponsors: Oracle Cloud Infrastructure (OCI) | Weights & Biases RAG++
(00:34:49) MSAs & Embeddings (Part 2)
(00:35:57) Beyond Structure Prediction
(00:51:13) Dynamics vs. Statics
(00:57:24) In-Painting & Use Cases
(00:59:48) Workflow & Platforms
(01:06:45) Design Process & Success Rates
(01:13:23) Ambition & Task Definition
(01:19:25) New Models: PepFlow & GeoAB
(01:28:23) Flow Matching vs. Diffusion
(01:30:42) ESM3 & Multimodality
(01:37:10) Summary & Future Directions
(01:45:34) Outro
SOCIAL LINKS:
Website: https://www.cognitiverevolution.ai
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://www.linkedin.com/in/nathanlabenz/
Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast
In this episode of The Cognitive Revolution, Nathan interviews Michael Boyce, Director of DHS's AI Corps, about bringing modern AI capabilities to the federal government. We explore how the largest civilian AI team in government is transforming DHS's 22 agencies, from developing shared AI infrastructure to innovative applications like AI-powered asylum interview training. Join us for an insightful conversation about the intersection of artificial intelligence and public service, and discover why AI professionals should consider a career in government.
Help shape our show by taking our quick listener survey at https://bit.ly/TurpentinePulse
SPONSORS:
Shopify: Shopify is the world's leading e-commerce platform, offering a market-leading checkout system and exclusive AI apps like Quikly. Nobody does selling better than Shopify. Get a $1 per month trial at https://shopify.com/cognitive
SelectQuote: Finding the right life insurance shouldn't be another task you put off. SelectQuote compares top-rated policies to get you the best coverage at the right price. Even in our AI-driven world, protecting your family's future remains essential. Get your personalized quote at https://selectquote.com/cognitive
Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive
80,000 Hours: 80,000 Hours is dedicated to helping you find a fulfilling career that makes a difference. With nearly a decade of research, they offer in-depth material on AI risks, AI policy, and AI safety research. Explore their articles, career reviews, and a podcast featuring experts like Anthropic CEO Dario Amodei. Everything is free, including their Career Guide. Visit https://80000hours.org/cognitiverevolution to start making a meaningful impact today.
RECOMMENDED PODCAST:
Unpack Pricing - Dive into the dark arts of SaaS pricing with Metronome CEO Scott Woody and tech leaders. Learn how strategic pricing drives explosive revenue growth in today's biggest companies like Snowflake, Cockroach Labs, Dropbox and more.
Apple: https://podcasts.apple.com/us/podcast/id1765716600
Spotify: https://open.spotify.com/show/38DK3W1Fq1xxQalhDSueFg
CHAPTERS:
(00:00:00) Teaser
(00:01:00) About the Episode
(00:03:38) Introducing Michael Boyce
(00:05:49) What is Homeland Security?
(00:09:52) History of AI at DHS
(00:13:15) Generative AI at DHS
(00:16:03) Structure of the AI Corps (Part 1)
(00:18:17) Sponsors: Shopify | SelectQuote
(00:20:51) Structure of the AI Corps (Part 2)
(00:22:04) Opportunities for AI at DHS
(00:25:34) Bureaucracy Hacker
(00:30:34) The Manager's Role (Part 1)
(00:35:24) Sponsors: Oracle Cloud Infrastructure (OCI) | 80,000 Hours
(00:38:04) Internal Chatbot Project
(00:43:28) AI Role Playing for Training
(00:49:55) A Request for Startups
(00:57:46) Generative AI for Quality Check
(01:03:20) AI Training at DHS
(01:06:07) Metrics and the Future of AI
(01:13:26) Non-Generative AI at DHS
(01:19:08) AI and Automation at DHS
(01:23:03) Join the AI Corps
(01:28:39) Outro
In this emergency episode of The Cognitive Revolution, Nathan discusses alarming findings about AI deception with Alexander Meinke from Apollo Research. They explore Apollo's groundbreaking 70-page report on "Frontier Models Are Capable of In-Context Scheming," revealing how advanced AI systems like OpenAI's o1 can engage in deceptive behaviors. Join us for a critical conversation about AI safety, the implications of scheming behavior, and the urgent need for better oversight in AI development.
Help shape our show by taking our quick listener survey at https://bit.ly/TurpentinePulse
SPONSORS:
Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive
SelectQuote: Finding the right life insurance shouldn't be another task you put off. SelectQuote compares top-rated policies to get you the best coverage at the right price. Even in our AI-driven world, protecting your family's future remains essential. Get your personalized quote at https://selectquote.com/cognitive
80,000 Hours: 80,000 Hours is dedicated to helping you find a fulfilling career that makes a difference. With nearly a decade of research, they offer in-depth material on AI risks, AI policy, and AI safety research. Explore their articles, career reviews, and a podcast featuring experts like Anthropic CEO Dario Amodei. Everything is free, including their Career Guide. Visit https://80000hours.org/cognitiverevolution to start making a meaningful impact today.
Shopify: Shopify is the world's leading e-commerce platform, offering a market-leading checkout system and exclusive AI apps like Quikly. Nobody does selling better than Shopify. Get a $1 per month trial at https://shopify.com/cognitive
RECOMMENDED PODCAST:
Unpack Pricing - Dive into the dark arts of SaaS pricing with Metronome CEO Scott Woody and tech leaders. Learn how strategic pricing drives explosive revenue growth in today's biggest companies like Snowflake, Cockroach Labs, Dropbox and more.
Apple: https://podcasts.apple.com/us/podcast/id1765716600
Spotify: https://open.spotify.com/show/38DK3W1Fq1xxQalhDSueFg
CHAPTERS:
(00:00:00) Teaser
(00:00:53) About the Episode
(00:08:10) Introducing Alexander Meinke
(00:10:17) Red Teaming GPT-4
(00:17:07) Chain of Thought Access (Part 1)
(00:20:24) Sponsors: Oracle Cloud Infrastructure (OCI) | SelectQuote
(00:22:48) Chain of Thought Access (Part 2)
(00:26:07) Multimodal Models
(00:29:33) Defining Scheming
(00:33:51) Taxonomy of Scheming (Part 1)
(00:39:40) Sponsors: 80,000 Hours | Shopify
(00:42:29) Taxonomy of Scheming (Part 2)
(00:43:09) Instruction Hierarchy
(00:49:04) Types of Scheming
(01:00:49) Covert Subversion
(01:14:25) Deferred Subversion
(01:28:24) Sandbagging
(01:35:48) Magnitudes & Trends
(01:48:18) Chain of Thought Reasoning
(01:57:02) Closing Thoughts
(02:05:19) Outro
In this episode of The Cognitive Revolution, Nathan interviews Andrew White, Professor of Chemical Engineering at the University of Rochester and Head of Science at Future House. We explore groundbreaking AI systems for scientific discovery, including PaperQA and Aviary, and discuss how large language models are transforming research. Join us for an insightful conversation about the intersection of AI and scientific advancement with this pioneering researcher in his first-ever podcast appearance.
Check out Future House: https://www.futurehouse.org
Help shape our show by taking our quick listener survey at https://bit.ly/TurpentinePulse
SPONSORS:
Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive
SelectQuote: Finding the right life insurance shouldn't be another task you put off. SelectQuote compares top-rated policies to get you the best coverage at the right price. Even in our AI-driven world, protecting your family's future remains essential. Get your personalized quote at https://selectquote.com/cognitive
Shopify: Shopify is the world's leading e-commerce platform, offering a market-leading checkout system and exclusive AI apps like Quikly. Nobody does selling better than Shopify. Get a $1 per month trial at https://shopify.com/cognitive
CHAPTERS:
(00:00:00) Teaser
(00:01:13) About the Episode
(00:04:37) Andrew White's Journey
(00:10:23) GPT-4 Red Team
(00:15:33) GPT-4 & Chemistry
(00:17:54) Sponsors: Oracle Cloud Infrastructure (OCI) | SelectQuote
(00:20:19) Biology vs Physics
(00:23:14) Conceptual Dark Matter
(00:26:27) Future House Intro
(00:30:42) Semi-Autonomous AI
(00:35:39) Sponsors: Shopify
(00:37:00) Lab Automation
(00:39:46) In Silico Experiments
(00:45:22) Cost of Experiments
(00:51:30) Multi-Omic Models
(00:54:54) Scale and Grokking
(01:00:53) Future House Projects
(01:10:42) Paper QA Insights
(01:16:28) Generalizing to Other Domains
(01:17:57) Using Figures Effectively
(01:22:01) Need for Specialized Tools
(01:24:23) Paper QA Cost & Latency
(01:27:37) Aviary: Agents & Environments
(01:31:42) Black Box Gradient Estimation
(01:36:14) Open vs Closed Models
(01:37:52) Improvement with Training
(01:40:00) Runtime Choice & Q-Learning
(01:43:43) Narrow vs General AI
(01:48:22) Future Directions & Needs
(01:53:22) Future House: What's Next?
(01:55:32) Outro
SOCIAL LINKS:
Website: https://www.cognitiverevolution.ai
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://www.linkedin.com/in/nathanlabenz/
Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast
Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk
In this episode of The Cognitive Revolution, Nathan welcomes back Div Garg, Co-Founder and CEO of MultiOn, for his third appearance to discuss the evolving landscape of AI agents. We explore how agent development has shifted from open-ended frameworks to intelligent workflows, MultiOn's unique approach to agent development, and their journey toward achieving human-level performance. Dive into fascinating insights about data collection strategies, model fine-tuning techniques, and the future of agent authentication. Join us for an in-depth conversation about why 2025 might be the breakthrough year for AI agents.
Check out MultiOn: https://www.multion.ai/
Help shape our show by taking our quick listener survey at https://bit.ly/TurpentinePulse
SPONSORS:
Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive
SelectQuote: Finding the right life insurance shouldn't be another task you put off. SelectQuote compares top-rated policies to get you the best coverage at the right price. Even in our AI-driven world, protecting your family's future remains essential. Get your personalized quote at https://selectquote.com/cognitive
Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today.
RECOMMENDED PODCAST:
Unpack Pricing - Dive into the dark arts of SaaS pricing with Metronome CEO Scott Woody and tech leaders. Learn how strategic pricing drives explosive revenue growth in today's biggest companies like Snowflake, Cockroach Labs, Dropbox and more.
Apple: https://podcasts.apple.com/us/podcast/id1765716600
Spotify: https://open.spotify.com/show/38DK3W1Fq1xxQalhDSueFg
CHAPTERS:
(00:00:00) Teaser
(00:00:40) About the Episode
(00:04:10) The Rise of AI Agents
(00:06:33) Open-Ended vs On-Rails
(00:10:00) Agent Architecture
(00:12:01) AI Learning & Feedback
(00:14:01) Data Collection (Part 1)
(00:18:27) Sponsors: Oracle Cloud Infrastructure (OCI) | SelectQuote
(00:20:51) Data Collection (Part 2)
(00:22:25) Self-Play & Rewards
(00:25:04) Model Strategy & Agent Q
(00:33:28) Sponsors: Weights & Biases RAG++
(00:34:39) Understanding Agent Q
(00:43:16) Search & Learning
(00:45:39) Benchmarks vs Reality
(00:50:18) Positive Transfer & Scale
(00:51:47) Fine-Tuning Strategies
(00:55:16) Vision Strategy
(01:00:16) Authentication & Security
(01:03:48) Future of AI Agents
(01:16:14) Cost, Latency, Reliability
(01:19:30) Avoiding the Bitter Lesson
(01:25:58) Agent-Assisted Future
(01:27:11) Outro
SOCIAL LINKS:
Website: https://www.cognitiverevolution.ai
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://www.linkedin.com/in/nathanlabenz/
Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast
Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk
In this episode of The Cognitive Revolution, Nathan explores groundbreaking perspectives on AI alignment with MIT PhD student Tan Zhi Xuan. We dive deep into Xuan's critique of preference-based AI alignment and their innovative proposal for role-based AI systems guided by social consensus. The conversation extends into their fascinating work on how AI agents can learn social norms through Bayesian rule induction. Join us for an intellectually stimulating discussion that bridges philosophical theory with practical implementation in AI development.
Check out:
"Beyond Preferences in AI Alignment" paper: https://arxiv.org/pdf/2408.16984
"Learning and Sustaining Shared Normative Systems via Bayesian Rule Induction in Markov Games" paper: https://arxiv.org/pdf/2402.13399
Help shape our show by taking our quick listener survey at https://bit.ly/TurpentinePulse
SPONSORS:
Notion: Notion offers powerful workflow and automation templates, perfect for streamlining processes and laying the groundwork for AI-driven automation. With Notion AI, you can search across thousands of documents from various platforms, generating highly relevant analysis and content tailored just for you - try it for free at https://notion.com/cognitiverevolution
Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today.
Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive
RECOMMENDED PODCAST:
Unpack Pricing - Dive into the dark arts of SaaS pricing with Metronome CEO Scott Woody and tech leaders. Learn how strategic pricing drives explosive revenue growth in today's biggest companies like Snowflake, Cockroach Labs, Dropbox and more.
Apple: https://podcasts.apple.com/us/podcast/id1765716600
Spotify: https://open.spotify.com/show/38DK3W1Fq1xxQalhDSueFg
CHAPTERS:
(00:00:00) Teaser
(00:01:09) About the Episode
(00:04:25) Guest Intro
(00:06:25) Xuan's Background
(00:12:03) AI Near-Term Outlook
(00:17:32) Sponsors: Notion | Weights & Biases RAG++
(00:20:18) Alignment Approaches
(00:26:11) Critiques of RLHF
(00:34:40) Sponsors: Oracle Cloud Infrastructure (OCI)
(00:35:50) Beyond Preferences
(00:40:27) Roles and AI Systems
(00:45:19) What AI Owes Us
(00:51:52) Drexler's AI Services
(01:01:08) Constitutional AI
(01:09:43) Technical Approach
(01:22:01) Norms and Deviations
(01:32:31) Norm Decay
(01:38:06) Self-Other Overlap
(01:44:05) Closing Thoughts
(01:54:23) Outro
SOCIAL LINKS:
Website: https://www.cognitiverevolution.ai
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://www.linkedin.com/in/nathanlabenz/
Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast
Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk
In this episode of The Cognitive Revolution, Nathan explores the complex intersection of AI development and international relations with Robert Wright, publisher of the Nonzero Newsletter. They examine the growing militarization of AI, US-China relations, and the concerning trajectory of what Wright calls "the chip war." Drawing from Nathan's recent experience at The Curve conference and an AI wargame simulation, they discuss the risks of an AI arms race and search for alternative paths to avoid catastrophic conflict between global powers. Join us for this crucial conversation about the future of AI governance and international cooperation.
Subscribe to the Nonzero Newsletter at https://nonzero.substack.com
Be notified early when Turpentine drops new publications: https://www.turpentine.co/exclusiveaccess
RECOMMENDED PODCAST:
Unpack Pricing - Dive into the dark arts of SaaS pricing with Metronome CEO Scott Woody and tech leaders. Learn how strategic pricing drives explosive revenue growth in today's biggest companies like Snowflake, Cockroach Labs, Dropbox and more.
Apple: https://podcasts.apple.com/us/podcast/id1765716600
Spotify: https://open.spotify.com/show/38DK3W1Fq1xxQalhDSueFg
SPONSORS:
Notion: Notion offers powerful workflow and automation templates, perfect for streamlining processes and laying the groundwork for AI-driven automation. With Notion AI, you can search across thousands of documents from various platforms, generating highly relevant analysis and content tailored just for you - try it for free at https://notion.com/cognitiverevolution
80,000 Hours: 80,000 Hours offers free one-on-one career advising for Cognitive Revolution listeners aiming to tackle global challenges, especially in AI. They connect high-potential individuals with experts, opportunities, and personalized career plans to maximize positive impact. Apply for a free call at https://80000hours.org/cognitiverevolution to accelerate your career and contribute to solving pressing AI-related issues.
SelectQuote: Finding the right life insurance shouldn't be another task you put off. SelectQuote compares top-rated policies to get you the best coverage at the right price. Even in our AI-driven world, protecting your family's future remains essential. Get your personalized quote at https://selectquote.com/cognitive
Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive
CHAPTERS:
(00:00:00) Teaser
(00:00:59) Sponsors: Incogni
(00:02:20) About the Episode
(00:05:56) Introducing AI2
(00:09:56) Tulu: Deep Dive (Part 1)
(00:17:43) Sponsors: Notion | Shopify
(00:20:38) Open vs. Closed Recipes
(00:29:48) Compute & Value (Part 1)
(00:34:22) Sponsors: Oracle Cloud Infrastructure (OCI) | 80,000 Hours
(00:37:02) Compute & Value (Part 2)
(00:42:41) Model Weight Evolution
(00:53:16) DPO vs. PPO
(01:06:36) Project Trajectory
(01:20:39) Synthetic Data & LLM Judge
(01:27:39) Verifiable RL
(01:38:17) Advice for Practitioners
(01:44:01) Open Source vs. Closed
(01:49:18) Outro
SOCIAL LINKS:
Website: https://www.cognitiverevolution.ai
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://www.linkedin.com/in/nathanlabenz/
Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast
Nathan explores the world of AI-powered design with John Milinovich, Head of Generative AI Product at Canva. In this episode of The Cognitive Revolution, we dive into Canva's innovative approach to AI integration, from task automation to human augmentation. Join us for an insightful discussion about fine-tuning foundation models, AI's impact on architecture, and practical tips for AI product development at scale.
Check out Canva: https://www.canva.com
Be notified early when Turpentine drops new publications: https://www.turpentine.co/exclusiveaccess
SPONSORS:
SelectQuote: Finding the right life insurance shouldn't be another task you put off. SelectQuote compares top-rated policies to get you the best coverage at the right price. Even in our AI-driven world, protecting your family's future remains essential. Get your personalized quote at https://selectquote.com/cognitive
Incogni: Take your personal data back with Incogni! Use code REVOLUTION at the link below and get 60% off an annual plan: https://incogni.com/revolution
Shopify: Shopify is the world's leading e-commerce platform, offering a market-leading checkout system and exclusive AI apps like Quikly. Nobody does selling better than Shopify. Get a $1 per month trial at https://shopify.com/cognitive
Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive
Brave: The Brave search API can be used to assemble a data set to train your AI models and help with retrieval augmentation at the time of inference. All while remaining affordable with developer first pricing, integrating the Brave search API into your workflow translates to more ethical data sourcing and more human representative data sets. Try the Brave search API for free for up to 2000 queries per month at https://bit.ly/BraveTCR
RECOMMENDED PODCAST:
Unpack Pricing - Dive into the dark arts of SaaS pricing with Metronome CEO Scott Woody and tech leaders. Learn how strategic pricing drives explosive revenue growth in today's biggest companies like Snowflake, Cockroach Labs, Dropbox and more.
Apple: https://podcasts.apple.com/us/podcast/id1765716600
Spotify: https://open.spotify.com/show/38DK3W1Fq1xxQalhDSueFg
CHAPTERS:
(00:00:00) Teaser
(00:01:13) Sponsors: SelectQuote
(00:02:26) About the Episode
(00:04:33) Introduction - Creativity vs Design
(00:08:39) AI-Assisted Experiences
(00:10:25) Automation & Augmentation
(00:15:27) Pixels to Objects to Concepts
(00:17:58) Sponsors: Incogni | Shopify
(00:20:40) Concept-Level Interfaces
(00:23:35) The Future of Design
(00:29:39) Human Element in Design
(00:32:49) AI Talking to AI
(00:35:52) Sponsors: Oracle Cloud Infrastructure (OCI) | Brave
(00:38:04) Purpose-Specific AI Experiences
(00:45:29) GPT-4 Image Editing
(00:51:17) Graduated Approach to Launch
(00:55:09) Fine-Tuning GPT-4
(00:59:10) Cost & Latency
(01:01:29) Hiring AI Engineers
(01:05:02) Engineering Best Practices
(01:09:00) Inspiration in the AI Space
(01:18:28) The Gen AI Application Layer
(01:24:26) Outro
SOCIAL LINKS:
Website: https://www.cognitiverevolution.ai
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://www.linkedin.com/in/nathanlabenz/
Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast
Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk
In this episode of The Cognitive Revolution, we dive deep into frontier post-training techniques for large language models with Nathan Lambert from the Allen Institute for AI. Nathan discusses the groundbreaking Tulu 3 release, which matches Meta's post-training performance using the Llama base model. We explore supervised fine-tuning, preference-based reinforcement learning, and the innovative technique of reinforcement learning from verifiable rewards. Nathan provides unprecedented insights into the practical aspects of model development, compute requirements, and data generation strategies. This technically rich conversation illuminates previously opaque aspects of LLM development, with results achieved by a small team of 10-15 people. Join us for one of our most detailed and valuable discussions on state-of-the-art AI model development.
Check out Nathan Lambert's newsletter:
Be notified early when Turpentine drops a new publication: https://www.turpentine.co/exclusiveaccess
SPONSORS:
Incogni: Take your personal data back with Incogni! Use code REVOLUTION at the link below and get 60% off an annual plan: https://incogni.com/revolution
Notion: Notion offers powerful workflow and automation templates, perfect for streamlining processes and laying the groundwork for AI-driven automation. With Notion AI, you can search across thousands of documents from various platforms, generating highly relevant analysis and content tailored just for you - try it for free at https://notion.com/cognitiverevolution
Shopify: Shopify is the world's leading e-commerce platform, offering a market-leading checkout system and exclusive AI apps like Quikly. Nobody does selling better than Shopify. Get a $1 per month trial at https://shopify.com/cognitive
Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive
80,000 Hours: 80,000 Hours offers free one-on-one career advising for Cognitive Revolution listeners aiming to tackle global challenges, especially in AI. They connect high-potential individuals with experts, opportunities, and personalized career plans to maximize positive impact. Apply for a free call at https://80000hours.org/cognitiverevolution to accelerate your career and contribute to solving pressing AI-related issues.
RECOMMENDED PODCAST:
Unpack Pricing - Dive into the dark arts of SaaS pricing with Metronome CEO Scott Woody and tech leaders. Learn how strategic pricing drives explosive revenue growth in today's biggest companies like Snowflake, Cockroach Labs, Dropbox and more.
Apple: https://podcasts.apple.com/us/podcast/id1765716600
Spotify: https://open.spotify.com/show/38DK3W1Fq1xxQalhDSueFg
CHAPTERS:
(00:00:00) Teaser
(00:00:59) Sponsors: Incogni
(00:02:20) About the Episode
(00:05:56) Introducing AI2
(00:09:56) Tulu: Deep Dive (Part 1)
(00:17:43) Sponsors: Shopify | Oracle Cloud Infrastructure (OCI)
(00:20:38) Open vs. Closed Recipes
(00:29:48) Compute & Value (Part 1)
(00:34:22) Sponsors: 80,000 Hours | Notion
(00:37:02) Compute & Value (Part 2)
(00:42:41) Model Weight Evolution
(00:53:16) DPO vs. PPO
(01:06:36) Project Trajectory
(01:20:39) Synthetic Data & LLM Judge
(01:27:39) Verifiable RL
(01:38:17) Advice for Practitioners
(01:44:01) Open Source vs. Closed
(01:49:18) Outro