The Cloudcast

Massive Studios

The Cloudcast (@cloudcastpod) is the industry's #1 Cloud Computing podcast, and the place where Cloud meets AI. Co-hosts Aaron Delp (@aarondelp) & Brian Gracely (@bgracely) speak with technology and business leaders who are shaping the future of business. Topics include Cloud Computing | AI | AGI | ChatGPT | Open Source | AWS | Azure | GCP | Platform Engineering | DevOps | Big Data | ML | Security | Kubernetes | AppDev | SaaS | PaaS.

  • 35 minutes 13 seconds
    The Zero-CVE Mirage: Hardening Software in the Age of AI Attacks

    SUMMARY: How software development is rapidly evolving in the age of AI and automation. Matt Moore shares how his team is rethinking secure software supply chains, scaling infrastructure, and safely integrating AI agents into development workflows.

    GUEST: Matt Moore, CTO at Chainguard 

    SHOW: 1022

    SHOW TRANSCRIPT: The Reasoning Show #1022 Transcript

    SHOW VIDEO: https://youtu.be/9Q0kWkTYRs8

    SHOW SPONSORS:

    SHOW NOTES:


    Scaling Challenges & “Factory” Evolution

    • Early automation relied on tools like GitHub Actions
    • At scale, simple systems broke due to:
      • Massive event volumes
      • API rate limits (e.g., GitHub quotas)
      • Exponential fan-out effects
    • Key innovation: custom work queue + reconciliation model
      • ~90% event deduplication
      • Controlled throughput and backpressure
      • Improved reliability and system stability
    • Introduced Driftless 
    • Built on reconciliation principles (inspired by Kubernetes):
      • Compare desired vs. actual state
      • Continuously reconcile differences
    • Benefits:
      • Resilience to missed events
      • Automatic retries and recovery
      • Scales better than purely event-driven systems
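
    The queue-plus-reconciliation design described above can be sketched roughly as follows. This is an illustrative sketch only: the names are hypothetical, and Driftless's actual implementation is not shown here.

```python
class DedupQueue:
    """Work queue keyed by item: repeated events for the same key collapse
    into one pending entry, so event bursts don't fan out into duplicate work."""

    def __init__(self):
        self._pending = {}

    def put(self, key, event):
        self._pending[key] = event  # a newer event replaces an older duplicate

    def drain(self):
        items, self._pending = self._pending, {}
        return items


def reconcile_once(desired, actual, apply_change):
    """One reconciliation pass: find keys whose actual state has drifted from
    the desired state and correct them. A missed event is harmless, because
    the next pass observes the same drift and retries."""
    drifted = [k for k, want in desired.items() if actual.get(k) != want]
    for key in drifted:
        apply_change(key, desired[key])
    return drifted
```

    In a real system this pass runs on a timer or on drained queue entries, which is what provides the automatic retries and resilience to missed events noted above.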

    AI Agents in Software Development

    • AI is dramatically accelerating development workflows
    • Chainguard uses agents to:
      • Remediate vulnerabilities (CVEs)
      • Update dependencies
      • Fix failing tests and adapt to upstream changes

    Key Design Philosophy

    • Least privilege → “least tool call”
      • Avoid giving agents full system access
      • Provide narrowly scoped tools for specific tasks
    • Delegate execution to sandboxed systems (e.g., CI pipelines)
    • Focus on safe, controlled automation
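
    As a rough illustration of "least tool call" (hypothetical names, not Chainguard's actual tooling): rather than handing an agent shell access, expose only an allow-list of narrowly scoped tools.

```python
# Hypothetical sketch: the agent may only invoke tools on an explicit
# allow-list, each scoped to a single task, instead of arbitrary commands.
ALLOWED_TOOLS = {
    "bump_dependency": lambda pkg, version: f"opened PR: bump {pkg} to {version}",
    "rerun_failing_test": lambda test: f"re-ran {test} in sandboxed CI",
}

def call_tool(name, **kwargs):
    """Dispatch an agent's tool call, rejecting anything outside the allow-list."""
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {name!r} is not available to this agent")
    return ALLOWED_TOOLS[name](**kwargs)
```

    Anything risky (builds, test runs) is then delegated to a sandboxed system such as a CI pipeline, keeping the agent itself unprivileged.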

    Industry Shift: Velocity vs. Security

    • Explosion of AI-driven tools (e.g., autonomous PR generation)
    • Massive increase in development velocity
    • New risks:
      • Poorly secured agent frameworks
      • Malicious or unsafe automation patterns

    Key Takeaways

    1. Scale changes everything
      • Simple systems break under massive workloads
      • Purpose-built infrastructure becomes necessary
    2. Reconciliation > pure event-driven systems at scale
      • More resilient, predictable, and controllable
    3. AI is a force multiplier—but requires guardrails
      • Unrestricted agents introduce serious risk
      • Constrained, purpose-built agents are safer and more effective
    4. Continuous learning is mandatory
      • AI tooling is evolving too fast for static skillsets
      • Teams must actively experiment and adapt

    FEEDBACK?

    26 April 2026, 5:00 am
  • 25 minutes 23 seconds
    The Grid’s Breaking Point: Can AI Save the Infrastructure It’s About to Crash?

    SUMMARY: How real-time power flow optimization at the edge is helping data centers and the electrical grid handle surging AI energy demands more efficiently. By unlocking hidden capacity and dynamically managing power systems, we explain how existing infrastructure can support significantly more compute without massive new buildouts.

    GUEST: Marissa Hummon, CTO at Utilidata

    SHOW: 1021

    SHOW TRANSCRIPT: The Reasoning Show #1021 Transcript

    SHOW VIDEO: https://youtu.be/ItcpU8UjOFE

    SHOW SPONSORS:

    SHOW NOTES:

    KEY TOPICS:

    • Differences between grid power dynamics vs. AI workloads
    • Edge AI for real-time power flow optimization
    • Unlocking stranded capacity in existing infrastructure
    • “4-to-make-3” vs. “4-to-make-4” data center design
    • AI training vs. inference power consumption patterns
    • Role of NVIDIA-powered edge compute modules
    • Grid modernization and coordination with utilities
    • Security and resilience in critical infrastructure

    KEY MOMENTS:

    • From centralized AI models to edge-based decision-making
    • Defining efficiency: utilization vs. thermal performance
    • Why AI workloads aren’t as constant as they seem
    • NVIDIA partnership and edge compute in power systems
    • Using redundancy to increase usable capacity
    • Increasing density of AI compute and hidden capacity
    • Data center vs. utility responsibilities
    • Addressing data center bottlenecks and scaling challenges
    • Customer landscape: hyperscalers to enterprise
    • Security, resilience, and critical infrastructure

    KEY INSIGHTS:

    • AI workloads are dynamic, not constant: Training and inference create fluctuating power demands that can be optimized.
    • Edge intelligence is critical: Real-time sensing and decision-making at the edge unlock efficiency gains not possible with centralized models.
    • Hidden capacity exists: Many data centers have up to 2x unused power capacity due to lack of visibility and control.
    • Software-defined power is the future: Faster control loops allow systems to safely exceed traditional design limits.
    • Efficiency = utilization: The biggest gains come from better use of existing infrastructure, not just improving hardware efficiency.

    TAKEAWAYS:

    • AI infrastructure growth is as much an energy challenge as a compute challenge
    • Real-time, edge-based control systems are key to scaling sustainably
    • Existing grid and data center investments can go further with smarter orchestration
    • The future of AI scaling depends on aligning compute innovation with energy intelligence

    FEEDBACK?

    22 April 2026, 5:00 am
  • 29 minutes 23 seconds
    Shadow AI is Faster Than Your Governance: Why Guardrails are Failing

    SUMMARY: Shadow AI is growing much faster than known AI adoption across businesses. How can IT teams get Shadow AI under control?

    GUEST: Uri Haramati, CEO at Torii

    SHOW: 1020

    SHOW TRANSCRIPT: The Reasoning Show #1020 Transcript

    SHOW VIDEO: https://youtu.be/AUrh_xICPzM

    SHOW SPONSORS:

    SHOW NOTES:


    Topic 1 - Welcome to the show. Tell us about your background and your focus at Torii. 

    Topic 2 - Is Shadow AI really a security problem—or is it a product-market fit problem inside the enterprise?

    Topic 3 - Why does Shadow AI spread faster—and become more dangerous—than traditional Shadow IT?

    Topic 4 - What’s the first signal a company should look for to know Shadow AI is already happening?

    Topic 5 - How do you balance visibility vs. control without killing the productivity gains that drove Shadow AI in the first place?

    Topic 6 - How should organizations rethink ‘data loss prevention’ in a world where the leak is a prompt, not a file?

    Topic 7 - What does a ‘well-governed’ AI environment actually look like in practice—day-to-day for an employee?

    Topic 8 - Do you think Shadow AI ever fully goes away—or does it become a permanent operating model that companies need to design around?

    FEEDBACK?

    19 April 2026, 5:00 am
  • 33 minutes 19 seconds
    The Junior Dev Crisis: Who Inherits the Code When AI Does the Work?

    SUMMARY: Have we reached a point where coding is a solved problem? And if so, what are the downstream effects on companies that need software to differentiate their business?

    GUEST: Brandon Whichard, Co-Host of Software Defined Talk

    SHOW: 1019

    SHOW TRANSCRIPT: The Reasoning Show #1019 Transcript

    SHOW VIDEO: https://youtu.be/q0mksIKcBzk

    SHOW SPONSORS:

    SHOW NOTES:

    [Via ChatGPT]  A useful way to think about it:

    • Typing code → mostly commoditized
    • Designing systems → partially assisted
    • Owning outcomes → still very human

    Topic 1 - How many years into Public Cloud did we assume that Cloud had solved the IT problem? 

    Topic 2 - Developers - what are we solving for?

    • 10% of time coding, mostly on the last 10-15% 
    • Lots of time in planning meetings (decoding requirements, resource planning, updates, etc.)
    • Decent amount of time fixing, troubleshooting, technical debt reduction

    Topic 2a - Business people have unlimited ideas, and most ideas are money + tech

    • What would be their interface to problem solving without developers? (or is this just a shift to consultants?)
    • Is this a massive opportunity for a great PaaS 3.0 company (e.g. is Vercel an example?)

    Topic 3 - [Hypothetical] Let’s assume a fairly normal company fired all their software developers tomorrow. How long before they could get a moderately complex new application or integration into production? 

    Topic 4 - Nobody likes to work on legacy code - missing source, missing engineers, etc. What do we call any code written by AI that was abandoned within the last 6-12 months? 

    FEEDBACK?

    15 April 2026, 5:00 am
  • 28 minutes 42 seconds
    RAG Won’t Save Your Messy Data: The Brutal Truth About AI Reliability

    SUMMARY: The RAG (Retrieval Augmented Generation) pattern is one of the most frequently used to augment LLMs with context-specific information. Let’s explore RAG. 

    GUEST: Roie Schwaber-Cohen, Head of Developer Relations at Pinecone

    SHOW: 1018

    SHOW TRANSCRIPT: The Reasoning Show #1018 Transcript

    SHOW VIDEO: https://youtu.be/-kZZEMR341Q

    SHOW SPONSORS:

    SHOW NOTES:

    Topic 1 - Welcome to the show. Tell us a little bit about your background, and what you focus on these days at Pinecone 

    Topic 2 - Let’s begin by talking about RAG systems. What are they? Why do companies choose to use them? What benefits do they provide in AI systems?

    Topic 3 - At a high level, RAG sounds straightforward—retrieve relevant context, generate an answer. But in practice, where does it break first as systems scale?
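
    At its simplest, the retrieve-then-generate loop named in Topic 3 can be sketched like this. It's a toy illustration: the word-overlap retriever stands in for a real embedding model plus vector database (such as Pinecone), and the assembled prompt would go to an LLM that isn't shown.

```python
def retrieve(query, docs, k=2):
    """Toy retriever: rank documents by word overlap with the query.
    Real RAG systems embed both sides and search a vector index instead."""
    q = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return ranked[:k]

def build_prompt(query, context):
    """Augment the user's question with the retrieved context for the LLM."""
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{ctx}\n\nQuestion: {query}"
```

    The scaling failures discussed later in the episode tend to live in the retrieval step: if the wrong or stale context comes back, the generation step confidently answers from it anyway.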

    Topic 4 - I’ve heard that RAG systems can return answers that are technically correct but fundamentally wrong. What’s a concrete example of that happening in production—and why does it slip past most teams?

    Topic 5 - In traditional systems, we assume there’s a single source of truth. But in enterprise environments, ‘truth’ is often versioned, contextual, and conflicting. How should teams rethink ‘truth’ when building AI systems?

    Topic 6 - A lot of teams assume their knowledge base is ‘good enough’ for RAG. What do they usually underestimate about the messiness of real enterprise data?

    Topic 7 - There’s a growing narrative that better reasoning models can compensate for weaker retrieval. From what you’ve seen, where does that idea fall apart?

    Topic 8 - If correctness depends on things like timing, policy scope, or configuration, how should teams design systems that understand context—not just content?

    Topic 9 - Looking ahead, what replaces today’s RAG architectures? What patterns are emerging among teams that are actually getting this right?


    FEEDBACK?

    12 April 2026, 5:00 am
  • 33 minutes 50 seconds
    The Productivity Paradox: Why More AI Code is Slowing Down Shiptimes

    SUMMARY:  Discover how AI is transforming software development and what it means for engineering leaders. 

    GUEST: Jeff Keyes, Field CTO at AllStacks 

    SHOW: 1017

    SHOW TRANSCRIPT: The Reasoning Show #1017 Transcript

    SHOW VIDEO: https://youtu.be/cXPu8iWeB0k

    SHOW SPONSORS:

    SHOW NOTES:

    Topic 1 - Welcome to the show. Tell us a little bit about your background, and what you focus on these days at AllStacks. 

    Topic 2 - You’ve been talking to a lot of engineering leaders using AI coding tools—what’s the most surprising gap you’re seeing between increased code generation and actual delivery outcomes?

    Topic 3 - Why does increasing developer output with AI often lead to more debugging, duplication, or cleanup instead of faster delivery?

    Topic 4 - You’ve described an ‘invisible rework loop’—can you walk us through what that looks like inside a modern engineering team?

    Topic 5 - As code generation gets easier, where does the real bottleneck shift in the software delivery lifecycle?

    Topic 6 - How do unclear product or engineering specifications get amplified in an AI-assisted development environment?

    Topic 7 - If traditional metrics like lines of code or velocity are becoming misleading, what should engineering leaders actually measure to know if AI is improving delivery?

    Topic 8 - What does a ‘healthy’ AI-assisted development workflow look like 12–18 months from now?


    FEEDBACK?

    8 April 2026, 5:00 am
  • 32 minutes 33 seconds
    The Production Chaos: Why AI-Generated Code is Breaking Traditional SRE

    SUMMARY: With the explosion of AI-generated code and applications, the modern SRE requires an AI-native approach to managing complex systems. 

    GUEST: Anish Agarwal, CEO/Co-founder of Traversal

    SHOW: 1016

    SHOW TRANSCRIPT: The Reasoning Show #1016 Transcript

    SHOW VIDEO: https://youtu.be/hF3MCRDhMno

    SHOW SPONSORS:

    SHOW NOTES:

    Topic 1 - Welcome to the show. Tell us a little bit about your background, and what you focus on these days at Traversal. 

    Topic 2 - AI is dramatically accelerating code generation, but not improving production outcomes. What’s fundamentally breaking in the traditional SRE model—and where do you see the biggest friction between speed and reliability?

    Topic 3 - What are the most common failure patterns or mistakes you’re seeing in production from AI-generated code—and what’s driving them?

    Topic 4 - AI can generate functional code, but it often lacks context about how systems behave in production. How is this changing what ‘good observability’ needs to look like?

    Topic 5 - How do you see SRE evolving in an AI-first world? Does it become more automated, more policy-driven, or even partially autonomous?

    Topic 6 - For organizations that want to embrace AI-assisted development but avoid production chaos, what are the most important guardrails they should put in place?

    Topic 7 - If we fast-forward 2–3 years, what does a ‘modern’ production stack look like in a world where most code is AI-generated? What capabilities become absolutely essential? In one sentence—what’s the #1 thing a CTO should do right now?

    FEEDBACK?

    5 April 2026, 5:00 am
  • 34 minutes 6 seconds
    The Future of Service belongs to Self-Improving AI

    SUMMARY:  Today’s episode is all about a transformation happening in customer service—one that’s moving us from static systems and scripted workflows into something far more dynamic: AI systems that can actually learn and improve over time.

    GUEST: Shashi Upadhyay (President of Product, Engineering, and AI at Zendesk)

    SHOW: 1015

    SHOW TRANSCRIPT: The Reasoning Show #1015 Transcript

    SHOW VIDEO: https://youtu.be/IQaxE-DjIpo

    SHOW SPONSORS:

    SHOW NOTES:

    Topic 1 - Welcome to the show. Tell us a bit about your background and your focus today. 

    Topic 2 - You describe this moment as a shift from systems of record to intelligent systems of action. What’s fundamentally broken in today’s customer service model that’s forcing this transition now? What changed in the last 2–3 years to make this possible?

    Topic 3 - There’s been a lot of AI in customer service that overpromised and underdelivered. What are the biggest gaps between what customers actually need—like resolution—and what legacy automation has been delivering?

    Topic 4 - The concept of a “self-improving” system is really powerful. What’s actually new here—what enables AI to improve with every interaction without constant human tuning?

    Topic 5 - You’ve moved from assistive copilots to what you call “agentic AI” that can resolve issues end-to-end. Where are we today on that journey—and what still requires human involvement?

    Topic 6 - Voice has historically been one of the hardest channels to automate. What changes with this new generation of AI that makes even complex, multi-step voice interactions solvable?

    Topic 7 - If we fast-forward 2–3 years, what does a “best-in-class” customer service experience look like in an AI-first world?

    FEEDBACK?

    1 April 2026, 5:00 am
  • 39 minutes 56 seconds
    The $26B Pivot: Why Big Tech is Abandoning the AI "Wrapper" Model

    SUMMARY:  Brian (@bgracely) and Brandon Whichard (@bwhichard, Software Defined Talk and Failover Media) discuss the biggest AI news stories from the month of March, 2026. 

    SHOW: 1014

    SHOW TRANSCRIPT: The Reasoning Show #1014 Transcript

    SHOW VIDEO: https://youtu.be/XwyAC-hxOQY

    SHOW SPONSORS:

    • VENTION - Ready for expert developers who actually deliver?
      Visit ventionteams.com

    SHOW NOTES:

    FEEDBACK?

    29 March 2026, 5:00 am
  • 36 minutes 56 seconds
    Living the Claude-centric Life

    SUMMARY: With @bwhichard, we dig into how daily work-life changes when you make @AnthropicAI @claudeai the center of all workflow activities. 

    SHOW: 1013

    SHOW TRANSCRIPT: The Reasoning Show #1013 Transcript

    SHOW VIDEO: https://youtu.be/zEmEH0t67js

    SHOW SPONSORS:

    • VENTION - Ready for expert developers who actually deliver?
      Visit ventionteams.com

    SHOW NOTES:

    Topic 1 - How long have you been living the Claude-life, and when did it dawn on you to make this central to your day-to-day activities? 

    Topic 2 - What were the biggest hurdles you had to overcome before you trusted the system and started letting it have ownership over tasks and workflows?

    Topic 3 - What are some of your best practices in terms of machine setup, how or where you store data, how you decide what to give it access to? Walk me through your thoughts around things like keeping things simple, where to be complex, how you think about security, etc.

    Topic 4 - How are you learning to give it more responsibilities, or just figure out new ways to be productive with it? 

    • Good resources you’re pulling from? 
    • Any tips to make it use fewer tokens?
    • Skills marketplaces?

    Topic 5 - What have been some of the biggest barriers to successful adoption, or just areas where you’re still struggling to get it to do the things you want? Or are you still in the learning curve stage, where new skills just keep building on one another?

    Topic 6 - If you took the knowledge and skills you have now in Claude-life into your day-job, how do you see yourself working, as well as working with the rest of your team/teams? Would it bother you if you didn’t think they were using AI tools as much? 


    FEEDBACK?

    25 March 2026, 5:00 am
  • 28 minutes
    NVIDIA’s Open Software Trap: The Real Cost of the New Inference Stack

    SUMMARY: We dig into the NVIDIA GTC keynote and highlight three things - accelerated computing for everything, the complexity of the new inference stack, and NVIDIA’s “open” software stack including NemoClaw.

    SHOW: 1012

    SHOW TRANSCRIPT: The Reasoning Show #1012 Transcript

    SHOW VIDEO: https://youtu.be/aXOr91q76yM

    SHOW SPONSORS:

    • VENTION - Ready for expert developers who actually deliver?
      Visit ventionteams.com

    SHOW NOTES:


    Topic 1 - Jensen’s trying to paint the bigger picture of accelerated computing everywhere (robotics, autonomous driving, gen AI, physical AI - but also just everyday enterprise apps). Everything is about keeping the stock price up and margins high. The stock price provides the war chest to fight off all foes. 

    Topic 2 - The inference architecture is a complex mix of GPUs, CPUs, ASICs/LPUs, high-speed networking and seems very different from the training architecture. How big is the burden on data center providers? What are the inference alternatives emerging? 

    Topic 3 - Jensen talked a lot about OpenClaw and eventually about NVIDIA’s NemoClaw. How does his interest in agentic AI tie into his interest in building NVIDIA’s own frontier model?


    FEEDBACK?

    22 March 2026, 5:00 am