Data Engineering Podcast

Tobias Macey

This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

  • 26 minutes 59 seconds
    From Data Engineering to AI Engineering: Where the Lines Blur
    Summary 
    In this solo episode of the Data Engineering Podcast, host Tobias Macey reflects on how AI has transformed the practice and pace of data engineering over time. Starting from its origins in the Hadoop and cloud warehouse era, he explores the discipline's evolution through ML engineering and MLOps to today's blended boundaries between data, ML, and AI engineering. The conversation covers how unstructured data is becoming more prominent, vectors and knowledge graphs are emerging as key components, and reliability expectations are changing due to interactive user-facing AI. The host also delves into process changes, including tighter collaboration, faster dataset onboarding, new governance and access controls, and the importance of treating experimentation and evaluation as fundamental testing practices. 

    Announcements 
    • Hello and welcome to the Data Engineering Podcast, the show about modern data management
    • Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
    • Composable data infrastructure is great, until you spend all of your time gluing it together. Bruin is an open source framework, driven from the command line, that makes integration a breeze. Write Python and SQL to handle the business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. Bruin allows you to build end-to-end data workflows using AI, has connectors for hundreds of platforms, and helps data teams deliver faster. Teams that use Bruin need less engineering effort to process data and benefit from a fully integrated data platform. Go to dataengineeringpodcast.com/bruin today to get started. And for dbt Cloud customers, they'll give you $1,000 credit to migrate to Bruin Cloud.
    • Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details. 
    • You’re a developer who wants to innovate—instead, you’re stuck fixing bottlenecks and fighting legacy code. MongoDB can help. It’s a flexible, unified platform that’s built for developers, by developers. MongoDB is ACID compliant, Enterprise-ready, with the capabilities you need to ship AI apps—fast. That’s why so many of the Fortune 500 trust MongoDB with their most critical workloads. Ready to think outside rows and columns? Start building at MongoDB.com/Build
    • Your host is Tobias Macey and today I'm reflecting on the increasingly blurry boundaries between data engineering and AI engineering
    Interview
    • Introduction
    • I started this podcast in 2017, right when the term "Data Engineer" was becoming widely used for a specific job title with a reasonably well-understood set of responsibilities. This was in response to the massive hype around "data science" and consequent hiring sprees that characterized the mid-2000s to mid-2010s. The introduction of generative AI and AI Engineering to the technical ecosystem is changing the scope of responsibilities for data engineers and other data practitioners. Of note is the fact that:
    • AI models can be used to process unstructured data sources into structured data assets (see the sketch after this list)
    • AI applications require new types of data assets
    • The SLAs for data assets related to AI serving are different from BI/warehouse use cases
    • The technology stacks for AI applications aren't necessarily the same as for analytical data pipelines
    • Because everything is so new, there is not a lot of prior art, and the prior art that does exist isn't necessarily easy to find because of differences in terminology
    • Experimentation has moved from being just an MLOps capability into being a core need for organizations
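    As a rough illustration of the first point in the list above, here is a minimal Python sketch of using an LLM to turn unstructured text into a structured data asset. The call_llm helper, the SupportTicket fields, and the prompt wording are all hypothetical placeholders rather than anything discussed in the episode:

      import json
      from dataclasses import dataclass

      @dataclass
      class SupportTicket:  # hypothetical target schema
          customer: str
          product: str
          sentiment: str

      def call_llm(prompt: str) -> str:
          """Hypothetical stand-in for any LLM completion API."""
          raise NotImplementedError

      def extract_ticket(raw_email: str) -> SupportTicket:
          # Ask the model for JSON matching the schema, then validate it
          # by constructing the dataclass, which fails loudly on drift.
          prompt = (
              "Extract customer, product, and sentiment from this email "
              "and reply with only a JSON object:\n" + raw_email
          )
          return SupportTicket(**json.loads(call_llm(prompt)))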
    Contact Info
    • Email
    Parting Question
    • From your perspective, what is the biggest gap in the tooling or technology for data management today?
    Links
    The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
    14 December 2025, 9:20 pm
  • 58 minutes 48 seconds
    Malloy: Hierarchical Data, Semantic Models, and the Future of Analytics
    Summary 
    In this episode Michael Toy, co-creator of Malloy, talks about rethinking how we work with data beyond SQL. Michael shares the origins of Malloy from his and Lloyd Tabb’s experience at Looker, why SQL’s mental model often fights human problem solving, and how Malloy aims to be a composable, maintainable language that treats SQL as the assembly layer rather than something humans should write. He explores Malloy’s core ideas — semantic modeling tightly coupled with a query language, hierarchical data as the default mental model, and preserving context so analysis stays interactive and open-ended. He also digs into the developer experience and ecosystem: Malloy’s TypeScript implementation, VS Code integration, CLI, emerging notebook support, and how Malloy can sit alongside or replace parts of existing transformation workflows. Michael discusses practical trade-offs in language design, the surprising fit for LLM-generated queries, and near-term roadmap areas like dimensional filtering, better aggregation strategies across levels, and closing gaps that still require escaping to SQL. He closes with an invitation to contribute to the open-source project and help shape its evolution. 

    Announcements 
    • Hello and welcome to the Data Engineering Podcast, the show about modern data management
    • Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
    • Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details. 
    • Composable data infrastructure is great, until you spend all of your time gluing it together. Bruin is an open source framework, driven from the command line, that makes integration a breeze. Write Python and SQL to handle the business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. Bruin allows you to build end-to-end data workflows using AI, has connectors for hundreds of platforms, and helps data teams deliver faster. Teams that use Bruin need less engineering effort to process data and benefit from a fully integrated data platform. Go to dataengineeringpodcast.com/bruin today to get started. And for dbt Cloud customers, they'll give you $1,000 credit to migrate to Bruin Cloud.
    • You’re a developer who wants to innovate—instead, you’re stuck fixing bottlenecks and fighting legacy code. MongoDB can help. It’s a flexible, unified platform that’s built for developers, by developers. MongoDB is ACID compliant, Enterprise-ready, with the capabilities you need to ship AI apps—fast. That’s why so many of the Fortune 500 trust MongoDB with their most critical workloads. Ready to think outside rows and columns? Start building at MongoDB.com/Build
    • Your host is Tobias Macey and today I'm interviewing Michael Toy about Malloy, a modern language for building composable and maintainable analytics and data models on relational engines

    Interview
    • Introduction
    • How did you get involved in the area of data management?
    • Can you describe what Malloy is and the story behind it?
      • What is the core problem that you are trying to solve with Malloy?
    • There are countless projects that aim to reimagine/reinvent/replace SQL. What are the factors that make Malloy stand out in your mind?
    • Who are the target personas for the Malloy language?
    • One of the key success factors for any language is the ecosystem around it and the integrations available to it. How does Malloy fit in the toolchains and workflows for data engineers and analysts?
    • Can you describe the key design and syntax elements of Malloy?
      • How have the scope and focus of the language evolved since you first started working on it?
    • How do the structure and semantics of Malloy change the ways that teams think about their data models?
    • SQL-focused tools have gained prominence as the means of building the transformation stage of data pipelines. How would you characterize the capabilities of Malloy as a tool for building transformation pipelines?
    • What are the most interesting, innovative, or unexpected ways that you have seen Malloy used?
    • What are the most interesting, unexpected, or challenging lessons that you have learned while working on Malloy?
    • When is Malloy the wrong choice?
    • What do you have planned for the future of Malloy?

    Contact Info
    Parting Question
    • From your perspective, what is the biggest gap in the tooling or technology for data management today?
    Links

    The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
    8 December 2025, 12:41 am
  • 1 hour 57 seconds
    Blurring Lines: Data, AI, and the New Playbook for Team Velocity
    Summary
    In this crossover episode, Max Beauchemin explores how multiplayer, multi‑agent engineering is transforming the way individuals and teams build data and AI systems. He digs into the shifting boundary between data and AI engineering, the rise of “context as code,” and how just‑in‑time retrieval via MCP and CLIs lets agents gather what they need without bloating context windows. Max shares hard‑won practices from going “AI‑first” for most tasks, where humans focus on orchestration and taste, and the new bottlenecks that appear — code review, QA, async coordination — when execution accelerates 2–10x. He also dives deep into Agor, his open‑source agent orchestration platform: a spatial, multiplayer workspace that manages Git worktrees and live dev environments, templatizes prompts by workflow zones, supports session forking and sub‑sessions, and exposes an internal MCP so agents can schedule, monitor, and even coordinate other agents.
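    To make the "just-in-time retrieval via MCP" idea concrete, here is a minimal sketch of an MCP tool server using the official Python SDK's FastMCP helper. The get_table_schema tool and its hard-coded lookup are illustrative assumptions, not part of Agor itself:

      from mcp.server.fastmcp import FastMCP

      mcp = FastMCP("warehouse-context")

      @mcp.tool()
      def get_table_schema(table_name: str) -> str:
          """Return column names/types for a single table, fetched on
          demand so the agent never preloads the whole catalog into
          its context window."""
          # Hypothetical lookup; a real server would query the warehouse.
          schemas = {"orders": "order_id INT, customer_id INT, total NUMERIC"}
          return schemas.get(table_name, f"unknown table: {table_name}")

      if __name__ == "__main__":
          mcp.run()  # serves over stdio for agent clients by default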

    Announcements
    • Hello and welcome to the Data Engineering Podcast, the show about modern data management
    • Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
    • Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
    • Composable data infrastructure is great, until you spend all of your time gluing it together. Bruin is an open source framework, driven from the command line, that makes integration a breeze. Write Python and SQL to handle the business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. Bruin allows you to build end-to-end data workflows using AI, has connectors for hundreds of platforms, and helps data teams deliver faster. Teams that use Bruin need less engineering effort to process data and benefit from a fully integrated data platform. Go to dataengineeringpodcast.com/bruin today to get started. And for dbt Cloud customers, they'll give you $1,000 credit to migrate to Bruin Cloud.
    • Your host is Tobias Macey and today I'm interviewing Maxime Beauchemin about the impact of multi-player multi-agent engineering on individual and team velocity for building better data systems
    Interview
    • Introduction
    • How did you get involved in the area of data management?
    • Can you start by giving an overview of the types of work that you are relying on AI development agents for?
    • As you bring agents into the mix for software engineering, what are the bottlenecks that start to show up?
    • In my own experience there are a finite number of agents that I can manage in parallel. How does Agor help to increase that limit?
    • How does making multi-agent management a multi-player experience change the dynamics of how you apply agentic engineering workflows?
    Contact Info
    Links
    The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
    24 November 2025, 12:51 am
  • 51 minutes 46 seconds
    State, Scale, and Signals: Rethinking Orchestration with Durable Execution
    Summary 
    In this episode Preeti Somal, EVP of Engineering at Temporal, talks about the durable execution model and how it reshapes the way teams build reliable, stateful systems for data and AI. She explores Temporal’s code‑first programming model—workflows, activities, task queues, and replay—and how it eliminates hand‑rolled retry, checkpoint, and error‑handling scaffolding while letting data remain where it lives. Preeti shares real-world patterns for replacing DAG-first orchestration, integrating application and data teams through signals and Nexus for cross-boundary calls, and using Temporal to coordinate long-running, human-in-the-loop, and agentic AI workflows with full observability and auditability. She also discusses heuristics for choosing Temporal alongside (or instead of) traditional orchestrators, managing scale without moving large datasets, and lessons from running durable execution as a cloud service.
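    For listeners new to Temporal, here is a minimal sketch of the workflow/activity split described above, using Temporal's open-source Python SDK. The backfill use case and all names are illustrative, not an example from the episode:

      from datetime import timedelta
      from temporalio import activity, workflow
      from temporalio.common import RetryPolicy

      @activity.defn
      async def load_partition(partition: str) -> int:
          # Real I/O (queries, API calls) belongs in activities.
          return 0  # placeholder: rows loaded

      @workflow.defn
      class BackfillWorkflow:
          @workflow.run
          async def run(self, partitions: list[str]) -> int:
              total = 0
              for p in partitions:
                  # Each completed call is journaled; after a crash,
                  # replay resumes here with no hand-rolled checkpoints
                  # or retry scaffolding.
                  total += await workflow.execute_activity(
                      load_partition,
                      p,
                      start_to_close_timeout=timedelta(minutes=5),
                      retry_policy=RetryPolicy(maximum_attempts=3),
                  )
              return total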

    Announcements 
    • Hello and welcome to the Data Engineering Podcast, the show about modern data management
    • Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
    • Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details. 
    • Composable data infrastructure is great, until you spend all of your time gluing it together. Bruin is an open source framework, driven from the command line, that makes integration a breeze. Write Python and SQL to handle the business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. Bruin allows you to build end-to-end data workflows using AI, has connectors for hundreds of platforms, and helps data teams deliver faster. Teams that use Bruin need less engineering effort to process data and benefit from a fully integrated data platform. Go to dataengineeringpodcast.com/bruin today to get started. And for dbt Cloud customers, they'll give you $1,000 credit to migrate to Bruin Cloud.
    • Your host is Tobias Macey and today I'm interviewing Preeti Somal about how to incorporate durable execution and state management into AI application architectures

    Interview
    • Introduction
    • How did you get involved in the area of data management?
    • Can you describe what durable execution is and how it impacts system architecture?
    • With the strong focus on state maintenance and high reliability, what are some of the most impactful ways that data teams are incorporating tools like Temporal into their work?
    • One of the core primitives in Temporal is a "workflow". How does that compare to similar primitives in common data orchestration systems such as Airflow, Dagster, Prefect, etc.?
      • What are the heuristics that you recommend when deciding which tool to use for a given task, particularly in data/pipeline oriented projects?
    • Even if a team is using a more data-focused orchestration engine, what are some of the ways that Temporal can be applied to handle the processing logic of the actual data?
    • AI applications are also very dependent on reliable data to be effective in production contexts. What are some of the design patterns where durable execution can be integrated into RAG/agent applications?
    • What are some of the conceptual hurdles that teams experience when they are starting to adopt Temporal or other durable execution frameworks?
    • What are the most interesting, innovative, or unexpected ways that you have seen Temporal/durable execution used for data/AI services?
    • What are the most interesting, unexpected, or challenging lessons that you have learned while working on Temporal?
    • When is Temporal/durable execution the wrong choice?
    • What do you have planned for the future of Temporal for data and AI systems?

    Contact Info
    Parting Question
    • From your perspective, what is the biggest gap in the tooling or technology for data management today?

    Closing Announcements
    • Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
    • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
    • If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

    Links
    The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
    16 November 2025, 11:19 pm
  • 51 minutes 35 seconds
    The AI Data Paradox: High Trust in Models, Low Trust in Data
    Summary
    In this episode of the Data Engineering Podcast Ariel Pohoryles, head of product marketing for Boomi's data management offerings, talks about a recent survey of 300 data leaders on how organizations are investing in data to scale AI. He shares a paradox uncovered in the research: while 77% of leaders trust the data feeding their AI systems, only 50% trust their organization's data overall. Ariel explains why truly productionizing AI demands broader, continuously refreshed data with stronger automation and governance, and highlights the challenges posed by unstructured data and vector stores. The conversation covers the need to shift from manual reviews to automated pipelines, the resurgence of metadata and master data management, and the importance of guardrails, traceability, and agent governance. Ariel also predicts a growing convergence between data teams and application integration teams and advises leaders to focus on high-value use cases, aggressive pipeline automation, and cataloging and governing the coming sprawl of AI agents, all while using AI to accelerate data engineering itself.

    Announcements
    • Hello and welcome to the Data Engineering Podcast, the show about modern data management
    • Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
    • Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
    • Composable data infrastructure is great, until you spend all of your time gluing it together. Bruin is an open source framework, driven from the command line, that makes integration a breeze. Write Python and SQL to handle the business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. Bruin allows you to build end-to-end data workflows using AI, has connectors for hundreds of platforms, and helps data teams deliver faster. Teams that use Bruin need less engineering effort to process data and benefit from a fully integrated data platform. Go to dataengineeringpodcast.com/bruin today to get started. And for dbt Cloud customers, they'll give you $1,000 credit to migrate to Bruin Cloud.
    • Your host is Tobias Macey and today I'm interviewing Ariel Pohoryles about data management investments that organizations are making to enable them to scale AI implementations
    Interview
    • Introduction
    • How did you get involved in the area of data management?
    • Can you start by describing the motivation and scope of your recent survey on data management investments for AI across your respondents?
      • What are the key takeaways that were most significant to you?
    • The survey reveals a fascinating paradox: 77% of leaders trust the data used by their AI systems, yet only half trust their organization's overall data quality. For our data engineering audience, what does this suggest about how companies are currently sourcing data for AI? 
      • Does it imply they are using narrow, manually-curated "golden datasets," and what are the technical challenges and risks of that approach as they try to scale?
    • The report highlights a heavy reliance on manual data quality processes, with one expert noting companies feel it's "not reliable to fully automate validation" for external or customer data. At the same time, maturity in "Automated tools for data integration and cleansing" is low, at only 42%. What specific technical hurdles or organizational inertia are preventing teams from adopting more automation in their data quality and integration pipelines?
    • There was a significant point made that with generative AI, "biases can scale much faster," making automated governance essential. From a data engineering perspective, how does the data management strategy need to evolve to support generative AI versus traditional ML models? 
      • What new types of data quality checks, lineage tracking, or monitoring for feedback loops are required when the model itself is generating new content based on its own outputs?
    • The report champions a "centralized data management platform" as the "connective tissue" for reliable AI. How do you see the scale and data maturity impacting the realities of that effort?
      • How do architectural patterns in the shape of cloud warehouses, lakehouses, data mesh, data products, etc. factor into that need for centralized/unified platforms?
    • A surprising finding was that a third of respondents have not fully grasped the risk of significant inaccuracies in their AI models if they fail to prioritize data management. In your experience, what are the biggest blind spots for data and analytics leaders?
    • Looking at the maturity charts, companies rate themselves highly on "Developing a data management strategy" (65%) but lag significantly in areas like "Automated tools for data integration and cleansing" (42%) and "Conducting bias-detection audits" (24%). If you were advising a data engineering team lead based on these findings, what would you tell them to prioritize in the next 6-12 months to bridge the gap between strategy and a truly scalable, trustworthy data foundation for AI?
    • The report states that 83% of companies expect to integrate more data sources for their AI in the next year. For a data engineer on the ground, what is the most important capability they need to build into their platform to handle this influx?
    • What are the most interesting, innovative, or unexpected ways that you have seen teams addressing the new and accelerated data needs for AI applications?
    • What are some of the noteworthy trends or predictions that you have for the near-term future of the impact that AI is having or will have on data teams and systems?
    Contact Info
    Parting Question
    • From your perspective, what is the biggest gap in the tooling or technology for data management today?
    Closing Announcements
    • Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
    • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
    • If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
    Links
    The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
    9 November 2025, 11:53 pm
  • 50 minutes 40 seconds
    Bridging the AI–Data Gap: Collect, Curate, Serve
    Summary
    In this episode of the Data Engineering Podcast Omri Lifshitz (CTO) and Ido Bronstein (CEO) of Upriver talk about the growing gap between AI's demand for high-quality data and organizations' current data practices. They discuss why AI accelerates both the supply and demand sides of data, highlighting that the bottleneck lies in the "middle layer" of curation, semantics, and serving. Omri and Ido outline a three-part framework for making data usable by LLMs and agents (collect, curate, serve) and share the challenges of scaling from POCs to production, including compounding error rates and reliability concerns. They also explore organizational shifts, patterns for managing context windows, pragmatic views on schema choices, and Upriver's approach to building autonomous data workflows using determinism and LLMs at the right boundaries. The conversation concludes with a look ahead to AI-first data platforms where engineers supervise business semantics while automation stitches technical details end-to-end.
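    The "compounding error rates" point is easy to quantify: under the simplifying assumption that each step of an agentic pipeline succeeds independently with probability p, a run of n steps succeeds with probability p**n. A quick Python illustration:

      # Per-step reliability compounds multiplicatively across a pipeline.
      for p in (0.99, 0.95, 0.90):
          for n in (5, 10, 20):
              print(f"p={p}, n={n}: end-to-end success = {p ** n:.2f}")
      # Even 95% per-step reliability yields only ~60% success across
      # 10 steps, which is why promising POCs stall on the way to production.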


    Announcements
    • Hello and welcome to the Data Engineering Podcast, the show about modern data management
    • Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
    • Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
    • Composable data infrastructure is great, until you spend all of your time gluing it together. Bruin is an open source framework, driven from the command line, that makes integration a breeze. Write Python and SQL to handle the business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. Bruin allows you to build end-to-end data workflows using AI, has connectors for hundreds of platforms, and helps data teams deliver faster. Teams that use Bruin need less engineering effort to process data and benefit from a fully integrated data platform. Go to dataengineeringpodcast.com/bruin today to get started. And for dbt Cloud customers, they'll give you $1,000 credit to migrate to Bruin Cloud.
    • Your host is Tobias Macey and today I'm interviewing Omri Lifshitz and Ido Bronstein about the challenges of keeping up with the demand for data when supporting AI systems
    Interview
    • Introduction
    • How did you get involved in the area of data management?
    • We're here to talk about "The Growing Gap Between Data & AI". From your perspective, what is this gap, and why do you think it's widening so rapidly right now?
    • How does this gap relate to the founding story of Upriver? What problems were you and your co-founders experiencing that led you to build this?
    • The core premise of new AI tools, from RAG pipelines to LLM agents, is that they are only as good as the data they're given. How does this "garbage in, garbage out" problem change when the "in" is not a static file but a complex, high-velocity, and constantly changing data pipeline?
    • Upriver is described as an "intelligent agent system" and an "autonomous data engineer." This is a fascinating "AI to solve for AI" approach. Can you describe this agent-based architecture and how it specifically works to bridge that data-AI gap?
    • Your website mentions a "Data Context Layer" that turns "tribal knowledge" into a "machine-usable mode." This sounds critical for AI. How do you capture that context, and how does it make data "AI-ready" in a way that a traditional data catalog or quality tool doesn't?
    • What are the most innovative or unexpected ways you've seen companies trying to make their data "AI-ready"? And where are the biggest points of failure you observe?
    • What has been the most challenging or unexpected lesson you've learned while building an AI system (Upriver) that is designed to fix the data foundation for other AI systems?
    • When is an autonomous, agent-based approach not the right solution for a team's data quality problems? What organizational or technical maturity is required to even start closing this data-AI gap?
    • What do you have planned for the future of Upriver? And looking more broadly, how do you see this gap between data and AI evolving over the next few years?
    Contact Info
    Parting Question
    • From your perspective, what is the biggest gap in the tooling or technology for data management today?
    Closing Announcements
    • Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
    • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
    • If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
    Links
    The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
    2 November 2025, 7:31 pm
  • 1 hour 5 minutes
    Beyond the Perimeter: Practical Patterns for Fine‑Grained Data Access
    Summary
    In this episode of the Data Engineering Podcast Matt Topper, president of UberEther, talks about the complex challenge of identity, credentials, and access control in modern data platforms. With the shift to composable ecosystems, integration burdens have exploded, fracturing governance and auditability across warehouses, lakes, files, vector stores, and streaming systems. Matt shares practical solutions, including propagating user identity via JWTs, externalizing policy with engines like OPA/Rego and Cedar, and using database proxies for native row/column security. He also explores catalog-driven governance, lineage-based label propagation, and OpenTDF for binding policies to data objects. The conversation covers machine-to-machine access, short-lived credentials, workload identity, and constraining access by interface choke points, as well as lessons from Zanzibar-style policy models and the human side of enforcement. Matt emphasizes the need for trust composition - unifying provenance, policy, and identity context - to answer questions about data access, usage, and intent across the entire data path.
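    As a concrete sketch of the externalized-policy pattern described above, here is how a Python service might combine a propagated JWT with a call to OPA's REST data API. The policy path (datasets/allow) and the claim names are assumptions about how a particular deployment could be wired, not a prescription from the episode:

      import jwt       # PyJWT
      import requests

      def is_query_allowed(token: str, public_key: str, table: str) -> bool:
          # 1. Establish the end user's identity from the propagated JWT.
          claims = jwt.decode(token, public_key, algorithms=["RS256"])
          # 2. Externalize the decision: send an input document to OPA and
          #    let the Rego policy, not application code, decide.
          resp = requests.post(
              "http://localhost:8181/v1/data/datasets/allow",
              json={"input": {"user": claims["sub"],
                              "groups": claims.get("groups", []),
                              "table": table}},
              timeout=5,
          )
          # OPA returns {"result": true} when the rule evaluates to allow.
          return resp.json().get("result", False)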

    Announcements
    • Hello and welcome to the Data Engineering Podcast, the show about modern data management
    • Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
    • Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
    • Composable data infrastructure is great, until you spend all of your time gluing it together. Bruin is an open source framework, driven from the command line, that makes integration a breeze. Write Python and SQL to handle the business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. Bruin allows you to build end-to-end data workflows using AI, has connectors for hundreds of platforms, and helps data teams deliver faster. Teams that use Bruin need less engineering effort to process data and benefit from a fully integrated data platform. Go to dataengineeringpodcast.com/bruin today to get started. And for dbt Cloud customers, they'll give you $1,000 credit to migrate to Bruin Cloud.
    • Your host is Tobias Macey and today I'm interviewing Matt Topper about the challenges of managing identity and access controls in the context of data systems
    Interview
    • Introduction
    • How did you get involved in the area of data management?
    • The data ecosystem is a uniquely challenging space for creating and enforcing technical controls for identity and access control. What are the key considerations for designing a strategy for addressing those challenges?
    • For data access the off-the-shelf options are typically on either extreme of too coarse or too granular in their capabilities. What do you see as the major factors that contribute to that situation?
    • Data governance policies are often used as the primary means of identifying what data can be accessed by whom, but translating that into enforceable constraints is often left as a secondary exercise. How can we as an industry make that a more manageable and sustainable practice?
    • How can the audit trails that are generated by data systems be used to inform the technical controls for identity and access?
    • How can the foundational technologies of our data platforms be improved to make identity and authz a more composable primitive?
    • How does the introduction of streaming/real-time data ingest and delivery complicate the challenges of security controls?
    • What are the most interesting, innovative, or unexpected ways that you have seen data teams address ICAM?
    • What are the most interesting, unexpected, or challenging lessons that you have learned while working on ICAM?
    • What are the aspects of ICAM in data systems that you are paying close attention to?
      • What are your predictions for the industry adoption or enforcement of those controls?
    Contact Info
    Parting Question
    • From your perspective, what is the biggest gap in the tooling or technology for data management today?
    Closing Announcements
    • Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
    • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
    • If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
    Links
    The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
    27 October 2025, 1:32 am
  • 1 hour 4 minutes
    The True Costs of Legacy Systems: Technical Debt, Risk, and Exit Strategies
    Summary
    In this episode Kate Shaw, Senior Product Manager for Data and SLIM at SnapLogic, talks about the hidden and compounding costs of maintaining legacy systems—and practical strategies for modernization. She unpacks how “legacy” is less about age and more about when a system becomes a risk: blocking innovation, consuming excess IT time, and creating opportunity costs. Kate explores technical debt, vendor lock-in, lost context from employee turnover, and the slippery notion of “if it ain’t broke,” especially when data correctness and lineage are unclear. She digs into governance, observability, and data quality as foundations for trustworthy analytics and AI, and why exit strategies for system retirement should be planned from day one. The discussion covers composable architectures to avoid monoliths and big-bang migrations, how to bridge valuable systems into AI initiatives without lock-in, and why clear success criteria matter for AI projects. Kate shares lessons from the field on discovery, documentation gaps, parallel run strategies, and using integration as the connective tissue to unlock data for modern, cloud-native and AI-enabled use cases. She closes with guidance on planning migrations, defining measurable outcomes, ensuring lineage and compliance, and building for swap-ability so teams can evolve systems incrementally instead of living with a “bowl of spaghetti.”

    Announcements
    • Hello and welcome to the Data Engineering Podcast, the show about modern data management
    • Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
    • Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
    • Your host is Tobias Macey and today I'm interviewing Kate Shaw about the true costs of maintaining legacy systems
    Interview
    • Introduction
    • How did you get involved in the area of data management?
    • What are your criteria for when a given system or service transitions to being "legacy"?
    • In order for any service to survive long enough to become "legacy" it must be serving its purpose and providing value. What are the common factors that prompt teams to deprecate or migrate systems?
    • What are the sources of monetary cost related to maintaining legacy systems while they remain operational?
    • Beyond monetary cost, economics also have a concept of "opportunity cost". What are some of the ways that manifests in data teams who are maintaining or migrating from legacy systems?
      • How does that loss of productivity impact the broader organization?
    • How does the process of migration contribute to issues around data accuracy, reliability, etc. as well as contributing to potential compromises of security and compliance?
    • Once a system has been replaced, it needs to be retired. What are some of the costs associated with removing a system from service?
    • What are the most interesting, innovative, or unexpected ways that you have seen teams address the costs of legacy systems and their retirement?
    • What are the most interesting, unexpected, or challenging lessons that you have learned while working on legacy systems migration?
    • When is deprecation/migration the wrong choice?
    • How have evolutionary architecture patterns helped to mitigate the costs of system retirement?
    Contact Info
    Parting Question
    • From your perspective, what is the biggest gap in the tooling or technology for data management today?
    Closing Announcements
    • Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
    • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
    • If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
    Links
    The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
    18 October 2025, 10:35 pm
  • 51 minutes 58 seconds
    Context Engineering as a Discipline: Building Governed AI Analytics
    Summary
    In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Nick Schrock, CTO and founder of Dagster Labs, to discuss Compass - a Slack-native, agentic analytics system designed to keep data teams connected with business stakeholders. Nick shares his journey from initial skepticism to embracing agentic AI as model and application advancements made it practical for governed workflows, and explores how Compass redefines the relationship between data teams and stakeholders by shifting analysts into steward roles, capturing and governing context, and integrating with Slack where collaboration already happens. The conversation covers organizational observability through Compass's conversational system of record, cost control strategies, and the implications of agentic collaboration on Conway's Law, as well as what's next for Compass and Nick's optimistic views on AI-accelerated software engineering.

    Announcements
    • Hello and welcome to the Data Engineering Podcast, the show about modern data management
    • Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
    • Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details. 
    • Your host is Tobias Macey and today I'm interviewing Nick Schrock about building an AI analyst that keeps data teams in the loop
    Interview
    • Introduction
    • How did you get involved in the area of data management?
    • Can you describe what Compass is and the story behind it?
    • Context repository structure
      • How to keep it relevant and avoid sprawl/duplication
    • Providing guardrails
    • How does a tool like Compass help provide feedback/insights back to the data teams?
    • Preparing the data warehouse for effective introspection by the AI
    • LLM selection
    • Cost management
      • Caching/materializing ad-hoc queries
    • Why Slack and enterprise chat are important to B2B software
    • How AI is changing stakeholder relationships
    • How not to overpromise AI capabilities
    • How does Compass relate to BI?
    • How does Compass relate to Dagster and data infrastructure?
    • What are the most interesting, innovative, or unexpected ways that you have seen Compass used?
    • What are the most interesting, unexpected, or challenging lessons that you have learned while working on Compass?
    • When is Compass the wrong choice?
    • What do you have planned for the future of Compass?
    Contact Info
    Parting Question
    • From your perspective, what is the biggest gap in the tooling or technology for data management today?
    Closing Announcements
    • Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
    • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
    • If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
    Links
    The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
    11 October 2025, 9:36 pm
  • 1 hour 1 minute
    The Data Model That Captures Your Business: Metric Trees Explained
    Summary
    In this episode of the Data Engineering Podcast Vijay Subramanian, founder and CEO of Trace, talks about metric trees - a new approach to data modeling that directly captures a company's business model. Vijay shares insights from his decade-long experience building data practices at Rent the Runway and explains how the modern data stack has led to a proliferation of dashboards without a coherent way for business consumers to reason about cause, effect, and action. He explores how metric trees differ from and interoperate with other data modeling approaches, serve as a backend for analytical workflows, and provide concrete examples like modeling Uber's revenue drivers and customer journeys. Vijay also discusses the potential of AI agents operating on metric trees to execute workflows, organizational patterns for defining inputs and outputs with business teams, and a vision for analytics that becomes invisible infrastructure embedded in everyday decisions.
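    To make the metric tree idea concrete, here is a minimal sketch of one as a Python data structure, with parent metrics computed from their child drivers. The revenue = rides * average fare decomposition is a generic illustration in the spirit of the Uber example, not Trace's actual model:

      from dataclasses import dataclass, field

      @dataclass
      class Metric:
          name: str
          value: float | None = None             # leaves carry observed inputs
          children: list["Metric"] = field(default_factory=list)
          combine: str = "multiply"               # how child drivers roll up

          def compute(self) -> float:
              if not self.children:               # leaf metric
                  assert self.value is not None, "leaf metric needs a value"
                  return self.value
              vals = [child.compute() for child in self.children]
              if self.combine == "multiply":
                  product = 1.0
                  for v in vals:
                      product *= v
                  return product
              return sum(vals)                    # "add" rollup

      revenue = Metric("revenue", children=[
          Metric("completed_rides", 1_000_000.0),
          Metric("avg_fare", 14.5),
      ])
      print(revenue.compute())  # 14500000.0: cause and effect are explicit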

    Announcements
    • Hello and welcome to the Data Engineering Podcast, the show about modern data management
    • Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
    • Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
    • Your host is Tobias Macey and today I'm interviewing Vijay Subramanian about metric trees and how they empower more effective and adaptive analytics
    Interview
    • Introduction
    • How did you get involved in the area of data management?
    • Can you describe what metric trees are and their purpose?
    • How do metric trees relate to metric/semantic layers?
    • What are the shortcomings of existing data modeling frameworks that prevent effective use of those assets?
      • How do metric trees build on top of existing investments in dimensional data models?
    • What are some strategies for engaging with the business to identify metrics and their relationships?
    • What are your recommendations for storage, representation, and retrieval of metric trees?
    • How do metric trees fit into the overall lifecycle of organizational data workflows?
    • Creating any new data asset introduces overhead for maintenance, monitoring, and evolution. How do metric trees fit into the existing testing and validation frameworks that teams rely on for dimensional modeling?
      • What are some of the key differences in useful evaluation/testing that teams need to develop for metric trees?
    • How do metric trees assist in context engineering for AI-powered self-serve access to organizational data?
    • What are the most interesting, innovative, or unexpected ways that you have seen metric trees used?
    • What are the most interesting, unexpected, or challenging lessons that you have learned while working on metric trees and operationalizing them at Trace?
    • When is a metric tree the wrong abstraction?
    • What do you have planned for the future of Trace and applications of metric trees?
    Contact Info
    Parting Question
    • From your perspective, what is the biggest gap in the tooling or technology for data management today?
    Closing Announcements
    • Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
    • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
    • If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
    Links
    The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
    5 October 2025, 11:59 pm
  • 56 minutes 31 seconds
    From GPUs-as-a-Service to Workloads-as-a-Service: Flex AI’s Path to High-Utilization AI Infra
    Summary
    In this crossover episode of the AI Engineering Podcast, host Tobias Macey interviews Brijesh Tripathi, CEO of FlexAI, about removing the DevOps burden from AI engineering through "workload as a service". Brijesh draws on his experience leading AI/HPC architecture at Intel and deploying supercomputers like Aurora to explain how access friction and idle infrastructure slow progress. They discuss FlexAI's approach to simplifying heterogeneous compute, standardizing on a consistent Kubernetes layer, and abstracting inference across various accelerators, allowing teams to iterate faster without wrestling with drivers, libraries, or cloud-by-cloud differences. Brijesh also shares FlexAI's strategies for lifting utilization, protecting real-time workloads, and spanning the full lifecycle from fine-tuning to autoscaled inference, all while keeping complexity at bay.
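    To illustrate the "workload as a service" idea in code, here is a hypothetical sketch in which the caller declares what to run and the platform decides where to run it. The WorkloadSpec fields and submit function are invented for illustration and do not represent FlexAI's actual API.

        from dataclasses import dataclass

        @dataclass
        class WorkloadSpec:
            """Declarative description of an AI workload (hypothetical)."""
            name: str
            image: str                # container holding the training/inference code
            command: str
            accelerator: str = "any"  # let the platform pick, or pin a family
            min_memory_gb: int = 16
            autoscale: bool = True

        def submit(spec: WorkloadSpec) -> str:
            """Stand-in for the platform API: schedule the workload, return an id.

            A real service would match the spec to whichever accelerator backend
            satisfies it, handle drivers and libraries, and bill only for the
            time the workload actually runs, which is what lifts utilization.
            """
            print(f"scheduling {spec.name} on accelerator={spec.accelerator}")
            return f"workload-{spec.name}"

        job_id = submit(WorkloadSpec(
            name="fine-tune-llm",
            image="registry.example.com/train:latest",
            command="python train.py --epochs 3",
            min_memory_gb=80,
        ))

    The design point is that the user never names a cloud, region, or GPU SKU unless they want to; the spec carries the requirements, and the scheduler owns the mapping.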

    Pre-amble
    I hope you enjoy this crossover episode of the AI Engineering Podcast, another show that I run, which serves as your guide to the fast-moving world of building scalable and maintainable AI systems. As generative AI models have grown more powerful and are applied to a broader range of use cases, the lines between data engineering and AI engineering are becoming increasingly blurry. The responsibilities of data teams now extend into context engineering, as well as designing and supporting the new infrastructure elements that serve the needs of agentic applications. This episode is an example of the kind of work that is not easily categorized into one camp or the other.

    Announcements
    • Hello and welcome to the Data Engineering Podcast, the show about modern data management
    • Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
    • Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details. 
    • Your host is Tobias Macey and today I'm interviewing Brijesh Tripathi about FlexAI, a platform offering a service-oriented abstraction for AI workloads
    Interview
    • Introduction
    • How did you get involved in machine learning?
    • Can you describe what FlexAI is and the story behind it?
    • What are some examples of the ways that infrastructure challenges contribute to friction in developing and operating AI applications?
      • How do those challenges contribute to issues when scaling new applications/businesses that are founded on AI?
    • There are numerous managed services and deployable operational elements for operationalizing AI systems. What are some of the main pitfalls that teams need to be aware of when determining how much of that infrastructure to own themselves?
    • Orchestration is a key element of managing the data and model lifecycles of these applications. How does your approach of "workload as a service" help to mitigate some of the complexities in the overall maintenance of that workload?
    • Can you describe the design and architecture of the FlexAI platform?
      • How has the implementation evolved from when you first started working on it?
    • For someone who is going to build on top of FlexAI, what are the primary interfaces and concepts that they need to be aware of?
    • Can you describe the workflow of going from problem to deployment for an AI workload using FlexAI?
    • One of the perennial challenges of making a well-integrated platform is that there are inevitably pre-existing workloads that don't map cleanly onto the assumptions of the vendor. What are the affordances and escape hatches that you have built in to allow partial/incremental adoption of your service?
    • What are the elements of AI workloads and applications that you are explicitly not trying to solve for?
    • What are the most interesting, innovative, or unexpected ways that you have seen FlexAI used?
    • What are the most interesting, unexpected, or challenging lessons that you have learned while working on FlexAI?
    • When is FlexAI the wrong choice?
    • What do you have planned for the future of FlexAI?
    Contact Info
    Parting Question
    • From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?
    Links
    The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
    28 September 2025, 11:46 pm