Google SRE Prodcast

MP English, Viv, Salim Virji

  • 33 minutes 40 seconds
    SRE in the Retail and Gaming Worlds with Jordan Chernev & Scott Bowers

    Guests Jordan Chernev (Senior Technology Executive) and Scott Bowers (SRE, Gearbox Software), who hail from the retail and gaming industries, respectively, join hosts Steve McGhee and Jordan Greenberg to discuss the unique challenges of Site Reliability Engineering in their industries. They share the importance of aligning SLOs with user experience, strategies for handling spikes in traffic, communicating with users during outages, and investing in reliability.

    16 October 2024, 1:00 pm
  • 43 minutes 53 seconds
    Incident Response with Sarah Butt and Vrai Stacey

    Sarah Butt (Principal Engineer, Centralized Incident Response, Salesforce) and Vrai Stacey (Staff Software Engineer, Google) join hosts Steve McGhee and Jordan Greenberg to dive into incident response—particularly tooling and software for reliability incidents. Tune in for an in-depth discussion on topics such as the importance of communication and collaboration during incidents, and the role of tooling in supporting incident response processes. Sarah and Vrai also share personal takeaways from incidents they have experienced.

    9 October 2024, 1:00 pm
  • 42 minutes 6 seconds
    Building Reliable Systems with Silvia Botros and Niall Murphy

    Silvia Botros (SRE Architect, Twilio | Author of "High Performance MySQL, 4th edition") and Niall Murphy (Co-founder & CEO, Stanza) join hosts Steve McGhee and Jordan Greenberg to discuss cultural shifts in database engineering, rate limiting, load shedding, holistic approaches to reliability, proactive measures to build customer trust, and much more!

    2 October 2024, 1:00 pm
  • 28 minutes 40 seconds
    Creating Systems that are Safe with Liz Fong-Jones

    Liz Fong-Jones (former Google SRE and current Field CTO at honeycomb.io) joins hosts Steve McGhee and Jordan Greenberg for a lively discussion centered around observability, its evolution from monitoring, and its role in modern software development. Tune in for more on the importance of observability as a spectrum, the evolving role of SREs, and advice to aspiring software engineers.

    25 September 2024, 1:00 pm
  • 31 minutes 21 seconds
    Production Problems Are For All! with Ben Treynor Sloss

    Ben Treynor Sloss (VP of Engineering, Google) joins hosts Steve McGhee and Dr. Jennifer Petoff (Director of Technical Infrastructure Education, Google) to share the evolution of SRE and its impact on software development, how AI and ML significantly impact SRE practices, and the future of SRE.

    Ben coined the term "Site Reliability Engineering" for his team of (now) 4,000 software engineers, engaged in what were traditionally operations functions. Under Ben's leadership, Google SRE wrote two best-selling books on SRE. Since then, the rest of the SaaS industry has come to adopt the SRE name, mission, and practices. 

    18 September 2024, 10:00 am
  • 26 minutes 14 seconds
    There Remains a Huge Amount of Work to Do, with Healfdene Goguen

    In this episode, Healfdene Goguen (Principal Engineer, Google) joins hosts Steve McGhee and Jordan Greenberg to discuss the vast amount of work to be done by SREs, and the fascinating challenges to tackle with clear real-world implications. It's a truly exciting time to be an SRE at Google!

    11 September 2024, 9:00 am
  • 41 minutes 2 seconds
    SRE, a Basis of Influence, with Amy Tobey & Vladyslav Ukis

    In this season of Google Prodcast, current and former SREs, both within and outside of Google, chat with hosts Steve McGhee and Jordan Greenberg to discuss software systems designed and built by SREs. 

    For "episode zero", guests Amy Tobey (Live Services SRE, Netflix) and Dr. Vladyslav Ukis (Head of R&D, Siemens Healthineers, Author of "Establishing SRE Foundations") will set the stage for the season with a lively discussion about what Software Engineering means to Site Reliability Engineering.

    4 September 2024, 9:00 am
  • 46 minutes 32 seconds
    Life of An SRE: Life after Google SRE, with Carla Geisser, Cody Smith, and Laura Nolan

    Former Google SREs, or "Xooglers", talk with hosts MP and Steve McGhee about site reliability engineering outside of Google. What’s the difference in scale? What skills are generally valuable? And why can’t you build “SRE in a box” that jump-starts pretty much any organization?

    Join Carla Geisser, Cody Smith, and Laura Nolan in their lively conversation about what SRE skills and knowledge they have found useful in roles outside of Google. 

    7 November 2023, 5:01 am
  • 51 minutes 11 seconds
    Life of An SRE with Sabrina Farmer

    Sabrina Farmer, VP of Engineering at Google, talks about her career journey through Site Reliability Engineering. What does management mean? What’s involved in being an effective manager? And what’s a feasibility study? Hear some great advice on how to get what you expect out of a role, wherever on the ladder it is.

    31 October 2023, 5:01 am
  • 29 minutes 44 seconds
    Life of An SRE with Dave Reisner

    Dave Reisner talks about his path to Staff SRE, from Arch Linux contributor through DevOps to software engineer. This episode emphasizes the value of strong mentoring and manager relationships, and the challenges of work-life balance.

    17 October 2023, 5:01 am
  • 32 minutes 4 seconds
    Life of an SRE with Stephen Benjamin

    Explore the role and responsibilities of an SRE manager with Stephen Benjamin.

    10 October 2023, 5:01 am