In this episode of Search Off the Record, Gary and Martin dig into what "page size" and "page weight" actually mean for developers, users, and search engines.
They discuss exploding web page sizes (the median mobile homepage hit 2.3 MB in the 2025 Web Almanac, up 3x from 2015), how developers define page weight, Googlebot's crawl limits, HTML bloat from structured data and images, and why size still hurts UX on slow connections despite faster networks.
If you build or maintain websites, this conversation will help you rethink how much data your pages ship, where bloat really comes from, and why page weight still matters even as connections get faster.
Resources:
Web Almanac → https://almanac.httparchive.org/en/2025/
HTML Living Standard → https://html.spec.whatwg.org/multipage/
How page speed helps with conversions → https://www.thinkwithgoogle.com/marketing-strategies/app-and-mobile/mobile-page-speed-data/
Episode transcript → https://goo.gle/sotr106-transcript
Listen to more Search Off the Record → https://goo.gle/sotr-yt
Subscribe to Google Search Channel → https://goo.gle/SearchCentral
Search Off the Record is a podcast series that takes you behind the scenes of Google Search with the Search Relations team.
#SOTRpodcast #SEO #GoogleSearch
Speakers: Martin Splitt, Gary Illyes
Developers often talk about Googlebot as if it were a single program you could just run as "googlebot.exe", but that is not how Google's crawling actually works. In this episode of Search Off the Record, Martin and Gary from the Search Relations team unpack how Google's crawling infrastructure is really built and operated. They cover why "Googlebot" is a misnomer and how it relates to a central crawling software-as-a-service used by many Google products, how crawl behavior is controlled centrally to avoid overwhelming sites (throttling, handling 503s, and "don't break the internet" safeguards) and more! If you build for the web, work on SEO, or just want a more accurate mental model of how Google crawls pages, this behind‑the‑scenes discussion is for you.
Resources:
Crawlers → https://developer.google.com/crawling
Episode transcript → https://goo.gle/sotr107-transcript
Speakers: Martin Splitt, Gary Illyes
Martin and Gary unpack how HTML parsing really works, why the HTML standard is so lenient, and how messy markup can silently break key SEO signals like hreflang and rel=canonical. They revisit validators and cross‑browser hacks from the Netscape/IE days, and discuss whether semantic HTML and strict validity truly matter for search. You'll also hear when link hints like preload, prefetch, and DNS prefetch help performance (and indirectly SEO), and where meta and link tags really belong.
Resources:
HTML Living Standard → https://html.spec.whatwg.org/
Episode transcript → https://goo.gle/sotr105-transcript
Speakers: Martin Splitt, Gary Illyes
In this episode of Search Off the Record, Martin and Gary from the Google Search Relations team tackle a deceptively simple question: do you still need a website in 2026? Starting from the recurring industry claim that "the web is dead," they explore how the web has evolved through the rise of apps, AI chatbots, and social platforms, and why the answer almost always ends up being "it depends." Tune in for an engaging discussion on how websites remain relevant and what it means for content creation and discovery.
Episode transcript → https://goo.gle/sotr103-transcript
Speakers: Martin Splitt, Gary Illyes
In Episode 103 of Search Off the Record, Martin and Gary unpack the 2025 year-end report on crawling issues. They cover faceted navigation, action parameters, irrelevant parameters, and more, highlighting the biggest challenges web crawlers faced last year. With humor and expert analysis, the episode offers takeaways for webmasters and SEO professionals, along with tips for improving your site's crawl efficiency.
Resources:
URL structure best practices for Google Search → https://developers.google.com/search/docs/crawling-indexing/url-structure
Crawling December: Faceted navigation → https://developers.google.com/search/blog/2024/12/crawling-december-faceted-nav
Xkcd (programmer humor) →