Google Search Ranking System Deconstructed: Insights from DOJ Deposition

A recently released U.S. Justice Department document summarizes the deposition of a Google engineer and offers a rare glimpse into how Google's search ranking system works. It is useful reading for SEO professionals and anyone else who wants to understand how Google ranks websites.

Hand-Crafted Signals: The ABCs of Ranking

Google uses "hand-crafted" signals, meaning algorithms tuned by engineers. These signals combine data from quality raters, user clicks, and other sources to generate a ranking score. The deposition highlights three key signals, referred to as "ABC":

  • A - Anchors: Links pointing to the target page.
  • B - Body: Search query terms within the page content.
  • C - Clicks: User dwell time before returning to the search results page.

These ABC signals feed a page's topicality score, which indicates how relevant the page is to a specific search query. Topicality is only one part of the picture, however: Google's full ranking process is far more complex and involves hundreds of algorithms.

From the deposition: "ABC signals are the key components of topicality (or a base score), which is Google's determination of how the document is relevant to the query."
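To make the idea of a base score concrete, here is a minimal sketch of how three signals could be folded into one topicality number. The deposition does not disclose how A, B, and C are actually combined, so the weighted sum, the weights, the [0, 1] normalization, and the names `AbcSignals` and `topicality` are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class AbcSignals:
    """Hypothetical per-(query, page) signals, each normalized to [0, 1]."""
    anchors: float  # A: how well inbound links point to the page for this query
    body: float     # B: how well the page content matches the query terms
    clicks: float   # C: how long users dwelled before returning to the results


def topicality(signals: AbcSignals, weights=(0.3, 0.5, 0.2)) -> float:
    """Weighted sum of A, B, and C. The weights are invented, not Google's."""
    w_a, w_b, w_c = weights
    return w_a * signals.anchors + w_b * signals.body + w_c * signals.clicks


# Example page: strong anchors, decent body match, weaker click signal.
page = AbcSignals(anchors=0.8, body=0.6, clicks=0.4)
score = topicality(page)
print(round(score, 2))
```

A hand-tuned formula like this is easy for an engineer to inspect and adjust, which fits the "hand-crafted" description, but the real system's form is not public.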

Page Quality: A Static Factor

The deposition reveals that page quality is generally a static factor, independent of the specific query. A page judged high-quality and trustworthy is treated as trustworthy across all related queries. Query-dependent relevance signals are then combined with this quality score to calculate the final ranking.

From the deposition: "Quality score is hugely important even today. Page quality is something people complain about the most."

Interestingly, the engineer notes that AI can sometimes negatively impact perceived quality, leading to user complaints.
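The split between a query-independent quality score and query-dependent relevance, described in this section, can be sketched as follows. This is a toy model under stated assumptions: the `PAGE_QUALITY` table, the example URLs, the linear blend, and the 0.4 weight are all hypothetical and do not come from the deposition.

```python
# Hypothetical static quality scores, computed once per page and reused
# for every query. The URLs and values are invented for illustration.
PAGE_QUALITY = {
    "example.com/guide": 0.9,
    "example.com/thin-page": 0.3,
}


def final_score(url: str, relevance: float, quality_weight: float = 0.4) -> float:
    """Blend query-dependent relevance with query-independent quality.

    The linear blend and the default weight are assumptions, not Google's formula.
    """
    quality = PAGE_QUALITY.get(url, 0.0)
    return (1 - quality_weight) * relevance + quality_weight * quality


def rank(relevance_by_url: dict[str, float]) -> list[str]:
    """Order URLs by blended score, best first."""
    return sorted(
        relevance_by_url,
        key=lambda url: final_score(url, relevance_by_url[url]),
        reverse=True,
    )


# Two different queries reuse the same quality table; only relevance changes.
query_1 = rank({"example.com/guide": 0.5, "example.com/thin-page": 0.7})
query_2 = rank({"example.com/guide": 0.2, "example.com/thin-page": 0.9})
```

In the first query the higher-quality page outranks a slightly more relevant one; in the second, a large enough relevance gap lets the lower-quality page win. The quality value itself never changes between queries, which is the point of calling it static.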

eDeepRank and LLM-Based Ranking

The deposition also mentions "eDeepRank," an LLM-based system built on BERT, a transformer-based language model. eDeepRank aims to decompose LLM-based signals into components, making them more transparent and understandable for engineers.

From the deposition: "eDeepRank is an LLM system that uses BERT, transformers. Essentially, eDeepRank tries to take LLM-based signals and decompose them into components to make them more transparent."

PageRank and Link Distance

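PageRank, Google's original ranking innovation, is described in the deposition as a signal based on distance from a known good source, which ties it to distance ranking algorithms. These algorithms calculate the distance from authoritative "seed sites" to other websites on the same topic. Sites closer to the seed sites are considered more authoritative. The deposition also notes that PageRank serves as an input to the quality score rather than as a standalone ranking.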

From the deposition: "PageRank. This is a single signal relating to distance from a known good source, and it is used as an input to the Quality score."
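One simple way to picture "distance from a known good source" is the number of link hops from a seed site. The sketch below uses a breadth-first search over a tiny, invented link graph. The deposition does not say how distance is measured, so treating it as an unweighted hop count is an assumption, and the function name `link_distances` and the `.example` domains are hypothetical.

```python
from collections import deque


def link_distances(links: dict[str, list[str]], seeds: list[str]) -> dict[str, int]:
    """Return the fewest link hops from any seed site to each reachable site."""
    distances = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        site = queue.popleft()
        for target in links.get(site, []):
            # In a breadth-first search, the first visit is the shortest path.
            if target not in distances:
                distances[target] = distances[site] + 1
                queue.append(target)
    return distances


# Hypothetical link graph: each key links out to the sites in its list.
links = {
    "seed.example": ["a.example", "b.example"],
    "a.example": ["c.example"],
    "b.example": ["c.example"],
    "c.example": ["d.example"],
    "island.example": ["d.example"],
}
distances = link_distances(links, seeds=["seed.example"])
```

Here `a.example` and `b.example` sit one hop from the seed, `c.example` two hops, and `d.example` three. `island.example` is not reachable from the seed, so it receives no distance at all. Under the reading above, a smaller distance would translate into a stronger input to the quality score.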

A Mysterious Chrome-Based Popularity Signal

The deposition mentions a popularity signal that uses Chrome data, although the signal's name is redacted. This has fueled speculation about the role of Chrome metrics in ranking.

From the deposition: "[redacted] (popularity) signal that uses Chrome data."

Key Takeaways

This DOJ deposition offers valuable insights into Google's complex ranking system. It highlights hand-crafted signals such as ABC, a largely static page quality score, LLM-based systems such as eDeepRank, and a popularity signal built on Chrome data. While the deposition provides a general overview, it doesn't reveal the specific formulas or thresholds used in Google's algorithms.