New findings indicate that Google Search Console (GSC) data is approximately 75% incomplete, rendering decisions based solely on GSC dangerously unreliable. An analysis of 450 million impressions reveals that Google filters out a significant portion of search data, primarily for privacy reasons, while bot activity and the rise of AI Overviews (AIOs) further corrupt the remaining information.

Google filters 3/4 of search impressions for "privacy," while bot inflation and AIOs corrupt what remains. (Image Credit: Kevin Indig)

GSC: From Ground Truth to Unreliable Data

Google Search Console data was once considered the most accurate reflection of search engine results. However, privacy sampling, bot-inflated impressions, and AI Overview (AIO) distortions have severely compromised its reliability. Without a clear understanding of how this data is filtered and skewed, SEO professionals risk drawing flawed conclusions.

The journey towards less reliable SEO data began with Google's removal of keyword referrer data and the exclusion of critical SERP Features from performance results. This trend has been exacerbated by three key developments over the past 12 months:

  • January 2025: Google introduced "SearchGuard," a system requiring JavaScript and advanced CAPTCHA for search result access, designed to differentiate human users from scrapers.
  • March 2025: Google significantly increased the deployment of AI Overviews (AIOs) in the SERPs, leading to a notable spike in impressions but a corresponding drop in clicks.
  • September 2025: Google removed the num=100 parameter, a tool previously used by SERP scrapers to parse search results. This action normalized impression spikes, though clicks remained low.

While Google has implemented measures to refine GSC data, the current state leaves more questions than answers regarding its accuracy.

Privacy Sampling Hides 75% Of Queries

Google deliberately filters out a substantial volume of impressions and clicks for "privacy" reasons. A year ago, an analysis by Patrick Stox suggested nearly 50% of data was filtered. A recent, repeated analysis across 10 B2B sites in the USA, encompassing approximately 4 million clicks and 450 million impressions, reveals an even higher filter rate.

Methodology

The analysis leveraged two GSC API endpoints:

  • The aggregate query (without dimensions) provides total clicks and impressions, including all data.
  • The query-level query (with the "query" dimension) returns only queries that meet Google's privacy threshold.

By comparing these two numbers, the filter rate can be calculated. For instance, if aggregate data shows 4,205 clicks but query-level data only displays 1,937 visible clicks, then 2,268 clicks (53.94%) were filtered. This study analyzed 10 B2B SaaS sites over 30-day, 90-day, and 12-month periods, comparing current data against the same analysis from 12 months prior.
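
The comparison can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual script: the helper names visible_total and filter_rate are hypothetical, the dates are placeholders, and the sample rows are made up so that they sum to the 1,937 visible clicks used in the example above. The request bodies assume the Search Console searchanalytics.query endpoint, where the only difference between the two calls is the "dimensions" field.

```python
# Request bodies for the Search Console searchanalytics.query endpoint.
# Dates are placeholders; the two requests differ only in "dimensions".
aggregate_body = {"startDate": "2025-01-01", "endDate": "2025-01-30"}
query_level_body = {**aggregate_body, "dimensions": ["query"], "rowLimit": 25000}


def visible_total(rows, metric):
    """Sum a metric ("clicks" or "impressions") over query-level rows."""
    return sum(row[metric] for row in rows)


def filter_rate(aggregate_total, visible):
    """Share of the aggregate metric that is missing from query-level data."""
    if aggregate_total <= 0:
        raise ValueError("aggregate_total must be positive")
    return (aggregate_total - visible) / aggregate_total


# Made-up query-level rows standing in for an API response.
query_rows = [
    {"keys": ["example query a"], "clicks": 1200, "impressions": 40000},
    {"keys": ["example query b"], "clicks": 700, "impressions": 25000},
    {"keys": ["example query c"], "clicks": 37, "impressions": 3000},
]

aggregate_clicks = 4205  # total from the request without dimensions
visible_clicks = visible_total(query_rows, "clicks")
rate = filter_rate(aggregate_clicks, visible_clicks)
print(f"{aggregate_clicks - visible_clicks} clicks filtered ({rate:.2%})")
# prints "2268 clicks filtered (53.94%)"
```

One caveat with this approach: if the query-level response is cut off by the row limit rather than by the privacy threshold, the visible total is too low and the filter rate is overstated, so larger sites would need to page through all rows before summing.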

Key Findings

1. Google filters out approximately 75% of impressions.

Google filters out ~75% of impressions.
Image Credit: Kevin Indig
  • The impression filter rate is exceptionally high, with three-quarters of data being withheld for privacy.
  • This rate is only 2 percentage points higher than 12 months ago, indicating a persistent issue.
  • Observed filter rates ranged from 59.3% to 93.6%.
Range of GSC impression filtering.
Image Credit: Kevin Indig

2. Google filters out approximately 38% of clicks, a 5% decrease from 12 months ago.

Google filters out ~38% of clicks.
Image Credit: Kevin Indig
  • Click filtering, a less discussed issue, means that on average more than one-third of actual clicks are not reported by Google at the query level.
  • This is an improvement from 12 months ago, when over 40% of clicks were filtered.
  • The range of click filtering observed was broad, from 6.7% to 88.5%.
Range of GSC click filtering.
Image Credit: Kevin Indig

While the slight decrease in filter rates over the past year (likely due to fewer "bot impressions") is positive, the core problem persists. A 38% click-filtering rate and a 75% impression-filtering rate remain catastrophically high, making single-source GSC decisions unreliable.

2025 Impressions Were Highly Inflated

Clicks declined by 56.6% since March 2025.
Image Credit: Kevin Indig

The past 12 months have seen significant volatility in GSC data:

  • In March 2025, Google's expanded AIO rollout led to a 58% increase in impressions for the analyzed sites.
  • Impressions surged further in July (25.3%) and August (54.6%) as SERP scrapers circumvented SearchGuard, capturing "bot impressions" of AIOs.
  • September saw a 30.6% drop in impressions following Google's removal of the num=100 parameter, which SERP scrapers used.
SERP scrapers found a way around SearchGuard.
Image Credit: Kevin Indig

Currently:

  • Clicks have decreased by 56.6% since March 2025.
  • Impressions have normalized, showing a 9.2% decrease.
  • The presence of AIOs has decreased by 31.3%.

While a direct causal link between AIOs and reduced clicks is difficult to quantify precisely, the strong correlation (0.608) suggests a significant impact. AI Overviews are known to reduce clicks, but the exact extent requires measuring CTR for queries before and after AIO implementation.

To determine if click decline is due to AIOs rather than content quality or decay, look for temporal correlation. A sharp drop in clicks coinciding with Google's AIO rollout (e.g., March 2025) suggests AIO impact, whereas poor content quality typically shows a gradual decline. Cross-referencing with position data can also help: if rankings remain stable while clicks fall, it points to AIO cannibalization. Additionally, check if affected queries are informational (more AIO-prone) versus transactional (more AIO-resistant).
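
As a rough sketch of that check, the snippet below compares CTR and impression-weighted average position before and after a cutoff date. Everything in it is an assumption made for illustration: the function names, the 30% CTR-drop threshold, the one-position tolerance, the cutoff day within March 2025, and the sample rows. It only detects a step change around the cutoff; it does not distinguish a sharp drop from a gradual one, which would require looking at the trend over time.

```python
from datetime import date


def summarize(rows):
    """Return (CTR, impression-weighted average position) for a set of rows."""
    clicks = sum(r["clicks"] for r in rows)
    impressions = sum(r["impressions"] for r in rows)
    if impressions == 0:
        return 0.0, 0.0
    weighted_position = sum(r["position"] * r["impressions"] for r in rows)
    return clicks / impressions, weighted_position / impressions


def looks_like_aio_impact(rows, cutoff, min_ctr_drop=0.3, max_position_shift=1.0):
    """True if CTR fell sharply after the cutoff while rankings stayed stable.

    min_ctr_drop and max_position_shift are hypothetical thresholds.
    """
    before = [r for r in rows if r["date"] < cutoff]
    after = [r for r in rows if r["date"] >= cutoff]
    ctr_before, pos_before = summarize(before)
    ctr_after, pos_after = summarize(after)
    if ctr_before == 0:
        return False  # no baseline to compare against
    ctr_drop = (ctr_before - ctr_after) / ctr_before
    rankings_stable = abs(pos_after - pos_before) <= max_position_shift
    return ctr_drop >= min_ctr_drop and rankings_stable


# Made-up rows for a single informational query.
rows = [
    {"date": date(2025, 2, 15), "clicks": 50, "impressions": 1000, "position": 3.0},
    {"date": date(2025, 3, 20), "clicks": 20, "impressions": 1100, "position": 3.2},
]
aio_cutoff = date(2025, 3, 1)  # placeholder day within the March 2025 rollout
flagged = looks_like_aio_impact(rows, aio_cutoff)
print(flagged)
```

Running the same check separately for informational and transactional query groups would show whether the drop is concentrated where AIOs are more likely to appear.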

Bot Impressions Are Rising

Bot impressions are rising.
Image Credit: Kevin Indig

Evidence suggests a resurgence in SERP scraper activity. Bot-generated impressions can be estimated by filtering GSC data for queries that contain 10 or more words, have two or more impressions, and received zero clicks. The likelihood of a human user repeatedly searching for such a long, identical query is negligible.

Logic of Bot Impressions

  • Hypothesis: Humans rarely search for the exact same query of five or more words twice within a short timeframe.
  • Filter: Identify queries with 10+ words that have more than one impression but zero clicks.
  • Caveat: This method may inadvertently capture some legitimate zero-click queries, but it provides a useful directional estimate of bot activity.
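
The filter above can be expressed as a short Python sketch. The function names and the sample rows are hypothetical, and the share is computed against the impressions visible at the query level, which is an assumption about the denominator rather than something the analysis specifies.

```python
def is_likely_bot_query(row, min_words=10):
    """Apply the heuristic: 10+ words, more than one impression, zero clicks."""
    word_count = len(row["query"].split())
    return word_count >= min_words and row["impressions"] > 1 and row["clicks"] == 0


def bot_impression_share(rows):
    """Fraction of visible impressions that come from likely bot queries."""
    rows = list(rows)
    total = sum(r["impressions"] for r in rows)
    if total == 0:
        return 0.0
    bot = sum(r["impressions"] for r in rows if is_likely_bot_query(r))
    return bot / total


# Made-up query-level rows for illustration only.
sample_rows = [
    {   # 16 words, 6 impressions, 0 clicks: counted as likely bot traffic
        "query": "how to set up single sign on for a saas app with okta step by step",
        "impressions": 6,
        "clicks": 0,
    },
    {"query": "crm software", "impressions": 93, "clicks": 4},
    {   # 13 words but only 1 impression: not counted
        "query": "best way to export invoices from billing tool to csv every month automatically",
        "impressions": 1,
        "clicks": 0,
    },
]

share = bot_impression_share(sample_rows)
print(f"Estimated bot impression share: {share:.1%}")
```

Because the heuristic can also catch legitimate zero-click queries, the result is best read as a directional estimate and tracked over time rather than treated as an exact count.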

Comparing these queries over the last 30, 90, and 180 days revealed that queries with 10+ words and more than one impression grew by 25% over the last 180 days. The range of bot impressions spanned from 0.2% to 6.5% within the last 30 days.

For a typical SaaS site, a "normal" percentage of bot impressions (using the 10+ words, 2+ impressions, 0 clicks filter) generally ranges from 1-3%. Sites with extensive documentation, technical guides, or programmatic SEO pages might see higher rates (4-6%). The 25% growth over 180 days indicates that scrapers are adapting post-SearchGuard. Where your site sits within this range, and how that changes over time, matters more than the absolute number.

Bot impressions do not directly affect your actual rankings; their impact is on reporting by inflating impression counts. The practical consequence is the potential misallocation of resources if optimization efforts are directed towards inflated impression queries that human users rarely search for.

The Measurement Layer Is Broken

Relying solely on GSC data for critical SEO decisions has become perilous due to several factors:

  • Three-quarters of impressions are filtered out.
  • Bot impressions can account for up to 6.5% of the data.
  • Clicks have fallen by over 50% since March 2025, a decline that correlates with the AI Overview rollout.
  • Overall user behavior in search is undergoing structural changes.

This presents a significant opportunity for teams that develop robust measurement frameworks. Implementing strategies such as sampling rate scripts, bot-share calculations, and multi-source data triangulation will provide a crucial competitive advantage in understanding true search performance.


Featured Image: Paulo Bobita/Search Engine Journal