Cloudflare's sixth annual Year in Review report, covering 2025, shows Googlebot crawling far more of the web than any other AI crawler. The analysis draws on data from Cloudflare's global network, which spans more than 330 cities in 125 countries and handles more than 81 million HTTP requests per second. It covers internet traffic, security trends, and AI bot activity, and it highlights Google's unique position in both search indexing and AI model training.
Googlebot Leads AI Crawler Traffic by a Wide Margin
Cloudflare analyzed successful HTML content requests from leading AI crawlers during October and November 2025 and found that Googlebot accessed 11.6% of unique web pages in the sample. That is more than three times the share reached by OpenAI's GPTBot (3.6%) and nearly 200 times that of PerplexityBot (0.06%). Other notable AI crawlers included Bingbot at 2.6%, followed by Meta-ExternalAgent and ClaudeBot at 2.4% each.
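The multiples quoted above follow directly from the page-share percentages. As a quick arithmetic check, here is a minimal Python sketch that uses only the figures reported in this section; the names `PAGE_SHARE` and `multiple_of` are illustrative and do not come from the report.

```python
# Share of unique pages crawled, in percent, as quoted in the paragraph above.
PAGE_SHARE = {
    "Googlebot": 11.6,
    "GPTBot": 3.6,
    "Bingbot": 2.6,
    "Meta-ExternalAgent": 2.4,
    "ClaudeBot": 2.4,
    "PerplexityBot": 0.06,
}


def multiple_of(leader: str, other: str) -> float:
    """Return how many times larger the leader's page share is than the other's."""
    return PAGE_SHARE[leader] / PAGE_SHARE[other]


for bot in ("GPTBot", "PerplexityBot"):
    print(f"Googlebot vs {bot}: {multiple_of('Googlebot', bot):.1f}x")
```

The results come out at roughly 3.2 for GPTBot and roughly 193 for PerplexityBot, consistent with "more than three times" and "nearly 200 times."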
The report underscores a challenge for web publishers: Googlebot crawls for both traditional search indexing and AI model training. Because the same crawler serves both purposes, blocking it to opt out of AI training also risks a site's search discoverability. As the report puts it:
“Because Googlebot is used to crawl content for both search indexing and AI model training, and because of Google’s long-established dominance in search, Web site operators are essentially unable to block Googlebot’s AI training without risking search discoverability.”
AI Bots Account for a Growing Share of HTML Requests
Throughout 2025, AI bots (excluding Googlebot) collectively accounted for an average of 4.2% of HTML requests across Cloudflare's customer base. The share fluctuated over the year, from 2.4% in early April to a peak of 6.4% in late June. Googlebot alone generated 4.5% of HTML requests, slightly more than all other AI bots combined.
The report also tracked the evolving balance between human and bot traffic. While human-generated HTML traffic began 2025 seven percentage points below non-AI bot traffic, it started to surpass non-AI bot traffic on certain days by September. By December 2, humans were responsible for 47% of HTML requests, compared to 44% from non-AI bots.
Crawl-to-Refer Ratios Highlight AI Platforms' Low Referral Rates
Cloudflare's report introduced "crawl-to-refer ratios," a metric that measures how often AI and search platforms crawl websites versus how often they send user traffic back to those sites. A high ratio indicates extensive crawling with minimal user referrals, a point of concern for many publishers.
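The metric described above is a simple quotient. The following Python sketch shows one way to compute and format it, assuming you already have counts of crawler requests and of referred visits for a platform. The function names and the sample counts are hypothetical, and this is not Cloudflare's exact methodology.

```python
def crawl_to_refer_ratio(crawl_requests: int, referral_requests: int) -> float:
    """Crawler requests per referred visit; higher means more crawling per referral."""
    if referral_requests <= 0:
        raise ValueError("referral_requests must be positive to form a ratio")
    return crawl_requests / referral_requests


def format_ratio(ratio: float) -> str:
    """Render a ratio in the 'N:1' style used in the report."""
    return f"{ratio:,.0f}:1"


# Hypothetical counts chosen only to illustrate the calculation.
example = crawl_to_refer_ratio(crawl_requests=3_700_000, referral_requests=1_000)
print(format_ratio(example))
```

With these made-up counts, a platform that crawls 3.7 million pages while sending 1,000 visits back has a ratio of 3,700:1.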
Among AI platforms, Anthropic exhibited the highest ratios, stabilizing between approximately 25,000:1 and 100,000:1 in the latter half of the year. OpenAI's ratios reached up to 3,700:1 in March. In contrast, Perplexity maintained the lowest ratios among leading AI platforms, generally staying below 400:1 and dropping under 200:1 from September onwards. For context, Google's search crawl-to-refer ratio remained significantly lower, typically ranging from 3:1 to 30:1 throughout the year, underscoring its role in directing users to source content.
User-Action Crawling Sees Significant Growth
Not all AI crawling is solely for model training. "User-action" crawling, where bots visit websites in direct response to user queries posed to chatbots, experienced explosive growth in 2025. This category saw its volume increase more than 15-fold from January through early December.
This trend closely mirrored the traffic patterns of OpenAI's ChatGPT-User bot, which accesses pages when users ask questions within ChatGPT. The growth displayed a distinct weekly usage pattern starting in mid-February, suggesting increased adoption in educational and professional settings, with activity naturally declining during summer breaks (June through August).
AI Crawlers Most Frequently Blocked in Robots.txt
An analysis of robots.txt files across nearly 3,900 of the top 10,000 domains revealed that AI crawlers were the most frequently blocked user agents. GPTBot, ClaudeBot, and CCBot received the highest number of full disallow directives, instructing these crawlers to avoid entire sites.
In contrast, Googlebot and Bingbot showed a different blocking pattern, with disallow directives heavily skewed towards partial blocks. These partial restrictions likely target specific areas like login endpoints and non-content sections, rather than preventing access to entire websites.
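The difference between full and partial blocks can be illustrated with a small Python sketch. It uses a hypothetical robots.txt and a simplified classifier: an agent counts as fully blocked if one of its groups contains `Disallow: /`, as partially blocked if it has any other Disallow path, and as unblocked otherwise. The sample file, the function names, and the classification rule are assumptions for illustration, not the method used in the report. The sketch also ignores `Allow` overrides and wildcard patterns.

```python
# Hypothetical robots.txt: full disallow for AI-only crawlers,
# partial disallow for search crawlers.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: CCBot
Disallow: /

User-agent: Googlebot
User-agent: Bingbot
Disallow: /login/
Disallow: /admin/
"""


def disallow_paths(robots_txt: str) -> dict[str, list[str]]:
    """Map each lowercased user agent to the Disallow paths in its groups."""
    paths: dict[str, list[str]] = {}
    group: list[str] = []   # user agents of the group currently being read
    seen_rule = False       # True once the current group has a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if seen_rule:   # a user-agent line after rules starts a new group
                group, seen_rule = [], False
            group.append(value.lower())
            paths.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            seen_rule = True
            if field == "disallow" and value:  # empty Disallow blocks nothing
                for agent in group:
                    paths[agent].append(value)
    return paths


def classify(paths: dict[str, list[str]], agent: str) -> str:
    """Return 'full', 'partial', or 'none' for the given user agent."""
    rules = paths.get(agent.lower(), [])
    if "/" in rules:
        return "full"
    return "partial" if rules else "none"


parsed = disallow_paths(ROBOTS_TXT)
for agent in ("GPTBot", "ClaudeBot", "CCBot", "Googlebot", "Bingbot"):
    print(agent, classify(parsed, agent))
```

For this sample file, the three AI-only crawlers are classified as fully blocked, while Googlebot and Bingbot are classified as partially blocked.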
Civil Society Becomes Most-Attacked Sector
In a concerning development, organizations in the "People and Society" vertical (religious institutions, nonprofits, civic organizations, and libraries) became the most-attacked sector for the first time. The sector received 4.4% of global mitigated traffic, up from under 2% at the start of the year, and its share peaked at 23.2% in early July. Many of these organizations are protected through Cloudflare's Project Galileo.
Conversely, gambling and games, which was the most-attacked vertical in 2024, saw its share of attacks drop by more than half to 2.6%.
Other Key Findings from the Report
Cloudflare's report also presented several other important findings across internet traffic, security, and connectivity:
- Global Internet traffic increased by 19% year-over-year, with growth accelerating after mid-August.
- Post-quantum encryption now secures 52% of human traffic to Cloudflare, nearly doubling its 29% share from the beginning of the year.
- ChatGPT maintained its position as the top generative AI service globally, while Google Gemini, Windsurf AI, Grok/xAI, and DeepSeek emerged as new entrants in the top 10.
- Starlink traffic doubled in 2025, driven by service launches in over 20 new countries.
- Government-directed shutdowns accounted for nearly half of the 174 major internet outages observed worldwide. Cable cut outages decreased by almost 50%, while power failure outages doubled.
- European countries led in internet quality metrics. Spain topped the list for overall internet quality, with average download speeds exceeding 300 Mbps.
Why These Insights Matter for Publishers and SEO
The data on AI crawlers offers useful insights for web publishers and SEO professionals. Google's dual-purpose crawler gives the company a unique competitive advantage: publishers can block other AI crawlers while keeping Googlebot access for search visibility, but they cannot easily separate Google's search crawling from its AI training activities.
Furthermore, the crawl-to-refer ratios quantitatively confirm what many publishers have long suspected: AI platforms often crawl extensively but direct minimal traffic back to source sites. This gap between crawling and referring varies significantly across different platforms.
Finally, the alarming increase in attacks on civil society organizations highlights a growing need for enhanced digital security measures for nonprofits and advocacy groups.
Looking Ahead: Evolving AI Landscape
Cloudflare anticipates continued evolution in AI metrics, noting the inclusion of several new AI-related datasets in this year's report that were unavailable previously.
The crawl-to-refer ratios are expected to shift as AI platforms refine their search features and referral behaviors; OpenAI's ratios, for instance, already showed some decline throughout the year as ChatGPT search usage increased.
For robots.txt management, the report provides a baseline: most publishers are opting for partial blocks for major search crawlers while implementing full blocks for AI-only crawlers. This year-end snapshot will be vital for tracking how publisher policies adapt in 2026.
Featured Image: Mamun_Sheikh/Shutterstock