AI Agents to Cause Web Congestion: Google's Illyes Warns

Google Search Relations engineer Gary Illyes warns of a looming surge in web traffic caused by the proliferation of AI agents and automated bots. Speaking on Google's Search Off the Record podcast, Illyes stated that "everyone and my grandmother is launching a crawler."

AI Agents Will Strain Websites

Illyes emphasized that while the web is designed to handle large amounts of traffic, the rapid adoption of AI tools will significantly increase website load. These tools, used for content creation, competitor analysis, and data gathering, rely heavily on web crawling.

Illyes said: "The web is getting congested… It's not something that the web cannot handle… the web is designed to be able to handle all that traffic even if it's automatic."

How Google Crawls the Web

The podcast also detailed Google's unified crawling system. All Google products, including Search, AdSense, and Gmail, use the same crawler infrastructure, each identifying itself with a unique user agent string.

As Illyes explained: "You can fetch with it from the internet but you have to specify your own user agent string."

This standardized approach ensures consistent adherence to protocols like robots.txt and allows for scaled-back crawling when websites experience issues.
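
Because each crawler announces its own user agent token, site owners can set different rules for each one in robots.txt. The snippet below is an illustrative sketch: Googlebot and AdsBot-Google are real, documented Google tokens, while the /reports/ path is a hypothetical example.

    User-agent: Googlebot
    Allow: /

    User-agent: AdsBot-Google
    Disallow: /reports/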

Indexing, Not Crawling, Consumes Resources

Illyes challenged conventional SEO wisdom by asserting that crawling itself consumes minimal resources compared to indexing and processing the data.

In Illyes's words: "It's not crawling that is eating up the resources, it's indexing and potentially serving or what you are doing with the data."

This suggests website owners should prioritize database optimization and efficient data handling over concerns about crawl budget.
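
One practical way to reduce that serving cost is to cache expensive page builds and honor conditional requests, so repeat fetches of unchanged pages get a lightweight 304 response instead of another database hit. The sketch below is a minimal illustration using Flask; the route, the render_page() helper, and the cache settings are assumptions, not a prescribed setup.

    # Minimal sketch: cache expensive page rendering and answer repeat
    # fetches of unchanged pages with a 304 via ETag matching.
    import hashlib
    from functools import lru_cache
    from flask import Flask, Response, request

    app = Flask(__name__)

    @lru_cache(maxsize=1024)
    def render_page(slug: str) -> str:
        # Stand-in for an expensive database query plus template render.
        return f"<html><body>Content for {slug}</body></html>"

    @app.route("/articles/<slug>")
    def article(slug):
        body = render_page(slug)
        resp = Response(body, mimetype="text/html")
        resp.set_etag(hashlib.md5(body.encode()).hexdigest())
        resp.headers["Cache-Control"] = "public, max-age=300"
        # make_conditional swaps in a 304 with an empty body when the
        # client's If-None-Match header matches the ETag.
        return resp.make_conditional(request)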

The Web's Exponential Growth

The web has grown from a few hundred thousand indexed pages in 1994 to billions today, demanding constant technological advancement. Crawlers have evolved from HTTP/1.1 to HTTP/2, with HTTP/3 on the horizon.

Google's Ongoing Efficiency Battle

Despite Google's efforts to reduce its crawling footprint, the demand for data from new AI products continues to grow, creating an ongoing challenge.

As Illyes put it: "You saved seven bytes from each request that you make and then this new product will add back eight."

Preparing for the AI Crawler Surge

Website owners should take proactive steps to prepare for the influx of AI-driven traffic:

  • Infrastructure: Ensure your hosting can handle the increased load. Evaluate server capacity, CDN options, and response times.
  • Access Control: Use robots.txt to manage which AI agents can access your site. Block unnecessary bots while allowing legitimate ones.
  • Database Performance: Optimize database queries and implement caching to reduce server strain.
  • Monitoring: Analyze server logs and track performance to differentiate legitimate crawlers, AI agents, and malicious bots (see the log-analysis sketch after this list).
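
A simple starting point for the monitoring step is to tally requests by user agent from your access logs. The sketch below assumes a combined-format log at access.log and a hand-maintained list of known crawler substrings; both the path and the list are illustrative assumptions.

    # Minimal sketch: count requests per user agent from a combined-format
    # access log to separate known crawlers from everything else.
    import re
    from collections import Counter

    # Substrings of known crawler user agents (illustrative, not exhaustive).
    KNOWN_BOTS = ["Googlebot", "Bingbot", "GPTBot", "ClaudeBot", "CCBot"]

    # In the combined log format, the user agent is the last quoted field.
    UA_PATTERN = re.compile(r'"([^"]*)"$')

    counts = Counter()
    with open("access.log") as log:
        for line in log:
            match = UA_PATTERN.search(line.strip())
            if not match:
                continue
            ua = match.group(1)
            label = next((bot for bot in KNOWN_BOTS if bot in ua), "other")
            counts[label] += 1

    for label, total in counts.most_common():
        print(f"{label}: {total}")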

The Path Forward

Collaborative approaches, such as the Common Crawl model in which the web is crawled once and the data is shared publicly, may offer a way to reduce redundant crawling. While the web is built to handle increased traffic, website owners must prepare for the impact of AI agents. Proactive infrastructure improvements are crucial to weathering the coming surge.

