At Web Summit Qatar, ElevenLabs co-founder and CEO Mati Staniszewski declared that voice is poised to become the primary interface for artificial intelligence, fundamentally changing how humans interact with machines. He emphasized that advanced voice models, like those from ElevenLabs, now integrate with large language models' reasoning capabilities, moving beyond mere speech mimicry to enable more natural and sophisticated conversational AI.

Staniszewski envisions a future where smartphones remain in pockets, allowing individuals to immerse themselves fully in the real world, with voice serving as the intuitive mechanism for controlling technology.

This vision recently helped ElevenLabs close a $500 million funding round that values the company at $11 billion. The belief that voice will become central to AI is widely shared across the industry, with giants like OpenAI and Google prioritizing voice in their next-generation models. Apple is also reportedly advancing voice-adjacent, always-on technologies through strategic acquisitions. With AI expanding into wearables, vehicles, and other new hardware, voice control is emerging as a critical battleground, shifting interaction away from traditional screens.

Seth Pierrepont, General Partner at Iconiq Capital, reinforced this perspective onstage at Web Summit, suggesting that while screens will remain vital for gaming and entertainment, conventional input methods such as keyboards are becoming "outdated."

Both Staniszewski and Pierrepont highlighted the shift toward "agentic" AI systems, in which models gain persistent memory, context, and integrations. This evolution should allow more natural interactions that require less explicit prompting, as the AI anticipates users' needs and responds intelligently.

This technological progression will also impact deployment strategies. While high-quality audio models have traditionally resided in the cloud, Staniszewski stated that ElevenLabs is pursuing a hybrid approach, combining cloud and on-device processing. This strategy aims to support emerging hardware like headphones and other wearables, transforming voice into a constant, seamless companion rather than an on-demand feature.

ElevenLabs has already forged a partnership with Meta, integrating its voice technology into products such as Instagram and the virtual reality platform Horizon Worlds. Staniszewski expressed openness to further collaboration, including on Meta's Ray-Ban smart glasses, as voice-driven interfaces expand across various form factors.

However, the increasing pervasiveness of voice technology in everyday hardware raises significant privacy concerns. Questions arise regarding surveillance, data storage, and the extent of personal information collected by voice-based systems, especially as they become more deeply integrated into users' daily lives. Such concerns are not new: companies like Google have previously faced accusations of misusing user data.