{"id":29692,"date":"2026-05-06T16:12:21","date_gmt":"2026-05-06T16:12:21","guid":{"rendered":"https:\/\/www.europesays.com\/ai\/29692\/"},"modified":"2026-05-06T16:12:21","modified_gmt":"2026-05-06T16:12:21","slug":"krisp-launches-viva-2-0-introducing-voice-infrastructure-for-voice-ai-agents","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/ai\/29692\/","title":{"rendered":"Krisp Launches VIVA 2.0, Introducing Voice Infrastructure for Voice AI Agents"},"content":{"rendered":"<p><img decoding=\"async\" src=\"https:\/\/krisp.ai\/wp-content\/themes\/krisp-v4\/imgs\/img_logo_main.svg\" alt=\"Krisp logo\"\/><br \/>\nThe complete real-time voice infrastructure suite is now predictive, multilingual, and built for real-world production audio.<\/p>\n<p><a href=\"https:\/\/krisp.ai\/\" target=\"_blank\" rel=\"noopener nofollow\">Krisp <\/a>launched Krisp VIVA 2.0, the voice AI infrastructure layer for voice agents, IVRs, and conversational AI. The release introduces a new generation of small, real-time models that improve WER, predict when users finish speaking, classify interruptions, and read perceptual signals like synthetic speech, gender, and accent, setting a new benchmark for how voice agents handle audio in production.<\/p>\n<p id=\"pull-quote\" class=\"font-figtree text-lg lg:text-[24px] leading-6 lg:leading-[38px]\">Krisp\u2019s complete real-time voice infrastructure suite is now predictive, multilingual, and built for real-world production audio.<\/p>\n<p>Voice agent usage\u00a0grew 9x in 2025, yet most voice agents still fail in the same predictable ways the moment they leave a demo room. Background voices and noises push speech-to-text word error rates from 5% to\u00a0over 30%. Voice activity detection misfires on background voices; bots can ignore real interruptions or hallucinate them. 
And on telephony, the agent\u2019s own voice can loop back through the mic and trigger self-interruption.<\/p>\n<p>Voice AI systems today are built on STT, LLMs, and TTS. What\u2019s been missing is a layer to handle real-world audio and conversational dynamics before those systems engage.<\/p>\n<p>VIVA fills that gap to ensure AI agents function in messy, real-world environments.<\/p>\n<p>Krisp\u2019s VIVA SDK runs server-side directly in each customer\u2019s audio pipeline before STT, improving reliability across the entire stack.<\/p>\n<p>What\u2019s new in VIVA 2.0:<\/p>\n<p>Turn Prediction v3:\u00a0A new multilingual model that predicts end-of-turn from audio alone, with no transcription needed. It reacts quickly to real turn-ends while holding through mid-sentence pauses, giving low-latency responses without the agent cutting users off. It is small enough to run on standard CPUs, or locally on-device for robotics and conversational toys.<br \/>\nInterrupt Prediction v1:\u00a0A first-of-its-kind audio-only classifier that predicts when a user intends to interrupt the agent (start-of-turn prediction). It distinguishes intent to take the floor from backchannel speech like \u201cyes\u201d or \u201cmhm.\u201d This differs from end-of-turn prediction, which detects when the user has finished speaking. Patent filed.<br \/>\nSignal Detectors:\u00a0A new category of real-time audio models that give voice AI the perceptual cues humans use without thinking.<br \/>Three are launching with VIVA 2.0:<\/p>\n<p>TTS Detector: Detects synthetic speech in real time. 
Use case: an outbound voice AI agent calls a number and recognizes when an inbound voice AI agent or IVR picks up.<br \/>\nAccent Detector: Identifies the speaker\u2019s accent so audio can be routed to the STT model best tuned for it, lifting transcription quality.<br \/>\nGender Detector: Identifies speaker gender to enable personalized responses.<\/p>\n<p>Voice Isolation v3:\u00a0The world\u2019s most widely used voice isolation model has been upgraded to deliver measurable improvements in downstream WER.<\/p>\n<p>All models run on standard server CPUs, operate on audio input alone with no transcription required, and are bundled into existing VIVA pricing at no additional charge.<\/p>\n<p>Krisp has spent over eight years solving real-world voice in production, first for human-to-human conversations and now for human-to-AI. That experience gives VIVA the depth of training data and field-tested reliability nothing else in the market can match.<\/p>\n<p>Krisp VIVA SDK processes more than 12 billion minutes of voice AI agent traffic a year and is embedded in over 130 voice AI products, including Daily, Vapi, LiveKit, Ultravox, Telnyx, the world\u2019s leading AI labs, and the largest enterprise contact centers.<\/p>\n<p>Platforms running VIVA report:<\/p>\n<p>3.5x improvement in turn-taking accuracy<br \/>\n50% fewer dropped calls<br \/>\n30% higher customer satisfaction<\/p>\n<p>\u201cAt scale, the biggest challenge in voice AI isn\u2019t the model. It\u2019s the quality of the signal going into it,\u201d said David Casem, CEO of Telnyx. \u201cKrisp addresses that at the source, which improves everything downstream from transcription to response.\u201d<\/p>\n<p>\u201cVoice is becoming the primary interface between humans and AI,\u201d said Robert Schoenfield, EVP of Licensing and Partnerships at Krisp. \u201cThose conversations don\u2019t happen in clean environments. They happen in the real world, shaped by noise and subtle human cues. 
VIVA brings that layer into the system, so voice agents can operate the way people actually speak.\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"The complete real-time voice infrastructure suite is now predictive, multilingual, and built for real-world production audio. Krisp launched&hellip;\n","protected":false},"author":2,"featured_media":29693,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[405,7537,872,19465,19466,19467,19468,19469,2225,19470,19471,19472,19473,19474],"class_list":{"0":"post-29692","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-agentic-ai","8":"tag-ai-agents","9":"tag-artificial-intelligence-agents","10":"tag-ceo","11":"tag-daily","12":"tag-david-casem","13":"tag-krisp","14":"tag-krisp-viva-2-0","15":"tag-livekit","16":"tag-llms","17":"tag-stt","18":"tag-telnyx","19":"tag-ultravox","20":"tag-vapi","21":"tag-voice-ai-infrastructure"},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts\/29692","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europes
ays.com\/ai\/wp-json\/wp\/v2\/comments?post=29692"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts\/29692\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/media\/29693"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/media?parent=29692"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/categories?post=29692"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/tags?post=29692"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}