{"id":42104,"date":"2025-04-22T21:26:11","date_gmt":"2025-04-22T21:26:11","guid":{"rendered":"https:\/\/www.europesays.com\/uk\/42104\/"},"modified":"2025-04-22T21:26:11","modified_gmt":"2025-04-22T21:26:11","slug":"two-undergrads-built-an-ai-speech-model-to-rival-notebooklm","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/uk\/42104\/","title":{"rendered":"Two undergrads built an AI speech model to rival NotebookLM"},"content":{"rendered":"<p id=\"speakable-summary\" class=\"wp-block-paragraph\">A pair of undergrads, neither with extensive AI expertise, say that they\u2019ve created an openly available AI model that can generate podcast-style clips similar to <a href=\"https:\/\/techcrunch.com\/2025\/02\/10\/google-expands-notebooklm-plus-to-individual-users\/\" target=\"_blank\" rel=\"noopener\">Google\u2019s NotebookLM<\/a>.<\/p>\n<p class=\"wp-block-paragraph\">The market for synthetic speech tools is vast and growing. ElevenLabs is one of the largest players, but there\u2019s no shortage of challengers (see <a href=\"https:\/\/techcrunch.com\/2024\/11\/25\/playai-clones-voices-on-command\/\" target=\"_blank\" rel=\"noopener\">PlayAI<\/a>, <a href=\"https:\/\/techcrunch.com\/2025\/03\/13\/sesame-the-startup-behind-the-viral-virtual-assistant-maya-releases-its-base-ai-model\/\" target=\"_blank\" rel=\"noopener\">Sesame<\/a>, and so on). Investors believe that these tools have immense potential. 
<a href=\"https:\/\/my.pitchbook.com\/search-results\/s502639615\/overview_tab\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">According to PitchBook<\/a>, startups developing voice AI tech raised over $398 million in VC funding last year.<\/p>\n<p class=\"wp-block-paragraph\">Toby Kim, one of the Korea-based co-founders of <a href=\"https:\/\/github.com\/nari-labs\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Nari Labs<\/a>, the group behind the newly released model, said that he and his fellow co-founder started learning about speech AI three months ago. Inspired by NotebookLM, they wanted to create a model that offered more control over generated voices and \u201cfreedom in the script.\u201d<\/p>\n<p class=\"wp-block-paragraph\">Kim says they used Google\u2019s TPU Research Cloud program, which provides researchers with free access to the company\u2019s TPU AI chips, to train Nari\u2019s model, Dia. Weighing in at 1.6 billion parameters, Dia can generate dialogue from a script, letting users customize speakers\u2019 tones and insert disfluencies, coughs, laughs, and other nonverbal cues.<\/p>\n<p class=\"wp-block-paragraph\">Parameters are the internal variables models use to make predictions. Generally, models with more parameters perform better.<\/p>\n<p class=\"wp-block-paragraph\">Available from the AI dev platform <a href=\"https:\/\/huggingface.co\/nari-labs\/Dia-1.6B\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Hugging Face<\/a> and <a href=\"https:\/\/github.com\/nari-labs\/dia\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">GitHub<\/a>, Dia can run on most modern PCs with at least 10GB of VRAM. 
It generates a random voice unless prompted with a description of an intended style, but it can also clone a person\u2019s voice.<\/p>\n<p class=\"wp-block-paragraph\">In TechCrunch\u2019s brief testing of Dia through Nari\u2019s <a href=\"https:\/\/huggingface.co\/spaces\/nari-labs\/Dia-1.6B\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">web demo<\/a>, Dia worked quite well, uncomplainingly generating two-way chats about any subject. The quality of the voices seems competitive with other tools out there, and the voice cloning function is among the easiest this reporter has tried.<\/p>\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/www.consumerreports.org\/media-room\/press-releases\/2025\/03\/consumer-reports-assessment-of-ai-voice-cloning-products\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Like many voice generators<\/a>, however, Dia offers little in the way of safeguards. It\u2019d be trivially easy to craft disinformation or a scammy recording. On Dia\u2019s project pages, Nari discourages abuse of the model to impersonate, deceive, or otherwise engage in illicit campaigns, but the group says it \u201cisn\u2019t responsible\u201d for misuse.<\/p>\n<p class=\"wp-block-paragraph\">Nari also hasn\u2019t disclosed which data it scraped to train Dia. It\u2019s possible Dia was developed using copyrighted content \u2014 <a href=\"https:\/\/news.ycombinator.com\/item?id=43754124\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">a commenter<\/a> on Hacker News notes that one sample sounds like the hosts of NPR\u2019s \u201cPlanet Money\u201d podcast. Training models on copyrighted content is a widespread but legally dubious practice. 
Some AI companies claim that fair use shields them from liability, while rights holders assert that fair use doesn\u2019t apply to training.<\/p>\n<p class=\"wp-block-paragraph\">In any event, Kim says Nari\u2019s plan is to create a synthetic voice platform with a \u201csocial aspect\u201d on top of Dia and larger, future models. Nari also intends to release a technical report for Dia and to expand the model\u2019s support to languages beyond English.<\/p>\n","protected":false},"excerpt":{"rendered":"A pair of undergrads, neither with extensive AI expertise, say that they\u2019ve created an openly available AI model&hellip;\n","protected":false},"author":2,"featured_media":42105,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3163],"tags":[323,1942,23699,23700,53,16,15],"class_list":{"0":"post-42104","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-dia","11":"tag-nari-labs","12":"tag-technology","13":"tag-uk","14":"tag-united-kingdom"},"share_on_mastodon":{"url":"https:\/\/pubeurope.com\/@uk\/114383735015195971","error":""},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/42104","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/comments?post=42104"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/posts\/42104\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media\/42105"}],"wp:
attachment":[{"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/media?parent=42104"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/categories?post=42104"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/uk\/wp-json\/wp\/v2\/tags?post=42104"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}