{"id":475104,"date":"2026-05-08T17:35:11","date_gmt":"2026-05-08T17:35:11","guid":{"rendered":"https:\/\/www.europesays.com\/ie\/475104\/"},"modified":"2026-05-08T17:35:11","modified_gmt":"2026-05-08T17:35:11","slug":"pushing-the-frontier-for-data-agents-with-genie","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/ie\/475104\/","title":{"rendered":"Pushing the Frontier for Data Agents with Genie"},"content":{"rendered":"<p><a data-external-link=\"true\" href=\"https:\/\/www.databricks.com\/blog\/next-generation-databricks-genie\" rel=\"nofollow noopener\" target=\"_blank\">Genie<\/a> is Databricks\u2019 state-of-the-art data agent designed for answering complex questions about enterprise data, spanning both structured (tables, dashboards, notebooks, etc.) and unstructured (workspace files, Google Drive, SharePoint, etc.) data sources. This blog describes some of the unique challenges data agents face and introduces techniques to address them, including specialized knowledge search, parallel thinking, and Multi-LLM designs. 
From our experiments on an internal benchmark of real-world data analysis tasks, we observe that these techniques can significantly improve the overall accuracy of Genie over a leading coding agent (from 32% to over 90%) while also substantially reducing cost and latency.<\/p>\n<p><img decoding=\"async\" alt=\"Figure 1: A plot of Genie experiments using different techniques such as specialized knowledge search, parallel thinking, and a Multi-LLM design with optimized prompts.\" data-entity-type=\"file\" data-entity-uuid=\"8c0582ed-5adb-47e8-aece-bca011677320\" height=\"1164\" src=\"https:\/\/www.europesays.com\/ie\/wp-content\/uploads\/2026\/05\/plot-of-Genie-experiments.png\" width=\"1884\" loading=\"lazy\" data-ot-ignore=\"1\"\/><br \/>\nFigure 1: A plot of Genie experiments using different techniques such as specialized knowledge search, parallel thinking, and a Multi-LLM design with optimized prompts.<\/p>\n<p>Key Challenges for Data Agents<\/p>\n<p>Coding agents have shown that a powerful LLM can do incredible things autonomously when equipped with tools that help it understand the code context. While coding agents operate effectively in static, deterministic environments like a disk&#8217;s file system, data agents introduce an entirely new paradigm. Data agents work within a dynamic, constantly evolving data lakehouse that encompasses a wealth of semantic context across hundreds of thousands of tables, notebooks, dashboards, and documents.<\/p>\n<p>For example, consider a real (anonymized) query asked by an internal user in Figure 2: the user notices that two enterprise dashboards reporting the same product&#8217;s revenue show contradictory spikes on different dates and asks the agent to explain why. 
This reasonable question is deceptively hard: no single data source contains the answer, and resolving it requires cross-system discovery across tables, internal documents, and dashboards, as well as reasoning about how multi-day reports are set up. Additionally, it requires the agent to dig into enterprise pricing details to find contract rates. Finally, it requires the agent to automatically correct itself when intermediate calculations reveal incorrect initial assumptions. The figure shows how the agent successfully solves the task by proceeding through four phases: (1) parallel multi-agent data discovery, (2) data investigation, (3) self-correction loop, and (4) verification.<\/p>\n<p>Compared to coding agents, data agents face three key unique challenges:<\/p>\n<ul>\n<li data-list-item-id=\"e8211671997a171be7fa7491539ec3620\"><strong>Scale of Data Discovery:<\/strong> Finding the right data sources to answer the user query is one of the biggest challenges; enterprise customers have millions of structured and unstructured sources (like tables, dashboards, and documents), a scale that breaks conventional search methods.<\/li>\n<li data-list-item-id=\"e684da88ac47422d32fb52544abad1075\"><strong>Determining &#8220;Source of Truth&#8221; Business Knowledge:<\/strong> Answering business questions requires deep, specific knowledge drawn from many sources (e.g., table metadata, company documents, internal messages) that are often outdated, contradictory, or superseded, forcing the agent to determine the most authoritative information.<\/li>\n<li data-list-item-id=\"e6512674506c18321545fa324dfd8068b\"><strong>Lack of Verifiable Tests:<\/strong> Unlike coding agents that can use deterministic, verifiable tests to iteratively refine code, data agents have no corresponding test because the &#8220;specification&#8221; is just the high-level user query without a notion of the expected correct answer. 
Moreover, queries may not always be answerable because of incomplete data, and it is important for data agents to identify such cases and surface them back to users.<\/li>\n<\/ul>\n<p><img decoding=\"async\" alt=\"Figure 2: An example trajectory showing how Genie solves a complex user query across different phases: parallel multi-agent asset discovery, data investigation (SQL extraction, comparative analysis, root-cause investigation), self-correction and reconciliation, and final verification.\" data-entity-type=\"file\" data-entity-uuid=\"dc751087-2747-4acf-a301-1e22d547f694\" height=\"1569\" src=\"https:\/\/www.europesays.com\/ie\/wp-content\/uploads\/2026\/05\/trajectory-showing-how-Genie-solves-a-complex-user-query.png\" width=\"2048\" loading=\"lazy\" data-ot-ignore=\"1\"\/><br \/>\nFigure 2: An example trajectory showing how Genie solves a complex user query across different phases: parallel multi-agent asset discovery, data investigation (SQL extraction, comparative analysis, root-cause investigation), self-correction and reconciliation, and final verification.<\/p>\n<p>Key Technical Advances<\/p>\n<p>Figure 3 shows some of the key technical innovations in Genie that enable it to perform significantly better than generic coding agents, namely: i) Specialized Knowledge Search, ii) Parallel Thinking, and iii) Multi-LLM. Specialized knowledge search uses semantic contextual data to ground the asset discovery sub-agents and significantly improve search quality. Parallel thinking allows the agent to sample multiple trajectories and then aggregate the findings across trajectories to compute the final answer. 
Finally, Multi-LLM allows the agent to use a different LLM for each sub-agent, together with optimized prompts, to further improve overall accuracy and latency.<\/p>\n<p><img decoding=\"async\" alt=\"Figure 3: The key technical advances in Genie: i) Specialized Knowledge Search, ii) Parallel Thinking, and iii) Multi-LLM that allow for significant improvements in accuracy and latency.\" data-entity-type=\"file\" data-entity-uuid=\"4213715d-c3cb-4b68-908d-c51e161819e2\" height=\"818\" src=\"https:\/\/www.europesays.com\/ie\/wp-content\/uploads\/2026\/05\/key-technical-advances-in-Genie.png\" width=\"1350\" loading=\"lazy\" data-ot-ignore=\"1\"\/><br \/>\nFigure 3: The key technical advances in Genie: i) Specialized Knowledge Search, ii) Parallel Thinking, and iii) Multi-LLM that allow for significant improvements in accuracy and latency.<\/p>\n<p>Specialized Knowledge Search<\/p>\n<p>Genie uses existing data assets such as workspace tables, notebooks, dashboards, documents, and files to derive a rich semantic enterprise context and then uses this context to construct a search index. It queries multiple search indices in parallel, together with rich metadata signals, to efficiently discover the most relevant assets for a user query. 
Figure 4 demonstrates how leveraging specialized knowledge search helps Genie improve table search performance by up to 40% on our table discovery benchmarks.<\/p>\n<p><img decoding=\"async\" alt=\"Figure 4: Comparison of Specialized Knowledge Search for Table Search performance.\" data-entity-type=\"file\" data-entity-uuid=\"c9eefcab-7b80-43c7-9eeb-4acc3807a7c8\" height=\"1270\" src=\"https:\/\/www.europesays.com\/ie\/wp-content\/uploads\/2026\/05\/Comparison-of-Specialized-Knowledge-Search-for-Table-Search-performance.png\" width=\"2048\" loading=\"lazy\" data-ot-ignore=\"1\"\/><br \/>\nFigure 4: Comparison of Specialized Knowledge Search for Table Search performance.<\/p>\n<p>Parallel Thinking<\/p>\n<p>Unlike software engineering tasks, where coding agents can first write tests to verify the desired functionality and then iterate on code generation until the tests pass, open-ended data queries have no corresponding unit tests. In the absence of tests, it is challenging for data agents to know whether the generated answer is correct or needs more refinement. To address this challenge, we leverage parallel thinking: sampling multiple trajectories and aggregating relevant information across them to compute the final answer. Figure 5 shows how parallel thinking can significantly improve answer accuracy, although with some additional latency and token costs. 
Furthermore, as shown in Figure 1, combining Multi-LLM and further optimizations can significantly reduce costs and latency.<\/p>\n<p><img decoding=\"async\" alt=\"Figure 5: Adding parallel thinking improves overall performance across both GPT-5.4 and Opus-4.6.\" data-entity-type=\"file\" data-entity-uuid=\"b1842398-2763-47c8-9b26-09ce10215558\" height=\"634\" src=\"https:\/\/www.europesays.com\/ie\/wp-content\/uploads\/2026\/05\/parallel-thinking.png\" width=\"1284\" loading=\"lazy\" data-ot-ignore=\"1\"\/><br \/>\nFigure 5: Adding parallel thinking improves overall performance across both GPT-5.4 and Opus-4.6.<\/p>\n<p>Multi-LLM<\/p>\n<p>One of the key technical advances in Genie is the ability to leverage different LLMs for different sub-agents, because we observe that different LLMs excel at complementary capabilities. For example, Genie can use one LLM for the planning stage, another for the various search sub-agents, and others for code generation and judging. With the Databricks platform, it is seamless to try out any of the frontier models (including Opus, GPT, and Gemini), open-source models, as well as custom trained models. In addition to accuracy, we also observe that different LLMs exhibit very different latency and cost characteristics. 
Figure 6 shows how different LLMs perform on table search tasks and how the corresponding accuracy and cost can be further optimized using methods like <a data-external-link=\"true\" href=\"https:\/\/arxiv.org\/abs\/2507.19457\" target=\"_blank\" rel=\"noopener noreferrer nofollow\">GEPA<\/a>.<\/p>\n<p><img decoding=\"async\" alt=\"Figure 6: Optimizing the accuracy and cost for different LLMs for Table Search using GEPA.\" data-entity-type=\"file\" data-entity-uuid=\"762cc26b-84de-48ac-9695-12c0b5b28c57\" height=\"1449\" src=\"https:\/\/www.europesays.com\/ie\/wp-content\/uploads\/2026\/05\/Optimizing-the-accuracy-and-cost-for-different-LLMs-for-Table-Search-using-GEPA.png\" width=\"2048\" loading=\"lazy\" data-ot-ignore=\"1\"\/><br \/>\nFigure 6: Optimizing the accuracy and cost for different LLMs for Table Search using GEPA.<\/p>\n<p>Conclusion<\/p>\n<p>While coding and data analysis share many conceptual similarities, the dynamic nature of enterprise data systems creates unique challenges. Data agents need to efficiently discover the right assets from a large enterprise context, determine \u201ctruth\u201d in an ambiguous environment, and write efficient code and queries to correctly answer users&#8217; questions. We developed several novel approaches to solve these problems, such as specialized knowledge search to leverage rich semantic information and multiple metadata signals, Multi-LLM to leverage different LLMs with optimized prompts using GEPA, and parallel thinking to further improve overall accuracy. Adding these approaches to Genie helps it perform significantly better than leading coding agents on the benchmark tasks. 
Many challenging open-ended questions remain, and there has never been a more exciting time for research in building state-of-the-art data agents for enterprises.<\/p>\n","protected":false},"excerpt":{"rendered":"Genie is Databricks\u2019 state-of-the-art data agent designed for answering complex questions about enterprise data consisting of both structured&hellip;\n","protected":false},"author":2,"featured_media":475105,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[261],"tags":[291,289,290,18,19,17,82],"class_list":{"0":"post-475104","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-artificialintelligence","11":"tag-eire","12":"tag-ie","13":"tag-ireland","14":"tag-technology"},"share_on_mastodon":{"url":"https:\/\/pubeurope.com\/@ie\/116540167593794480","error":""},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/posts\/475104","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/comments?post=475104"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/posts\/475104\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/media\/475105"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/media?parent=475104"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/categori
es?post=475104"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/tags?post=475104"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}