{"id":316190,"date":"2026-02-02T09:20:11","date_gmt":"2026-02-02T09:20:11","guid":{"rendered":"https:\/\/www.europesays.com\/ie\/316190\/"},"modified":"2026-02-02T09:20:11","modified_gmt":"2026-02-02T09:20:11","slug":"solving-real-world-ai-bottlenecks","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/ie\/316190\/","title":{"rendered":"Solving Real-World AI Bottlenecks"},"content":{"rendered":"<p>\t\t\t\t\t<a href=\"https:\/\/semiengineering.com\/category-main-page-sld\/\" rel=\"nofollow noopener\" target=\"_blank\">Systems &amp; Design<\/a><\/p>\n<p>SPONSOR BLOG<\/p>\n<p>Tightly coordinated data movement and low-latency on-chip storage for real-time environments.<\/p>\n<p>\t\t\t\t\t\t\t<img decoding=\"async\" class=\"pull-right\" alt=\"popularity\" title=\"popularity\" src=\"https:\/\/www.europesays.com\/ie\/wp-content\/uploads\/2025\/10\/pop_rating_lev1.png\"\/><\/p>\n<p>The race to build smarter and faster AI chips continues to surge. This is especially true in autonomous vehicles that interpret the world in milliseconds, edge accelerators that push trillions of operations per second, hyperscale data-center processors that drive massive workloads, and next-generation consumer devices that demand ever-higher intelligence. As modern system-on-chip (SoC) architectures become increasingly complex, they produce rapidly growing volumes of on-chip data. Managing this data requires increasingly efficient movement, storage, and access. Insufficient data delivery rates create bottlenecks that restrict overall system responsiveness. Cutting-edge designs require tremendous throughput, but they often sit idle because the data arrives too slowly. The result is data starvation. 
It erodes performance and makes latency unpredictable.<\/p>\n<p>With system interconnects and memory hierarchies now influencing performance as much as computation itself, SoC teams increasingly rely on FlexGen- and FlexNoC-optimized data transport, paired with intelligent on-chip caching, to maintain responsiveness.<\/p>\n<p><img data-recalc-dims=\"1\" fetchpriority=\"high\" decoding=\"async\" class=\"alignnone size-full wp-image-24271656\" src=\"https:\/\/www.europesays.com\/ie\/wp-content\/uploads\/2026\/02\/Arteris_Solving-Real-World-AI-Bottlenecks-fig1.webp.png\" alt=\"\" width=\"1872\" height=\"778\"  \/><\/p>\n<p><strong>Fig. 1: The role of last-level cache in successful designs. (Source: <a href=\"https:\/\/bit.ly\/3KcLBOH\" rel=\"nofollow noopener\" target=\"_blank\">Arteris, Inc.<\/a>)<\/strong><\/p>\n<p>A modern LLC for a modern SoC<\/p>\n<p>As these bottlenecks intensify, SoC designers are increasingly turning to on-chip memory resources to keep data close to compute. Reducing reliance on off-chip memory and minimizing wait cycles is essential to overcoming data starvation across diverse workloads. One of the most effective architectural tools for achieving this is a shared, on-chip last-level cache (LLC).<\/p>\n<p>While an LLC plays a vital role in today\u2019s SoCs, determining the correct configuration is a complex architectural challenge. For example, parameters such as the number of banks, parallel access capabilities, and partitioning strategies all influence the cache\u2019s efficiency under real workloads.<\/p>\n<p>Oversized LLCs add unnecessary silicon cost, while undersized LLCs fail to deliver meaningful benefits. 
Achieving the optimal balance requires detailed traffic analysis and workload simulation to ensure the cache is tailored precisely to the system\u2019s data movement patterns.<\/p>\n<p>Design teams must consider:<\/p>\n<ol>\n<li>Capacity: enough on-chip storage for critical data while staying within area and frequency constraints.<\/li>\n<li>Hit and miss behavior: how the cache is organized and sized determines how effectively data is retained close to compute.<\/li>\n<li>Eviction policy: replacement must align with workload characteristics so that frequently reused data remains resident while less valuable lines are evicted efficiently.<\/li>\n<li>Scratchpad versus cache operation: this choice affects determinism and flexibility, balancing software-managed control against hardware-managed transparency.<\/li>\n<li>Parallelism: the number of simultaneous accesses the design must support drives decisions about banking, ports, and internal concurrency.<\/li>\n<li>Fairness and quality of service: multiple requestors must receive predictable access to shared resources without starving lower-priority clients.<\/li>\n<\/ol>\n<p>An essential ingredient in today\u2019s SoCs<\/p>\n<p><a href=\"https:\/\/bit.ly\/3KcLBOH\" rel=\"nofollow noopener\" target=\"_blank\">Arteris<\/a> is uniquely qualified to address these challenges because data movement is the company\u2019s core expertise. Leveraging this foundation, the company created <a href=\"https:\/\/www.arteris.com\/products\/last-level-cache-ip\/codacache\/?utm_campaign=Semiconductor%20Engineering&amp;utm_source=semiengineering\" rel=\"nofollow noopener\" target=\"_blank\">CodaCache<\/a> to mitigate the widening gap between rapidly advancing processors and comparatively slow main-memory access. 
By storing relevant in-flight data on-chip and coordinating efficiently with external DRAM, it provides a high-bandwidth, power-efficient reservoir that keeps critical data close to compute.<\/p>\n<p>When\u00a0paired with FlexGen and FlexNoC, CodaCache complements the interconnect\u2019s ability to route traffic efficiently across heterogeneous systems:<\/p>\n<ul>\n<li>DRAM accesses\u2014often hundreds of cycles long\u2014are dramatically reduced through higher hit rates, yielding lower latency and improved bandwidth utilization.<\/li>\n<li>FlexGen and FlexNoC\u2019s topology-aware routing ensures low-contention access paths into CodaCache, improving average-latency behavior across varied workloads.<\/li>\n<li>System simulations of the pairing show:\n<ul>\n<li><strong>Significant latency reduction<\/strong> as hit rates increase, with even moderate hit rates (25%) providing ~13% latency benefit over DRAM-only flows.<\/li>\n<li><strong>DDR traffic reduction of up to 25\u201330%<\/strong>, lowering power and improving efficiency.<\/li>\n<li><strong>Improved bandwidth<\/strong>, especially when short bursts benefit from CodaCache prefetch behavior.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><img loading=\"lazy\" data-recalc-dims=\"1\" decoding=\"async\" class=\"alignnone size-full wp-image-24271657\" src=\"https:\/\/www.europesays.com\/ie\/wp-content\/uploads\/2026\/02\/Arteris_Solving-Real-World-AI-Bottlenecks-fig2.webp.png\" alt=\"\" width=\"1872\" height=\"778\"  \/><\/p>\n<p><strong>Fig. 2: SoC performance is dependent upon data availability. 
(Source: <a href=\"https:\/\/bit.ly\/3KcLBOH\" rel=\"nofollow noopener\" target=\"_blank\">Arteris, Inc.<\/a>)<\/strong><\/p>\n<p>CodaCache is designed for seamless integration into heterogeneous network-on-chip architectures, delivering a well-balanced combination of capabilities.<\/p>\n<ul>\n<li><strong>Large capacity<\/strong> for storing high-value data: CodaCache supports multi-instantiation and up to 8 MB per AXI port, with configurable cache-line size, associativity, and cache organization so that it can be sized appropriately for different SoC workloads.<\/li>\n<li><strong>Shared accessibility<\/strong> for multiple compute engines: a shared LLC that processors and accelerators can access concurrently, with flexible partitioning to support either shared or isolated usage while reducing external memory traffic.<\/li>\n<li><strong>Significantly reduced latency and power consumption<\/strong> by keeping frequently accessed data on-chip, improving overall system responsiveness.<\/li>\n<\/ul>\n<p>The result is a subsystem that meaningfully boosts performance across advanced AI compute markets.<\/p>\n<p>As SoC architectures become increasingly complex, with growing numbers of cores, accelerators, concurrent workloads, and chiplets, the need for intelligent, scalable, and automated data management infrastructure continues to grow.<\/p>\n<p>It is no longer enough to build faster compute engines. Designers must feed them efficiently. 
The combination of\u00a0FlexGen\/FlexNoC and CodaCache\u00a0represents a new class of system IP that acknowledges this shift, delivering tightly coordinated data movement and low-latency on-chip storage for demanding real-time environments.<\/p>\n<p>\t\t\t\t\t\t\tAndr\u00e9 Bonnardot \u00a0\u00a0<a href=\"https:\/\/semiengineering.com\/author\/andre-bonnardot\/\" rel=\"nofollow noopener\" target=\"_blank\">(all posts)<\/a><\/p>\n<p>\t\t\t\t\t\t\tAndr\u00e9 Bonnardot is Senior Manager of Product Management at Arteris, where he leads Cache Controller solutions and next-generation Network-on-Chip products. Formerly CEO of a semiconductor startup specializing in GaN epitaxy, he brings deep expertise in SoCs and microelectronics. His career includes design and leadership roles at Alcatel, Siemens, Infineon, and Intel. Bonnardot holds a master\u2019s degree in Electronics from ENSERG Engineering School in Grenoble and an Executive MBA from KEDGE Business School.<\/p>\n","protected":false},"excerpt":{"rendered":"Systems &amp; Design SPONSOR BLOG Tightly coordinated data movement and low-latency on-chip storage for real-time environments. 
The race&hellip;\n","protected":false},"author":2,"featured_media":316191,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[74],"tags":[291,153606,153607,18,19,17,153608,152113,82],"class_list":{"0":"post-316190","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-technology","8":"tag-ai","9":"tag-arteris","10":"tag-cache","11":"tag-eire","12":"tag-ie","13":"tag-ireland","14":"tag-last-level-cache","15":"tag-soc","16":"tag-technology"},"share_on_mastodon":{"url":"https:\/\/pubeurope.com\/@ie\/116000301010793749","error":""},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/posts\/316190","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/comments?post=316190"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/posts\/316190\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/media\/316191"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/media?parent=316190"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/categories?post=316190"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/ie\/wp-json\/wp\/v2\/tags?post=316190"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}