
AI processing is undergoing a significant transformation, reshaping how and where workloads are managed. Traditionally, AI models have relied heavily on centralised cloud computing for training and inference. However, advancements in AI hardware are enabling more localised processing, particularly with new chip designs addressing enterprise AI use cases.

In recent years, the AI landscape has evolved rapidly, driven by a convergence of new computing paradigms, changing enterprise needs, and global digital transformation initiatives. Businesses are no longer asking whether to adopt AI; they’re asking how best to deploy it for maximum efficiency and value.

To meet growing demand for faster, more efficient AI processing, hardware designers have developed next-generation processors that enhance on-device AI capabilities, reducing reliance on the cloud for common inference tasks. However, the reality is more nuanced. AI workloads are unlikely to shift entirely to local devices. Instead, the future lies in balancing on-device acceleration, near-edge computing, and cloud-based AI.

As AI becomes increasingly integrated into everyday applications, businesses and technology providers must revisit their infrastructure strategies to ensure both performance and efficiency at scale.

Shifts in computing trends: on-premises, data centres, and specialised chips

The technology industry has cycled between on-premises, cloud, and hybrid models as business needs have evolved. With the growing adoption of AI, investment in specialised hardware has increased.

For instance, Nvidia’s GPUs, built around dedicated Tensor Cores, deliver high-performance acceleration for deep learning. Additionally, integrated neural processing units (NPUs) in AI-enabled PCs now support real-time, low-latency processing for common AI tasks. These developments reflect a broader effort to bring compute closer to data, especially for latency-sensitive use cases.
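As a rough illustration of how an application might target an integrated NPU while keeping a CPU fallback, the sketch below uses ONNX Runtime execution providers. The model path and the preferred provider names are assumptions; which accelerator-backed provider is actually available depends on the device, operating system, and ONNX Runtime build installed.

```python
# Minimal sketch: run a local inference session on an NPU-capable
# execution provider if one is present, otherwise fall back to the CPU.
# "model.onnx" and the preferred provider names are placeholders.
import numpy as np
import onnxruntime as ort

# Providers commonly associated with on-device accelerators; adjust for
# the target platform (e.g. Qualcomm QNN on Arm laptops, DirectML on Windows).
PREFERRED = ["QNNExecutionProvider", "DmlExecutionProvider"]

available = ort.get_available_providers()
providers = [p for p in PREFERRED if p in available] + ["CPUExecutionProvider"]

session = ort.InferenceSession("model.onnx", providers=providers)

# Dummy input shaped for a hypothetical image model; real inputs depend
# on the model's declared input names and shapes.
input_name = session.get_inputs()[0].name
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)

outputs = session.run(None, {input_name: dummy})
print("Ran on:", session.get_providers()[0], "| output shape:", outputs[0].shape)
```

Keeping the CPU provider at the end of the list means the same code path works whether or not an accelerator is present, which fits the hybrid posture described here.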

Despite these advancements, there are inherent limitations to fully localised AI processing:

  • AI workloads vary significantly, and not all tasks can be efficiently executed on a single device.
  • NPUs, while capable, are constrained in compute and memory, leaving large-scale AI model training and complex inference dependent on cloud infrastructure.

Moreover, even with on-device enhancements, the energy demands, thermal limitations, and physical size constraints of edge devices create natural boundaries for local processing. This makes strategic offloading to the cloud or near-edge systems a necessary component of modern AI architectures, and underscores why infrastructure trends continue to evolve in cycles in response to technological change.

Organisations must focus on optimising AI architectures to determine where processing should occur: whether to move data to the processor (cloud-based AI) or to move the processor closer to the data (on-device AI and edge computing).
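To make that trade-off concrete, the brief sketch below compares a naive estimate of the time needed to move the data to a remote processor against the time needed to process it locally. Every figure and threshold is an illustrative assumption rather than a measurement.

```python
# Illustrative sketch of the "move the data or move the compute" decision.
# All figures are assumed placeholders; a real deployment would measure
# payload sizes, link bandwidth, and per-device inference latency.

def place_workload(payload_mb: float, uplink_mbps: float,
                   local_ms: float, cloud_ms: float) -> str:
    """Return 'local' or 'cloud' based on a naive end-to-end latency estimate."""
    transfer_ms = (payload_mb * 8 / uplink_mbps) * 1000  # time to ship the data
    return "local" if local_ms <= transfer_ms + cloud_ms else "cloud"

# A large camera frame over a constrained uplink favours on-device processing...
print(place_workload(payload_mb=5.0, uplink_mbps=20, local_ms=40, cloud_ms=15))     # local
# ...while a small prompt against a model too large for the device favours the cloud.
print(place_workload(payload_mb=0.01, uplink_mbps=20, local_ms=900, cloud_ms=120))  # cloud
```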

The role of distributed AI and near-edge computing

Rather than choosing between local or cloud, businesses are increasingly adopting distributed AI models that integrate both approaches. This hybrid setup — a modern iteration of earlier decentralised models — offers greater flexibility and efficiency.

According to Gartner’s 2024 Emerging Tech Impact Radar, over 70% of enterprises will deploy AI inference at the edge by 2027, reducing reliance on centralised cloud infrastructure. Near-edge computing serves as a bridge, reducing data transfer costs and enhancing responsiveness.

There are several key benefits to using a hybrid AI infrastructure; a brief sketch after this list shows how they can inform workload placement. These include:

  • Scalability – AI workloads can shift dynamically between on-device, edge, and cloud computing based on processing needs.
  • Efficiency – Low-latency AI tasks can be processed locally, while more computationally intensive workloads leverage cloud-based resources.
  • Security and compliance – Sensitive AI workloads can be processed closer to the source, reducing exposure risks associated with transmitting data to the cloud.
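One way to read these benefits is as inputs to a placement policy. The sketch below is a simplified, assumed decision rule: the thresholds, tier names, and the notion of "compute units" are illustrative, but it shows how latency budgets, data sensitivity, and compute demand can jointly decide where a task runs.

```python
# Simplified sketch of a hybrid placement policy. Thresholds, tier names,
# and "compute units" are illustrative assumptions, not real benchmarks.
from dataclasses import dataclass

@dataclass
class Task:
    latency_budget_ms: float   # how quickly a result is needed
    sensitive: bool            # regulated or private data?
    compute_units: float       # rough proxy for model size and work

def route(task: Task) -> str:
    if task.sensitive:
        # Security and compliance: keep regulated data close to the source.
        return "device" if task.compute_units < 10 else "edge"
    if task.latency_budget_ms < 50 and task.compute_units < 10:
        # Efficiency: low-latency, lightweight work stays on the device.
        return "device"
    if task.compute_units < 100:
        # Scalability: medium workloads shift to a near-edge node.
        return "edge"
    return "cloud"  # heavy training or complex inference

print(route(Task(latency_budget_ms=30, sensitive=False, compute_units=2)))      # device
print(route(Task(latency_budget_ms=500, sensitive=True, compute_units=50)))     # edge
print(route(Task(latency_budget_ms=2000, sensitive=False, compute_units=500)))  # cloud
```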

Will AI workloads shift entirely to local devices?

While AI PCs and on-device processing represent meaningful progress, AI workloads are expected to remain hybrid. Several factors influence how these workloads are distributed:

  • Network and connectivity – AI PCs reduce bandwidth demands, but cloud AI remains essential for running large-scale models.
  • Security and compliance – In regulated industries, local processing may be preferred, though enterprise AI often still requires cloud-based insights and training.
  • Hardware constraints – Local devices have limited processing power, making cloud AI indispensable for training large language models and generative AI applications.

Emerging technologies, from smaller optimised models to more capable edge accelerators, are enabling new approaches to edge-based AI. These developments are encouraging businesses to explore flexible architectures that incorporate local, edge, and cloud computing in a coordinated manner.

As businesses navigate the evolving AI landscape, they will need to consider:

  • What are the latency requirements of their AI applications?
  • What are their data privacy and compliance needs?
  • What are the cost implications of different AI processing models?

The future of AI compute is hybrid

AI-enabled PCs and NPUs represent a substantial leap, but they do not eliminate the need for cloud-based AI. As workloads grow in complexity, businesses must adopt flexible architectures that distribute processing intelligently across multiple environments.

Organisations should focus on optimising AI workload placement by balancing real-time responsiveness, cost efficiency, and performance scalability. The future of AI compute is not about choosing between local and cloud processing; it is about integrating both to achieve the best outcomes in performance, security, and efficiency.

Ultimately, today’s shifts reflect a familiar pattern: as new workloads emerge, infrastructure models adapt. With AI, we’re entering the next phase of that ongoing cycle.