Unlocking AI Performance: The Role of Scale-Up Interconnects in Accelerated Computing

Jitendra Mohan, Chief Executive Officer, Co-Founder

The rapid advancement of AI, fueled by increasingly complex training and inference workloads and by widespread adoption across industries, is the primary driver of computing demand. The rise of reasoning-based, multi-step AI inference workloads further accelerates this trend. To meet the needs of compute-intensive applications such as agentic and autonomous AI, high-performance GPU clusters have become essential.

These emerging AI workloads and GPU clusters demand high-bandwidth, low-latency scale-up networks with native memory-semantics support to synchronize hundreds of accelerators at rack scale.

The Breadth of Scale-Up Protocols: NVLink, PCIe®, Ethernet, and UALink™

A scale-up fabric is an advanced interconnect architecture designed to seamlessly link multiple GPUs into a unified computing system, effectively functioning as one massive GPU. This approach is essential for large-scale AI deployments where the demands for shared memory access and collaborative processing across GPUs are critical.

The ideal scale-up network exhibits three key characteristics: low latency for swift communication between GPUs, high bandwidth to handle the immense data throughput of AI workloads, and native memory semantics to ensure seamless data sharing across interconnected GPUs. Additionally, integrating in-network compute capabilities can further enhance efficiency by offloading specific operations, such as reductions or aggregations, to the network itself. Together, these features make scale-up networks indispensable for the collaborative, synchronized tasks that cutting-edge AI applications require.
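To make the in-network compute benefit concrete, below is a minimal back-of-the-envelope model in Python. It compares a conventional software ring all-reduce, which needs 2(N-1) communication steps, against a switch-offloaded reduction in which each GPU sends its buffer into the fabric once and receives the reduced result once. The link rate, hop latency, and buffer sizes are illustrative assumptions, not figures for any particular interconnect.

```python
# Back-of-the-envelope model: software ring all-reduce vs. an in-network
# (switch-offloaded) reduction. All constants are illustrative assumptions.

def ring_allreduce_us(n_gpus, msg_bytes, link_gbps, hop_latency_us):
    """Ring all-reduce: 2*(N-1) steps, each moving msg_bytes/N over a link."""
    steps = 2 * (n_gpus - 1)
    per_step_us = hop_latency_us + (msg_bytes / n_gpus) * 8 / (link_gbps * 1e3)
    return steps * per_step_us

def in_network_reduce_us(msg_bytes, link_gbps, hop_latency_us, switch_hops=2):
    """Switch-offloaded reduction: send the buffer up once, get the result back once."""
    wire_us = msg_bytes * 8 / (link_gbps * 1e3)
    return 2 * (switch_hops * hop_latency_us + wire_us)

if __name__ == "__main__":
    n, gbps, hop = 64, 800, 1.0              # 64 GPUs, 800 Gb/s links, 1 us per hop
    for size in (64 * 1024, 1024 * 1024):    # 64 KiB and 1 MiB buffers
        ring = ring_allreduce_us(n, size, gbps, hop)
        innet = in_network_reduce_us(size, gbps, hop)
        print(f"{size >> 10:5d} KiB  ring: {ring:8.1f} us   in-network: {innet:6.1f} us")
```

Under these assumptions the gap is widest for small, latency-sensitive collectives, which is exactly where in-network reduction offloads pay off.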

Several scale-up interconnect protocols are used in modern GPU clusters today, and next-generation protocols are evolving to further optimize AI infrastructure. These include:

  • NVLink, developed and widely deployed by NVIDIA, enables fast, memory-semantic communication between GPUs with the low latency and high bandwidth required for scale-up. NVIDIA SHARP (Scalable Hierarchical Aggregation and Reduction Protocol) enhances NVLink AI scale-up infrastructure by offloading collective operations, such as reductions, into the network itself.
  • PCI Express® (PCIe) is widely adopted for its low latency and support for memory semantics, making it suitable for many accelerator workloads.
  • Ethernet offers high data rates and scalability, but it lacks native memory semantics. Data must therefore be packed into packets for transmission and unpacked on receipt, which adds latency and reduces the effective bandwidth of GPU-to-GPU communication (see the sketch after this list).
  • Ultra Accelerator Link™ (UALink) was recently launched as an open industry-standard protocol purpose-built to deliver an optimized scale-up interconnect with the best of both worlds: the low latency and memory semantics of PCIe and the high data rate of Ethernet.
  • Beyond the widely recognized protocols such as NVLink, PCIe, Ethernet, and UALink, other proprietary scale-up interconnect solutions are also deployed across the industry. These protocols, which run on top of either PCIe or Ethernet, are often tailored to meet the unique requirements of specific GPUs or AI accelerators, as well as the demands of particular applications.
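
To illustrate the pack-and-unpack overhead noted in the Ethernet bullet above, here is a toy Python sketch. It contrasts a memory-semantic write, modeled as a plain store into a mapped address window, with a packetized write that must be serialized by the sender and then parsed and copied by the receiver. The buffer and helper names are hypothetical, and the message layout is invented for illustration; this is a conceptual sketch, not any protocol's actual wire format.

```python
# Toy contrast: memory-semantic store vs. packetized write. Names and the
# message layout are illustrative, not any real protocol's format.
import struct

remote_hbm = bytearray(1024)            # stand-in for a peer GPU's memory window

# --- Load/store fabric (NVLink/PCIe/UALink style): one direct store ---
window = memoryview(remote_hbm)         # the peer's memory, mapped into our space
window[64:72] = struct.pack("<d", 3.14) # a remote write is just a store

# --- Packetized transport (plain Ethernet style): pack -> send -> unpack ---
def pack_write(addr, value):
    """Sender serializes (address, length, payload) into a message."""
    return struct.pack("<QId", addr, 8, value)

def apply_write(buf, msg):
    """Receiver parses the message and copies the payload into memory."""
    addr, length, value = struct.unpack("<QId", msg)
    buf[addr:addr + length] = struct.pack("<d", value)

msg = pack_write(128, 2.71)             # extra sender-side serialization work
apply_write(remote_hbm, msg)            # extra receiver-side parse-and-copy work
```

Each packetized update pays for serialization, header bytes, and a receive-side copy; a load/store fabric avoids all three, which is why memory semantics matter at scale-up latencies.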

Astera Labs’ Expanding Portfolio: Enabling the Future of Scale-Up

Astera Labs has been at the forefront of accelerated computing deployments, and we are leading the charge for the AI scale-up transformation by continuing to expand our portfolio to include connectivity solutions for most of the leading scale-up protocols.

This expansion will address the growing diversity of accelerators and GPUs in the AI ecosystem, offering AI datacenter customers more optionality to optimize their scale-up infrastructure with best-of-breed connectivity solutions that address their platform-specific needs.

  • NVLink: Expanding our long-standing relationship with NVIDIA, Astera Labs will provide scale-up connectivity solutions for the new NVLink™ Fusion ecosystem targeting custom compute solutions. 
  • PCIe: Our Scorpio X-Series Fabric Switches support PCIe and platform-specific scale-up connectivity needs. These fabric switches, coupled with our Aries PCIe Smart Retimers and Smart Cable Modules (SCMs) for Active Electrical Cables (AECs), form an efficient and easy-to-deploy PCIe-based scale-up fabric.
  • Ethernet: Our Taurus Ethernet Smart Retimers extend Ethernet scale-up connectivity in backplane applications, and our Taurus SCMs enable affordable active copper cable applications.
  • UALink: To support hyperscalers that aim to deploy scale-up GPU clusters based on open standards, we will offer a complete portfolio of UALink scale-up connectivity solutions as the landscape of GPUs and AI accelerators integrates this interconnect option.

Summary

By continuing to expand our scale-up connectivity portfolio to include NVLink, PCIe, Ethernet, and UALink, Astera Labs is positioned to address the diverse needs of modern AI scale-up infrastructure. Our diversified portfolio provides hyperscale customers with greater optionality, allowing them to choose the best scale-up interconnect solution for their specific accelerator, GPU, and data center rack-scale configurations.

About Jitendra Mohan, Chief Executive Officer, Co-Founder

Jitendra co-founded Astera Labs in 2017 with a vision to remove performance bottlenecks in data-centric systems. Jitendra has more than two decades of engineering and general management experience in identifying and solving complex technical problems in datacenter and server markets. Prior to Astera Labs, he worked as the General Manager for Texas Instruments’ High Speed Interface Business and Clocking Business. Earlier at National Semiconductor Corp, Jitendra led engineering teams in various technical leadership roles. Jitendra holds a BSEE from IIT-Bombay, an MSEE from Stanford University and over 35 granted patents. In addition to work, Jitendra enjoys outdoor activities and reading about the origins of the Universe.
