The fastest path to AI Infrastructure 2.0 is through purpose-built solutions developed within open ecosystems
We’ve crossed the Rubicon into AI Infrastructure 2.0. There’s no going back.
The evidence is everywhere. The latest reasoning models demonstrate breakthrough capabilities through multi-step processing that demands an order of magnitude more compute than traditional inference. Large language models are setting new performance records across every benchmark, and the common thread is clear: scale fuels performance, and performance drives breakthroughs. More computational power results in higher-performance AI, which offers better reasoning, more accurate outputs, and entirely new capabilities that seemed impossible just months ago.
This relentless pursuit of AI model performance has fundamentally changed the infrastructure equation and exposed the limits of traditional compute architectures. AI workloads have shattered the traditional unit of compute, ushering in what we call “AI Infrastructure 2.0,” where all the servers in an entire rack function as a unified computing platform rather than as a collection of individual servers. The continued exponential leap in computational requirements has pushed us beyond what traditional server architecture can handle.
The Infrastructure Transformation
The numbers tell the story of an industry hitting the limits of traditional architecture. Google’s PaLM demanded 6,144 chips in early 2022, already a massive scale by historical standards. Yet just months later, OpenAI’s GPT-4 training run required approximately 25,000 A100 GPUs, a 4x increase that shattered previous assumptions.[1] Meta’s infrastructure evolution illustrates the acceleration: its Research SuperCluster started with 16,000 A100 GPUs in 2022, but by March 2024 the company was operating clusters with 24,576 H100 GPUs each for training Llama 3.[2] The pace has only intensified. xAI brought online a 100,000 H100 GPU system in September 2024, representing a 16x increase in cluster size in roughly two and a half years.[3]
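As a quick sanity check on those multiples, here is a minimal back-of-the-envelope sketch; the inputs are the approximate, publicly reported cluster sizes cited above, with PaLM’s 6,144 chips as the baseline.

```python
# Back-of-the-envelope scaling check using the approximate, publicly reported
# cluster sizes cited above. Baseline: Google PaLM's 6,144 TPU chips (early 2022).
clusters = [
    ("Google PaLM, early 2022",      6_144),   # TPU chips
    ("OpenAI GPT-4, 2022",          25_000),   # A100 GPUs (approx.)
    ("Meta Llama 3 cluster, 2024",  24_576),   # H100 GPUs per cluster
    ("xAI Colossus, Sept 2024",    100_000),   # H100 GPUs
]

baseline_name, baseline_size = clusters[0]
for name, size in clusters:
    multiple = size / baseline_size
    print(f"{name:28s} {size:>8,} accelerators  ({multiple:4.1f}x vs. baseline)")
```

The multiples work out to roughly 4x for the GPT-4 run and 16x for the xAI system, consistent with the figures above.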
The traditional server-centric approach has hit a wall. Despite massive infrastructure investments (hundreds of billions of dollars), these complex AI systems struggle with utilization challenges that threaten the economics of AI deployment. The sheer computational demand is driving a fundamental architectural shift toward larger and faster GPU pods (racks) connected using scale-up networks and multiple racks connected using scale-out networks, creating entirely new infrastructure requirements that traditional server designs simply cannot accommodate. The transformation is especially significant within the rack—modern AI workloads demand such tight coupling and ultra-low latency communication between hundreds of accelerators that the entire rack must function as a single, unified computing unit. Individually networked servers can no longer provide the performance required to run massive AI models efficiently.
Leading infrastructure providers are making this leap, deploying rack-scale solutions with specialized interconnects that treat the rack—not the server—as the fundamental unit of compute. Cloud providers are rolling out purpose-built connectivity solutions that redefine what’s possible in AI architecture, with specialized interconnects operating at 900 GB/s, seven times faster than traditional server connections, while scaling to clusters of 130,000+ GPUs.[4] But the stakes are higher than just technical performance. With billions of dollars invested in AI infrastructure, hyperscalers must derive maximum value from every deployment to justify these massive expenditures. Cloud service providers face intense competitive pressure to deliver superior AI capabilities while maintaining cost-effective operations—making total cost of ownership a critical factor in infrastructure decisions. The winning approach must deliver not just peak performance and reliability, but also economic efficiency that translates into competitive advantage.
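For context on the “seven times faster” comparison, here is a minimal arithmetic sketch. It assumes the traditional server connection is a PCIe Gen 5 x16 link at roughly 128 GB/s; that baseline is our assumption for illustration, not a figure quoted above.

```python
# Rough arithmetic behind the "seven times faster" comparison above.
# Assumption (not stated above): the traditional server connection is a
# PCIe Gen 5 x16 link at roughly 128 GB/s of bidirectional bandwidth.
scale_up_link_gb_s = 900    # GB/s, specialized rack-scale interconnect cited above
pcie_gen5_x16_gb_s = 128    # GB/s, assumed traditional server connection

print(f"Speedup vs. assumed baseline: {scale_up_link_gb_s / pcie_gen5_x16_gb_s:.1f}x")
# -> Speedup vs. assumed baseline: 7.0x
```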
This is where open standards come in: delivering not only peak performance and reliability through purpose-built open protocols, but also enabling custom silicon, merchant GPUs, and specialized accelerators to utilize the same open rack infrastructure seamlessly.
Open Ecosystems Enable the Future
At Astera Labs, we believe the fastest path to rack-scale transformation lies in purpose-built solutions developed within open ecosystems. When companies collaborate on common standards, innovation happens in parallel rather than in isolation. Open ecosystems foster robust supply chains that instill confidence and drive competitive advancement. Most importantly: open standards future-proof infrastructure investments as AI technology continues its relentless evolution.
Our board-level participation in UALink, CXL, and other critical standards organizations reflects our commitment to collaborative development. UALink exemplifies this approach—hyperscalers and technology leaders collaborating on a purpose-built scale-up protocol that combines PCIe’s low latency with Ethernet’s data rates while avoiding vendor lock-in. With broad industry support, UALink represents a multi-billion dollar market opportunity and a path toward a scalable, interoperable AI ecosystem.
“At Astera Labs, we believe the fastest path to rack-scale transformation lies in purpose-built solutions developed within open ecosystems.”
We collaborate with hyperscalers on next-generation rack designs, work with XPU vendors to integrate their accelerators, and partner with system builders on reference implementations. Our COSMOS software suite provides the intelligence layer that helps hyperscalers efficiently manage connectivity infrastructure at the rack level. We don’t just connect components—we connect companies, standards, and technologies to create unified solutions that outperform any single-vendor approach.
Join the Transformation
The rack-scale transformation is already underway. Leading hyperscalers and platform providers are deploying purpose-built connectivity solutions with results that speak for themselves: higher utilization, better performance, and the flexibility to adapt as AI workloads continue evolving.
AI Infrastructure 2.0 is being built right now, and the companies that embrace open, purpose-built solutions will define the next decade of AI advancement. Open ecosystems with robust supply chains always endure. The question isn’t whether this transformation will happen—it’s whether your infrastructure strategy will be ready for it.
Join us in accelerating this transformation.
[1] Epoch AI, “Key Trends and Figures in Machine Learning,” accessed June 2025, https://epoch.ai/trends
[2] Meta, “Building Meta’s GenAI Infrastructure,” Engineering at Meta, March 12, 2024, https://engineering.fb.com/2024/03/12/data-center-engineering/building-metas-genai-infrastructure/
[3] Fortune, “Elon Musk’s xAI launched Colossus, billed as the world’s largest AI training cluster,” September 3, 2024, https://fortune.com/2024/09/03/elon-musk-xai-nvidia-colossus/
[4] Epoch AI, “Trends in Machine Learning Hardware,” November 9, 2023, https://epoch.ai/blog/trends-in-machine-learning-hardware & Oracle, “AI Infrastructure,” Oracle Cloud Infrastructure, accessed June 2025, https://www.oracle.com/ai-infrastructure/