AI Inferencing: Chatbot Services
Boost GenAI Performance, Improve xPU Efficiency, and Host More Instances
Limited Memory Capacity for GPU-Accelerated Chatbots
Without CXL
Challenge:
- AI assistants rely heavily on GPUs for rapid token generation
- GPU utilization drops whenever data must be fetched from disk
- System memory becomes increasingly constrained as the user base grows
- Limited memory caps the context window an AI service can offer (see the sizing sketch below)
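To see why the context window is memory-bound, consider the KV cache that a transformer must keep resident for every active conversation. The sketch below uses illustrative model parameters (a 70B-class model with 80 layers, grouped-query attention with 8 KV heads, head dimension 128, FP16 values); none of these figures come from this brief, but the arithmetic shows how quickly the footprint outgrows local DRAM.

```python
# Back-of-the-envelope KV-cache sizing for a transformer chatbot.
# All model parameters are illustrative assumptions, not figures from
# this brief: a 70B-class model, 80 layers, grouped-query attention
# with 8 KV heads, head dimension 128, FP16 (2-byte) keys and values.

LAYERS = 80
KV_HEADS = 8
HEAD_DIM = 128
BYTES_PER_VALUE = 2  # FP16

# Bytes of KV cache per token: keys + values across every layer.
KV_PER_TOKEN = LAYERS * KV_HEADS * HEAD_DIM * 2 * BYTES_PER_VALUE

def kv_cache_gib(context_tokens: int, sessions: int) -> float:
    """Total KV-cache footprint in GiB for concurrent chat sessions."""
    return context_tokens * sessions * KV_PER_TOKEN / 2**30

if __name__ == "__main__":
    for ctx in (4_096, 32_768, 131_072):
        print(f"{ctx:>7} tokens x 64 sessions -> "
              f"{kv_cache_gib(ctx, 64):8.1f} GiB of KV cache")
```

Under these assumptions, a 131,072-token window with 64 concurrent sessions needs roughly 2.5 TiB of KV cache alone, which is exactly the regime where per-socket DRAM runs out and requests start spilling to disk.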
Expanded Memory Capacity for GPU-Accelerated Chatbots
With CXL
Solution:
- Up to 40% higher CPU utilization, 40% faster insights, and 200% more hosted instances
- Enlarge the context window with up to 4 TB of memory per socket
- Boost token generation with Leo CXL Smart Memory Controllers and an LLM engine
- Reduce latency and CPU overhead with zero storage I/O
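On Linux hosts, CXL-attached expansion memory such as the capacity behind a Leo controller typically enumerates as a CPU-less ("memory-only") NUMA node. The sysfs walk below is a minimal sketch of how a serving stack might discover that capacity; the node layout it prints depends entirely on the host, and nothing in it is specific to Leo.

```python
# Minimal sketch: list NUMA nodes via sysfs and flag memory-only nodes,
# which is how CXL-attached expansion memory usually appears on Linux.
# Node numbering and the presence of CXL memory are host-specific
# assumptions, not guarantees.

from pathlib import Path

def list_numa_nodes() -> None:
    nodes = Path("/sys/devices/system/node").glob("node[0-9]*")
    for node in sorted(nodes, key=lambda p: int(p.name[4:])):
        cpulist = (node / "cpulist").read_text().strip()
        meminfo = (node / "meminfo").read_text()
        total_kib = int(meminfo.split("MemTotal:")[1].split()[0])
        kind = "CPU+memory" if cpulist else "memory-only (possible CXL)"
        print(f"{node.name}: {total_kib / 2**20:7.1f} GiB  [{kind}]")

if __name__ == "__main__":
    list_numa_nodes()
```

Once the memory-only node is identified, allocations can be steered to it with standard tooling, for example launching the inference server under `numactl --membind=<node>`, so hot model weights stay in GPU HBM and local DRAM while the large, colder KV cache lives in CXL memory with no storage I/O on the path.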