Find answers to the most frequently asked questions about our products and technologies.
FAQs
400G/800G Ethernet FAQ
Aries SCMs are an offering within Astera Labs’ COSMOS suite that enables baseboard/system management controllers (BMCs/SMCs) to utilize an array of customizable diagnostics and telemetry features for continuous monitoring of critical server-to-JBOG, JBOG-to-JBOG, and Switch-to-JBOG links. Parameters such as eye opening, equalization levels, junction temperature, and more are monitored, and interrupts to the host can be enabled whenever configurable limits are crossed. A full set of self-test features—host-side and line-side loopback, pseudo-random bit sequence (PRBS) generation and checking, etc.—enables rapid troubleshooting to minimize link downtime and accelerate fault isolation.
- NRZ is a modulation technique that uses two voltage levels to represent logic 0 and logic 1. PAM4 uses four voltage levels to represent the four combinations of two bits: 11, 10, 01, and 00.
- PAM4 has the advantages of halving the Nyquist frequency and doubling the throughput for the same Baud rate. This alleviates the need for designers to develop infrastructure, such as silicon and cables, with bandwidth up to 50 GHz.
- The SNR loss of a PAM4 signal compared to an NRZ signal is ~9.5 dB.
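Both figures fall out of simple arithmetic; a minimal sketch of that math (plain Python, no vendor tooling assumed):

```python
import math

# PAM4 splits the same voltage swing into 3 eye openings instead of 1,
# so each eye is 1/3 the amplitude of the NRZ eye: 20*log10(3) ~ 9.5 dB.
snr_penalty_db = 20 * math.log10(3)

# For the same throughput, PAM4 needs only half the symbol (Baud) rate,
# which halves the Nyquist frequency of the signal.
bit_rate_gbps = 100
nrz_nyquist_ghz = bit_rate_gbps / 2    # 50 GHz
pam4_nyquist_ghz = bit_rate_gbps / 4   # 25 GHz

print(f"PAM4 SNR penalty vs NRZ: {snr_penalty_db:.1f} dB")
print(f"NRZ Nyquist: {nrz_nyquist_ghz} GHz, PAM4 Nyquist: {pam4_nyquist_ghz} GHz")
```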
We offer complete CMIS Firmware update procedures in the product datasheet.
A user can update module management functions, adaptation algorithms, and full-module firmware even after the cable is deployed to the switch system.
Taurus offers various firmware and setting updates to adapt to diverse system topologies, including firmware flexibility, in-field upgrade support, health monitoring and debug, and CMIS extension.
Taurus Smart Cable Modules’ advanced fleet management capabilities include Full CMIS Features, Security, and Extensive Diagnostics (Cable Degradation Monitoring, Host-Cable Security, Multiple Loopback Modes, and Pattern Generation/Checking).
For a typical 3 m, 34 AWG copper cable, the channel loss is about 28 dB at 12.9 GHz, but it can be as high as 36 dB in worst-case conditions.
A Taurus Smart Cable Module with gearbox capability can be used on the NIC to resolve the per-lane rate disparity and reduce the end-to-end channel loss, thereby increasing the cable reach and/or reducing cable gauge.
Smart Electrical Cables (SEC) support longer reach and thinner cabling while adding security and diagnostic capability.
- Rate mismatches between NIC and switch lead to wasted switch bandwidth.
- Traditional DAC interconnects are too short, thick and bulky to handle high speed ethernet signals between ToR switches and multiple racks.
Taurus Smart Cable Modules can provide gearbox functionality at 200GbE from 4x50G to 8x25G.
- Switch-to-server: ToR Switches to Network Interface Cards (NIC) interconnects on a server.
- Switch-to-switch: within a spine switch and spine switch to Exit Leaf interconnects.
- Active Optical Cables (AOC) can be used for rate conversion and to achieve thin wire profile. However, such optical designs incur additional costs, reliability concerns, and require more power.
- Active Copper Cables (ACC) can be used for rate conversion, have a lower design cost when compared to AOC while also supporting even thinner gauge cabling as compared to passive DACs. General purpose ACCs are limited by their lack of diagnostics and security features.
- Smart Electrical Cables (SEC) that utilize Taurus Smart Cable Modules have all the benefits of an ACC with the added “smarts” required by Cloud Service Providers.
- Optical modules have high power consumption. A 400G module consumes around 12W and an 800G module may consume up to 20W.
- Optical modules require advanced low loss materials which are expensive.
- Optical modules have a shorter lifespan and are less reliable compared to active copper cables. Data center operators need to constantly maintain and replace the failed optical modules.
- At 50Gbps/lane, passive direct-attach copper (DAC) cables barely reach 3-meters. At 100Gbps/lane, DACs may only have a 2-meter practical reach limit.
- The switch PCB consumes too much of the channel budget, which then limits the cable reach and increases cable gauge.
- DACs are rigid, heavy, and bulky, restricting airflow for system cooling and making rack servicing difficult.
CXL® FAQ
The CXL 3.0 specification doubles the bandwidth to 64 GT/s while enabling additional usage models beyond the CXL 2.0 specification through introduction of advanced fabric capabilities such as the following:
- Global Fabric Attached Memory (GFAM)
- Enhanced link level integrity and data encryption (CXL IDE) for 256B flits
- Improved resource utilization for composable disaggregated infrastructure through multi-level switching, multi-headed devices, and multiple type1/type2 devices per root port
CXL 2.0 supports Integrity and Data Encryption (IDE) and key exchange protocols to provide end-to-end protection of data on the CXL link.
CXL 2.0 adds support for switching, persistent memory, and security as well as memory pooling support to maximize memory utilization, reducing or eliminating the need to over-provision memory.
CXL runs on PCIe® 5.0 electrical signaling, uses the PCIe PHY, and natively supports x16, x8, and x4 link widths.
- CXL.io is used for initialization, link-up, device discovery and enumeration, and register access. It provides a non-coherent load/store interface for I/O devices similar to PCIe® 5.0.
- CXL.cache defines interactions between a Host and Device, which allows CXL devices to cache host memory with low latency.
- CXL.mem provides a Host processor with direct access to Device-attached memory using load/store commands.
- Memory tiering in which additional capacity is applied with a variable mix of lower-latency direct-attached memory and higher-latency large capacity memory
- Higher VM density per system by having more memory capacity attached
- Large databases can use a caching layer provided by SCM to improve performance
The CXL protocol supports three different types of devices:
- Type 1 Caching Devices / Accelerators
- Type 2 Accelerators with Memory
- Type 3 Memory Buffer
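Each device type uses a different mix of the CXL.io, CXL.cache, and CXL.mem protocols described above; a small summary sketch in Python (mapping per the CXL specification):

```python
# Protocol mix used by each CXL device type (per the CXL specification).
CXL_DEVICE_TYPES = {
    "Type 1 (caching device / accelerator)": ["CXL.io", "CXL.cache"],
    "Type 2 (accelerator with memory)":      ["CXL.io", "CXL.cache", "CXL.mem"],
    "Type 3 (memory buffer / expander)":     ["CXL.io", "CXL.mem"],
}

for device, protocols in CXL_DEVICE_TYPES.items():
    print(f"{device}: {', '.join(protocols)}")
```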
Compute Express Link (CXL) is an open industry standard interconnect offering high-bandwidth, low-latency connectivity between the host processor and devices including accelerators, memory expansion, and smart I/O devices. CXL utilizes the PCIe® 5.0 physical layer infrastructure and the PCIe alternate protocol to address the demanding needs of high-performance computational workloads in Artificial Intelligence, Machine Learning, communication systems, and HPC through the enablement of coherency and memory semantics across heterogeneous processing and memory systems.
Traditional DRAM and persistent storage class memory (SCM) are supported, allowing for flexibility between performance and cost.
CXL is needed to overcome CPU-memory and memory-storage bottlenecks faced by computer architects. Future data centers need heterogeneous compute, new memory and storage hierarchy, and an agnostic interconnect to tie it all together. CXL maintains memory coherency between the processor memory space and memory on attached devices to enable pooling and sharing of resources to provide higher performance, reduce software stack complexity, and lower overall system cost.
Ordering FAQ
Please review the Astera Labs Terms of Sale.
To ensure a consistent supply to meet our customers’ high-volume demands, Astera Labs implements multi-vendor and multi-site manufacturing. This approach gives us a strong business continuity/contingency plan in case of catastrophic events (e.g., earthquake, tsunami, flood, fire, etc.) to ensure or recover supply quickly.
Purpose-built Retimer ICs, Riser Cards, Extender Cards, and Booster Cards for High-performance Server, Storage, Cloud, and Workload-Optimized Systems
Customers can order directly from Astera Labs, or can order from one of our franchised partners, which currently include Mouser, EDOM, Eastronics, and Intron.
PCIe® FAQ
The largest challenge will be handling higher error rates. To address this, the PCIe 6.0 standard will also begin to implement Forward Error Correction (FEC).
PAM4 stands for Pulse Amplitude Modulation Level 4 and is a type of signaling that carries 2 bits (00, 01, 10, or 11) at a time instead of the 1 bit (0 or 1) used in previous PCIe generations.
The PCIe 5.0 specification introduces selectable Precoding. Precoding breaks an error burst into two errors: an entry error and an exit error. However, a random single-bit error would also be converted to two errors, and therefore a net 1E-12 BER with precoding disabled would effectively become 2E-12 BER with precoding enabled.
The enabling/disabling of Precoding is negotiated during link training. Whether Precoding is needed or not is largely dependent on the specific receiver implementation. As an example, receivers that rely heavily on DFE tap-1 may choose to request Precoding during link training. So, each receiver will make its own determination, based on the receiver architecture, as to whether it should request Precoding or not. Precoding is defined in the PCIe 5.0 specification but not in the PCIe 4.0 specification.
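To illustrate the entry/exit behavior described above, here is a minimal sketch of a differential (1/(1+D) mod 2) precoder and decoder in Python; it is a conceptual demo, not the specification’s exact encoder implementation:

```python
def precode(bits):
    """Differential precoder: y[n] = x[n] XOR y[n-1]."""
    out, prev = [], 0
    for b in bits:
        prev = b ^ prev
        out.append(prev)
    return out

def decode(bits):
    """Receiver decoder: x[n] = y[n] XOR y[n-1]."""
    out, prev = [], 0
    for b in bits:
        out.append(b ^ prev)
        prev = b
    return out

data = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
tx = precode(data)

# A DFE-induced burst flips a contiguous run of received bits (positions 3..6).
rx = [b ^ 1 if 3 <= i <= 6 else b for i, b in enumerate(tx)]

recovered = decode(rx)
errors = [i for i, (a, b) in enumerate(zip(data, recovered)) if a != b]
print("error positions after decoding:", errors)  # exactly two: burst entry and exit
```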
Passing TX compliance and RX BER test does not guarantee system-level interoperability. It is advisable to perform separate tests to exercise the LTSSM, as well as application-specific tests, such as hot unplug/hot plug, to demonstrate system-level robustness.
33 GHz for the PCIe 5.0 TX test. See more from PCIe 5.0 PHY Test Spec v0.5.
The Lane Margin Test (LMT) is defined in PCIe 5.0 PHY Test Spec v0.5, and RX Lane Margining in time and voltage is required for all PCIe 5.0 receivers. However, according to the test specification, LMT checks whether the add-in card under test implements the lane margining capability. The margin values reported are not checked against any pre-defined pass/fail criteria.
At this moment, these are not specified in the PCIe 5.0 PHY Test Spec v0.5.
No, there is no difference.
Test methodology is similar to that of CEM 4.0. See details from the PCIe 5.0 PHY Test Spec v0.5.
There is no industry-standard definition of mid-loss, low-loss, and ultra-low-loss. It is good practice to start from the loss budget analysis to select which type of PCB material is needed for the system. Megtron-6 or other types of PCB material with similar performance as that of Megtron-6 are commonly used in PCIe 5.0 server systems where the distance from Root Complex pin to CEM connector exceeds 10″.
There are multiple connector types and form factors in development, which are targeting PCIe 5.0 signal speeds, including: M.2, U.2, U.3, mezzanine connectors, and more.
PCI-SIG defines the specifications, but not a tool for the purpose of interoperability testing. ASIC vendors and OEMs/ODMs generally provide/have these tools, for the purpose of testing and stressing the PCIe link, to make sure there are no interoperability issues.
PCI-SIG does not publish official or “standard” channel models; however, the Electrical Workgroup (EWG) does post example channel models. For PCIe 5.0 specification, the reference package models are posted here: https://members.pcisig.com/wg/PCIe-Electrical/document/folder/885.
You can also find example pad-to-pad channel models shared by a few member companies during the specification development by searching *.s24p in the following folder https://members.pcisig.com/wg/PCIe-Electrical/document.
Burst errors are not reported any differently than regular correctable/uncorrectable errors. In fact, burst errors may cause silent data corruption, meaning multiple bits in error can lead to an undetected error event. Therefore, it is incumbent on system designers and PCIe component providers to consciously enable Precoding if there is a concern or risk of burst errors in a system.
PCIe 5.0 architecture, like PCIe 4.0 and 3.0 architectures, supports two clock architectures:
- Common REFCLK (CC): The same 100-MHz reference clock source is distributed to all components in the PCIe link — Root Complex, Retimer, and Endpoint. Due to REFCLK distribution via PCB routing, fanout buffers, cables, etc., the phase of the REFCLK will be different for all components.
- Independent REFCLK (IR): Both the Root Complex and Endpoint use independent reference clocks, and the Tx and Rx must meet more stringent specifications when operating in IR mode than in CC mode. The PCIe Base Specification does not specify the properties of independent reference clocks.
In an add-in-card topology, even after upgrading to an expensive ultra-low-loss PCB material, only about 16 dB of budget remains for the system board once safety margin is reserved for board loss variation due to temperature and humidity, which is equivalent to roughly 8 inches of trace length. However, the reach requirements in complex topologies can easily exceed 8 inches.
- As the PCB temperature rises, the insertion loss (IL) of the PCB trace becomes higher
- Process fluctuation during PCB manufacturing can result in slightly narrower or wider line widths, which can lead to fluctuations in IL
- The amplitude of the Nyquist frequency signal (16-GHz sine wave in the case of 32 GT/s NRZ signaling) at the source side is 800 mV pk-pk, which will reduce to about 12.7 mV after 36 dB of attenuation. This underscores the need to leave some IL margin for the receiver to account for reflections, crosstalk, and power supply noise that all potentially will degrade the SNR.
Thus, the IL budget reserved for the PCB trace on the system base board should be 16 dB minus some amount of margin, which is reserved for the above factors. Many hardware engineers and system designers tend to leave 10-20% of the overall channel IL budget as margin for such factors. In the case of a 36-dB budget, this amounts to 4-7 dB.
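The attenuation and margin figures above follow from basic dB arithmetic; a quick sketch in Python:

```python
# Amplitude after attenuation: V_out = V_in * 10^(-loss_dB / 20)
tx_swing_mv = 800
channel_loss_db = 36
rx_amplitude_mv = tx_swing_mv * 10 ** (-channel_loss_db / 20)
print(f"Nyquist amplitude after {channel_loss_db} dB: {rx_amplitude_mv:.1f} mV")  # ~12.7 mV

# Reserving 10-20% of the 36-dB end-to-end budget as margin (roughly 4-7 dB)
margin_low, margin_high = 0.10 * channel_loss_db, 0.20 * channel_loss_db
print(f"Margin to reserve: {margin_low:.1f} to {margin_high:.1f} dB")
```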
The main independent variable in PCIe Link simulations is the Transmitter Preset, a pre-defined combination of pre-shoot and de-emphasis; 10 such Presets are defined in the PCIe specification.
View Signal Integrity Challenges for PCIe 5.0 OCP Topologies Video >
PCIe 6.0 will adopt PAM4 signaling instead of NRZ used in previous generations to achieve 64GT/s. However, it will remain fully backwards compatible with PCIe 1.0 through PCIe 5.0. Please see our industry news sections for more resources on PCIe 6.0.
By leveraging advanced PCB materials and/or PCIe 5.0 Retimers to ensure sufficient end-to-end design margin, system designers can ensure a smooth upgrade to PCIe 5.0 architecture.
16 dB, but the channel imperfections caused by vias, stubs, AC coupling capacitors and pads, and trace variation further reduce this budget.
View PCIe 5.0 Architecture Channel Insertion Loss Budget Video >
- CTLE & DFE: PCIe 5.0 specifies the bump-to-bump IL budget as 36 dB for 32 GT/s, and the bit error rate (BER) must be less than 1E-12. To cope with this high signal attenuation, the PCIe 5.0 standard defines the reference receiver with a continuous-time linear equalizer (CTLE) model whose adjustable DC gain (ADC) goes as low as -15 dB, whereas the 16 GT/s reference receiver only goes down to -12 dB. The reference decision feedback equalizer (DFE) model includes three taps for 32 GT/s and only two taps for 16 GT/s.
- Precoding: Due to the significant role the DFE circuit plays in the receiver’s overall equalization, burst errors are more likely to occur compared to 16 GT/s. To counteract this risk, PCIe 5.0 introduces Precoding in the protocol. After enabling precoding at the transmitter side and decoding at the receiver side, the chance of burst errors is greatly reduced, thereby enhancing the robustness of the PCIe 5.0 32 GT/s Link.
As the demand for artificial intelligence and machine learning grows, new system topologies based on PCIe 5.0 technology will be needed to deliver the required increases to data performance.
While the transition from PCIe 4.0 architecture to PCIe 5.0 architecture increases the channel insertion loss (IL) budget from 28 dB to 36 dB, there will be new design challenges around the higher losses at higher data rates. For other standards running above 30 GT/s, PAM-4 modulation is usually used to make the signal’s Nyquist frequency one-quarter of the data rate, at the cost of about 9.5 dB of signal-to-noise ratio (SNR).
However, PCIe 5.0 continues to use the non-return-to-zero (NRZ) signaling scheme, thus the Nyquist frequency of the signal is one-half of the data rate, which is 16 GHz. The higher the frequency, the greater the attenuation. The signal attenuation caused by the channel IL is the biggest challenge of PCIe 5.0 system design.
- Within a Server: CPU to GPU, CPU to Network Interface Card (NIC), CPU to Accelerator, CPU to SSD
- Within a Rack: CPU to JBOG and JBOF through board-to-board connector or cable
- Emerging GPUs-to-GPUs or Accelerators-to-Accelerators interconnects
PCIe/CXL Smart Cable Modules FAQ
Aries SCMs extend high-bandwidth PCIe 5.0 signal reach at 128 GB/s up to 7 meters to enable larger GPU clusters in a multi-rack architecture and low-latency memory fabrics for scalable cloud infrastructure.
There are two key differences between Ethernet AECs vs PCIe AECs:
- Protocol complexity: PCIe’s backwards compatibility and link training requirements make AECs more complex for PCIe compared to Ethernet.
- Interoperability: The variety of device types and ecosystem players is significantly greater for PCIe compared to Ethernet.
General-purpose AECs lack the advanced cable and fleet management features essential for managing data center infrastructure, while AECs with Aries SCMs offer system-wide visibility and management features through COSMOS that enable enhanced security, quick debug, and flexible firmware upgrades.
There are two primary applications for Aries Smart Cable Modules.
The first is to enable higher bandwidth and lower latency GPU-to-GPU and GPU-to-Switch multi-rack connectivity for larger AI clusters. As larger clusters of GPUs are deployed to address the increasing bandwidth and memory demands of AI workloads, AI infrastructure must scale GPU clusters across racks, since server racks can only accommodate a certain number of GPUs due to power and thermal management constraints. Aries SCMs extend high-bandwidth PCIe 5.0 and CXL signal reach at 128 GB/s up to 7 meters to enable larger GPU clusters in a multi-rack architecture. Also, Aries SCMs improve cable routing, serviceability, and airflow with thin copper cables to maintain existing rack power and thermal density.
The second is to enable extended CXL reach for low-latency memory fabric connectivity in high-capacity in-memory compute architectures. Hyperscalers are deploying CXL memory expansion and pooling solutions to achieve higher application performance and the distances between the processor and expanded memory resources are increasing. Aries SCMs extend high-bandwidth PCIe 5.0 and CXL signal reach at 128 GB/s up to 7 meters to enable low-latency memory fabrics for scalable cloud infrastructure.
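For context, the 128 GB/s figure corresponds to the raw bidirectional bandwidth of a x16 PCIe 5.0 link (before 128b/130b encoding and protocol overhead); a back-of-the-envelope sketch in Python:

```python
# Raw PCIe 5.0 link bandwidth, assuming a x16 link
rate_gt_s = 32                                   # PCIe 5.0 signaling rate per lane
lanes = 16
per_direction_gb_s = rate_gt_s * lanes / 8       # 64 GB/s each way
bidirectional_gb_s = 2 * per_direction_gb_s      # 128 GB/s total, both directions
print(per_direction_gb_s, bidirectional_gb_s)    # 64.0 128.0
```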
Quality FAQ
To ensure a consistent supply to meet our customers’ high-volume demands, Astera Labs implements multi-vendor and multi-site manufacturing. This approach gives us a strong business continuity/contingency plan in case of catastrophic events (e.g., earthquake, tsunami, flood, fire, etc.) to ensure or recover supply quickly.
Our goal is to provide customers with the highest quality products by assuring their performance, consistency, and reliability.
Our team values are integral to who we are and how we operate as a company.
All device qualification data, including FIT calculation, is included in the qualification summary document. Contact us or ask your Astera Labs Sales Manager for further information.
- If you need to return potentially defective material, please contact Astera Labs’ Customer Service organization.
- The Quality team will run an evaluation based upon customer-generated diagnostic logs, production test results, and PCIe system testing, and will share the results using an 8D process.
Smart Memory Controllers FAQ
In traditional servers, memory is directly connected to a specific CPU or GPU (i.e., locked behind the host) and can result in over-provisioning of memory resources when applications are not using the available memory. When the memory is over-provisioned to a specific host, the memory is now stranded and cannot be accessed by other hosts, thereby increasing data center costs. In addition, when memory is locked behind a host, the data being processed by the application needs to be copied through high latency interconnects if a different CPU or GPU needs access to the data.
Memory pooling allows multiple hosts in a heterogenous topology to access a common memory address range with each host being assigned a non-overlapping address range from the “pool” of memory resources. Memory pooling allows system integrators to dynamically allocate memory from this pool, which reduces costs by reducing stranded memory and increasing memory utilization. Memory pooling is part of a growing trend for resource disaggregation or composability for heterogeneous solutions.
Memory sharing allows multiple hosts in a heterogeneous topology to access a common memory address range with each host being assigned the same address range as the other host. This improves memory utilization similar to memory pooling, but also provides an added benefit of data flow efficiency since multiple hosts can access the same data. With memory sharing, coherency needs to be managed between the hosts to ensure data is not overwritten by another host incorrectly.
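A toy illustration of the difference between pooling and sharing, using hypothetical Python (not an Astera Labs API): pooled hosts each receive a non-overlapping slice of the pool, while sharing hosts all map the same range.

```python
# Hypothetical illustration only -- not an Astera Labs API.
POOL_SIZE_GB = 512

def pool_memory(hosts, slice_gb):
    """Memory pooling: each host gets a non-overlapping address range."""
    ranges, base = {}, 0
    for host in hosts:
        assert base + slice_gb <= POOL_SIZE_GB, "pool exhausted"
        ranges[host] = (base, base + slice_gb)   # [start, end) in GB
        base += slice_gb
    return ranges

def share_memory(hosts, region_gb):
    """Memory sharing: every host maps the same address range (coherency
    between hosts must be managed so writes do not conflict)."""
    return {host: (0, region_gb) for host in hosts}

print(pool_memory(["hostA", "hostB", "hostC"], 128))
print(share_memory(["hostA", "hostB"], 64))
```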
CXL is needed to overcome CPU-memory and memory-storage bottlenecks faced by computer architects. Future data centers need heterogeneous compute, new memory and storage hierarchy, and an agnostic interconnect to tie it all together. CXL maintains memory coherency between the processor memory space and memory on attached devices to enable pooling and sharing of resources to provide higher performance, reduce software stack complexity, and lower overall system cost.
Smart Retimer FAQ
“RAS” is the ability of the system to provide resilience starting from the underlying hardware all the way to the application software through three components collectively referred to as “RAS” features:
- Reliability: the ability of the system to detect and correct faults
- Availability: how the system guarantees uninterrupted operation with minimal degradation
- Serviceability: the ability of the system to proactively diagnose, repair, upgrade or replace components at scale
Aries Smart Cable Modules support a variety of cable gauges with reaches up to 7 meters.
Aries Smart Cable Modules support multiple form factors and cable configurations for diverse AI topologies.
Use IBIS models and time-domain simulations.
Bit error rate (BER) is the ultimate gauge of link performance, but an accurate measure of BER is not possible in relatively short, multi-million-bit simulations.
Instead, this analysis suggests the following pass/fail criteria, which consist of two rules:
- Rule 1: A link must meet the receiver’s eye height (EH) and eye width (EW) requirements
- Rule 2: A link must meet Rule 1 for at least half of the Tx Preset settings (≥5 out of 10)
- Rule 1 establishes that there is a viable set of settings that results in the desired BER. The specific EH and EW required by the receiver are implementation-dependent.
- Rule 2 ensures that the link has adequate margin and is not overly sensitive to the Tx Preset setting.
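A minimal sketch of how these two rules could be applied to a set of simulation results (the data structure and EH/EW thresholds here are hypothetical; actual thresholds are receiver-dependent):

```python
# Hypothetical post-equalization results per Tx Preset: (eye height mV, eye width ps)
results = {
    "P0": (8.1, 4.0), "P1": (5.2, 3.0), "P2": (9.4, 4.5), "P3": (7.0, 3.8),
    "P4": (6.3, 3.5), "P5": (4.8, 2.9), "P6": (10.2, 4.9), "P7": (7.7, 4.1),
    "P8": (6.9, 3.6), "P9": (5.9, 3.2),
}
EH_MIN_MV, EW_MIN_PS = 6.0, 3.13   # example thresholds; real values are Rx-specific

# Rule 1: preset passes if both EH and EW requirements are met
passing = [p for p, (eh, ew) in results.items() if eh >= EH_MIN_MV and ew >= EW_MIN_PS]

# Rule 2: at least half of the presets must pass
link_passes = len(passing) >= len(results) / 2
print(f"Presets meeting EH/EW: {passing}")
print("Link passes" if link_passes else "Link fails")
```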
View Signal Integrity Challenges for PCIe 5.0 OCP Topologies Video >
- Determine if a Retimer is needed based on different PCB materials
- Define a simulation space, and identify worst-case conditions (temperature, humidity, impedance, etc.), minimum set of parameters (e.g., Transmitter Presets)
- Define the evaluation criteria, such as minimum eye height/width
- Execute and analyze results
View Signal Integrity Challenges for PCIe 5.0 OCP Topologies Video >
Redrivers and Retimers are active components which impact the data stream: their package imposes signal attenuation, their active circuits apply boost, and (in the case of Retimers) perform clock and data recovery. As such, there is no way to truly disable these components and still have data pass through. When disabled, no data will pass through a Redriver or Retimer.
A Retimer’s transmitters and receivers, on both pseudo ports, must meet the PCIe Base Specifications. This means that a Retimer can support the full channel budget (nominally 36 dB at 16 GHz) on both sides — before and after the Retimer. Calculating the insertion loss (IL) budget should be done separately for each side of the Retimer, and channel compliance should be performed for each side as well, just as you would do for a Retimer-less Root-Complex-to-Endpoint link.
Redrivers are not defined or specified within the PCIe Base Specification, so there are no formal guidelines for using a Redriver versus using a Retimer.
A Retimer is required to have the same link width on its upstream-facing port and on its downstream-facing port. In other words, the link widths must match. A Retimer must also support down-configured link widths, but the width must always be the same on both ports.
The only notable differences are:
- As with all PCIe 6.x transmitters, the Retimer’s transmitters must support 64 GT/s precoding when requested by the link partner.
- As with all PCIe 6.x receivers, the Retimer’s receivers must support Lane Margining in both time and voltage.
Not quite. Each port of a packet switch has a full PCIe protocol stack:
Physical Layer, Data Link Layer, and Transaction Layer.
A packet switch has at least one root port and at least one non-root port.
A Retimer, by contrast, has an upstream-facing Physical Layer and a downstream-facing Physical Layer but no Data Link or Transaction Layer.
As such, a Retimer’s ports are considered pseudo ports. Because a Retimer does not have, nor does it need, these higher-logic layers, the latency through a Retimer is much smaller than the latency through a packet switch.
There are no “special” considerations. During Equalization Phase 2, the Retimer’s upstream pseudo port (USPP) and the Endpoint will simultaneously train their receivers, and they have a total of 64 ms at 64 GT/s speed (32 ms at lower speeds) to complete their Phase 2 training. During Equalization Phase 3, the same will happen with the downstream pseudo port (DSPP) and the Root Complex, and likewise a total of 64 ms at 64 GT/s speed (32 ms at lower speeds) is provided to complete Phase 3 training. The timeouts are the same regardless of whether a Retimer is present or not.
The PCIe specification defines a maximum of two Retimers that can be cascaded in a link.
There is no need to fine-tune a Retimer’s EQ settings, as the Retimer participates in Link Equalization with the Root Complex and Endpoint and automatically fine-tunes its receiver EQ.
For PCIe 6.x, 36 dB at 16 GHz for the pre-channel and 36 dB at 16 GHz for the post-channel. Based on the PCIe Base Specification, the maximum total insertion loss with one Retimer from Root Complex to End Point is 32 dB at 16 GHz, die to die.
A redriver amplifies a signal, whereas a Retimer retransmits a fresh copy of the signal.
There are generally three ways to approach this:
- Channel Loss Budget Analysis
- Simulate the channel s-parameters in the Statistical Eye Analysis Simulator (SeaSim) tool to determine if the post-equalized eye height (EH) and eye width (EW) meet the minimum eye opening requirements: ≥6 mV EH and ≥3.13 ps EW at a Bit Error Ratio (BER) ≤ 1E-6. Refer to PCIe Base Specification Section 8.5.1.
- Consider your cost threshold for system upgrades
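As an illustration of the first approach, a simple loss-budget tally in Python (the per-segment loss values are hypothetical placeholders; real numbers come from your stack-up, connector, and package models):

```python
# Hypothetical loss contributions at 16 GHz -- replace with values from your
# stack-up, connector, and package models.
BUDGET_DB = 36.0                      # PCIe 5.0 bump-to-bump budget at 16 GHz

contributions_db = {
    "Root Complex package": 4.0,
    "system board trace (10 in @ 1.0 dB/in)": 10.0,
    "CEM connector": 1.5,
    "add-in card trace (4 in @ 1.2 dB/in)": 4.8,
    "Endpoint package": 4.0,
}
margin_db = 0.15 * BUDGET_DB          # reserve ~15% for temperature/humidity/process

total_db = sum(contributions_db.values())
print(f"Total channel loss: {total_db:.1f} dB, margin reserved: {margin_db:.1f} dB")
if total_db + margin_db > BUDGET_DB:
    print("Budget exceeded -- consider lower-loss material or a Retimer")
else:
    print("Within budget")
```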
Astera Labs’ Aries Smart DSP Retimers offer exceptional robustness, ease-of-use, and a list of Fleet Management capabilities.
Astera Labs Aries PCIe Smart Retimers offer exceptional robustness, ease-of-use and a list of Fleet Management capabilities. Get more details >