
The Importance of Security Features in a CXL Memory Controller to Protect Mission-Critical Cloud Data

February 23, 2023 by Amy Thomas

The explosion of modern applications such as Artificial Intelligence, Machine Learning, and Deep Learning is changing the very nature of computing and transforming businesses. These applications have opened myriad ways for companies to improve their business development processes, operations, and security, and to provide better customer experiences. To support these applications, platforms are being designed around SoCs that can process large data sets in cloud data centers, provide specialized processing power for these use cases, enable customized solutions, and scale with the market. The market size of AI was valued at $65.48 billion in 2020 and is projected to reach $1,581.70 billion by 2030, growing at a CAGR of 38.0% from 2021 to 2030, according to a recent report by Allied Market Research.

With this exponential growth come rising concerns about the security of these platforms running mission-critical applications in emerging markets such as healthcare, automotive, and data analytics. Security is a major factor contributing to both the complexity and the cost of developing and maintaining these systems. In fact, per the 2022 report published by IBM Security, the global average total cost of a data breach reached an all-time high of USD 4.35 million in 2022. What’s even more concerning is that it took an average of 207 days to identify a breach and another 70 days to contain it.

The threats posed by malicious actors who can breach these platforms have grown significantly more sophisticated over the past decade, and they must be addressed with concrete security measures at the hardware, software, and protocol levels on platform SoCs.

 

Figure 1: Average total cost of a data breach in USD millions (Source: IBM Security, 2022)

 

Security Threats Plaguing Cloud-Centric SoCs

The challenge for chipmakers is not only to develop high performance SoCs for cloud applications, but also to have features that can counter the sophistication of the threat vectors to secure confidential and sensitive assets on the platforms. The big question companies now ask is how much security is enough and, even if a device starts out with its security intact, will it remain secure throughout its lifetime.

This became evident with the recently discovered Meltdown, Spectre, and Foreshadow vulnerabilities, which exploit speculative execution and branch prediction. Through such incidents, we have learned how attackers use sophisticated attack vectors to breach a system. These attack vectors include:

  • Attacks on data-in-transit, such as intrusion attacks using sniffing devices, which can lead to data leakage or alteration of code or data transmitted on high-speed links.
  • Attacks on data-in-use, such as side-channel attacks, including Electromagnetic (EM) attacks and Differential Power Analysis (DPA), where information leaked during code execution is exploited to alter device behavior.
  • Attacks on data-at-rest, such as availability-related threats, including disruption or Denial-of-Service (DoS) attacks against data stored in systems.

 

Figure 2: Security attack vectors on data

 

As these attacks become more sophisticated, next-generation interconnect standards such as Compute Express Link™ (CXL™) are continuously adapting to protect against them, defining stronger security protocols that provide confidentiality through Integrity and Data Encryption (IDE) mechanisms for data transiting a CXL link.

 

Security Features Needed in a CXL-based Memory Controller

Cloud-based applications such as AI and ML require SoCs that can increase memory bandwidth to unlock the performance required for next-generation data centers. Compute Express Link™ is an open standard developed to provide a high-speed, low-latency, cache-coherent interconnect for processors, accelerators, and memory expansion. CXL Type-3 memory controllers can provide a cost-effective, high-performance solution to expand memory bandwidth and capacity. Additionally, to protect against the attacks described earlier, a CXL Type-3 device also needs to implement security features using the cryptographic techniques defined in the CXL 2.0 specification, as well as other industry-standard data-encryption and authentication mechanisms. The following sections describe some of the important security features that a CXL-based memory controller needs to implement to protect sensitive assets in a data center.

 

CXL 2.0 IDE

Considering modern threat vectors, the CXL Consortium, in close collaboration with other industry-standard bodies such as PCI-SIG and the Distributed Management Task Force (DMTF), incorporated Integrity and Data Encryption (IDE) features into the CXL 2.0 specification. IDE is designed to provide confidentiality, integrity, and replay protection at the flit (flow control unit) level. It defines a Message Authentication Code (MAC) that protects against attacks such as interception of packets on point-to-point CXL links. While security is an essential requirement, system designers must also consider the performance needs of their systems when enabling IDE. To balance performance against security, the CXL 2.0 specification defines two IDE modes:

  • Containment Mode, where data is released for further processing only after the integrity check passes. This mode impacts both latency and bandwidth; the bandwidth impact comes from the fact that the integrity value is sent quite frequently.
  • Skid Mode, where data is released for further processing without waiting for the integrity value to be received and checked. This allows less frequent transmission of the integrity value, yielding near-zero latency impact and very low bandwidth overhead.
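The trade-off between the two modes comes down to how often integrity values consume link flits. A toy model makes this concrete; the MAC intervals below are illustrative assumptions, not values from the CXL 2.0 specification:

```python
# Toy model of the bandwidth overhead of IDE integrity (MAC) flits.
# The MAC-interval values are illustrative assumptions, not spec values.

def mac_overhead(flits_per_mac: int, mac_flits: int = 1) -> float:
    """Fraction of link flits consumed by integrity values when one MAC
    flit follows every `flits_per_mac` data flits."""
    return mac_flits / (flits_per_mac + mac_flits)

# Containment mode: MAC sent frequently (assumed every 5 data flits) so
# data can be held until each integrity check passes.
containment = mac_overhead(flits_per_mac=5)

# Skid mode: data is released immediately, so the MAC can be sent far
# less often (assumed every 128 data flits).
skid = mac_overhead(flits_per_mac=128)

print(f"containment-mode overhead: {containment:.1%}")  # ~16.7%
print(f"skid-mode overhead:        {skid:.1%}")         # ~0.8%
```

Even with these made-up intervals, the model shows why skid mode is the choice when bandwidth and latency dominate, and containment mode when strict data containment is required.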

 

Hardware-based Root of Trust (RoT)

An immutable hardware-based Root of Trust (RoT) provides an entity that can be trusted to always behave in the expected manner and is the foundation upon which all further security layers are built. To ensure that every layer involved in device operation is secure, it is imperative to extend the circle of trust from the hardware-based RoT to every component that stores firmware and configuration settings used by the device.

 

Secure Boot

Extending security from the RoT requires a secure boot mechanism that verifies the integrity of every piece of code loaded on the device before it is allowed to execute. The secure boot process uses an asymmetric private-public key pair: the private key is used with its corresponding public key in a cryptographic algorithm to compute and verify digital signatures. The private key is uniquely associated with the owner, is never made public, and is used to generate the digital signature of the data. The public key is used to verify a digital signature that was created with the corresponding private key; since the public key is not a device secret, it can be published. The immutable RoT enforces authentication of the next-stage mutable bootloader by checking that the code carries a valid signature from an approved signer. Secure boot succeeds if this integrity check passes and fails if it does not, and the process is repeated for every subsequent layer of firmware.
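The signature check at the heart of this chain can be sketched with a toy "textbook RSA" key. The tiny numbers and hand-rolled math below are purely illustrative; production secure boot uses hardware-backed, standardized algorithms (e.g. RSA-3072 or ECDSA) with proper padding, never code like this:

```python
import hashlib

# Toy "textbook RSA" sketch of a secure-boot signature check.
# Key values are illustrative only.
p, q = 61, 53
n = p * q          # modulus (3233)
e = 17             # public exponent (shipped with the device RoT)
d = 2753           # private exponent (held only by the firmware signer)

def digest(code: bytes) -> int:
    # Hash the firmware image, reduced into the RSA modulus range.
    return int.from_bytes(hashlib.sha256(code).digest(), "big") % n

def sign(code: bytes) -> int:
    # Signer side: apply the private key to the digest.
    return pow(digest(code), d, n)

def verify(code: bytes, signature: int) -> bool:
    # Device RoT side: recover the digest with the public key, compare.
    return pow(signature, e, n) == digest(code)

firmware = b"stage-2 bootloader image"
sig = sign(firmware)

print(verify(firmware, sig))     # valid signature: boot proceeds
print(verify(b"tampered", sig))  # digest mismatch: boot halts
```

The same verify-then-execute step repeats at each firmware layer, which is how trust is chained outward from the immutable RoT.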

 

Memory Encryption

Memory encryption is an important feature for a CXL-based memory controller, since it interfaces with off-chip memory devices to enable memory expansion, pooling, and sharing. Encrypting memory is one of the most reliable techniques for preventing data from being accessed across different guests/domains/zones/realms.
AES-XTS is the de facto cryptographic algorithm for protecting the confidentiality of data-at-rest on storage devices. It is a standards-based symmetric algorithm defined by the NIST SP 800-38E and IEEE Std 1619-2018 specifications. Advanced memory encryption technologies also involve integrity and protocol-level anti-replay techniques for high-end use cases. DRAM inline cipher engines protect data-in-use for secure memory transactions at high data rates between hosts and attached memory. With memory encryption in place, even if any of the isolation techniques are compromised, the data being accessed is still protected by cryptography, which also prevents physical attacks such as hardware bus probing on the memory interface.
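A key property of XTS-style encryption in this setting is that the memory address acts as a tweak, so identical plaintext stored at two different addresses yields different ciphertext. The sketch below illustrates that property only; a SHA-256 keystream stands in for AES, and the key, addresses, and data are hypothetical. Real controllers use AES-XTS (NIST SP 800-38E) in dedicated inline cipher engines:

```python
import hashlib

# Conceptual sketch of *tweakable* memory encryption: the address is
# mixed into the keystream, so the same plaintext at two addresses
# encrypts differently. Toy construction, not AES-XTS itself.

def keystream(key: bytes, address: int, length: int) -> bytes:
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(
            key + address.to_bytes(8, "big") + counter.to_bytes(4, "big")
        ).digest()
        counter += 1
    return out[:length]

def encrypt_block(key: bytes, address: int, plaintext: bytes) -> bytes:
    ks = keystream(key, address, len(plaintext))
    return bytes(a ^ b for a, b in zip(plaintext, ks))

key = b"device-unique-key"     # hypothetical per-device key
data = b"secret cache line"

c0 = encrypt_block(key, 0x1000, data)
c1 = encrypt_block(key, 0x2000, data)
assert c0 != c1                                # address tweak differs
assert encrypt_block(key, 0x1000, c0) == data  # XOR cipher: symmetric
```

This address-as-tweak behavior is what defeats block-relocation and known-plaintext comparisons across memory locations, even before integrity and anti-replay protections are layered on top.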

 

Conclusion: Leo Memory Connectivity Platform Provides End-to-End Security

Security is essential for high-performance CXL interconnects to protect private and sensitive user information transmitted on the links. The Leo Memory Connectivity Platform provides a complete set of end-to-end security features to protect mission-critical user data. Leo provides Integrity and Data Encryption (IDE) for data transiting a CXL link, along with additional security features to ensure modern cloud-based systems and valuable user data are protected. These security measures apply to a wide variety of use models, offer broad interoperability, and align with industry best practices.

 

 

To learn more about the Leo Memory Connectivity Platform, please visit www.AsteraLabs.com/Leo.

References

  • CXL 2.0: IDE for CXL.cache, CXL.mem protocols
  • Distributed Management Task Force
  • Open Compute Platform Specification


Connectivity Is Key to Harnessing the Data Reshaping Our World

November 10, 2021 by Susan Nayak

by Jitendra Mohan

We are surrounded by data, from images captured on our mobile phones and videos streamed online to elaborate sensor fusion data required for autonomous driving. As shown in Figure 1, it is estimated that in 2020, 500 hours of video were uploaded to YouTube every minute and >400,000 hours of video were streamed every minute from Netflix alone.

Figure 1: A snapshot of content generated in one minute on various websites and applications (Source: Visual Capitalist via Statista)

Every facet of our life, especially in this post-COVID-19 world, from shopping online to interacting with friends and family over social media, is made possible by data, and the amount of data we generate is only expected to increase dramatically. In fact, the total amount of data is expected to grow to 175 zettabytes (1 ZB = 1 trillion gigabytes) by 2025, up from 33 ZB in 2018 (Figure 2).

Indeed, our world is going through a digital transformation with the prevalence of Big Data as we monitor and digitize everything and systematically extract information from raw data to gain valuable insights.

Figure 2: Amount of data expected to be used through the year 2025 (Source: Data Age 2025, sponsored by Seagate with data from IDC Global DataSphere, Nov 2018)

Big Data Lives in the Cloud

Along with the explosion of data creation, there is an accompanying shift in how data is stored and analyzed. While data creation continues to happen at the endpoints, at the edge, and increasingly in the cloud, a disproportionate amount of this data is stored in the cloud. As shown in Figure 3, IDC estimates that by 2025, nearly 50% of worldwide data will be stored in public clouds. Public clouds will also account for nearly 60% of worldwide server deployments, providing the compute power necessary to process all the data stored there. Cloud Service Providers (CSPs) operate large data centers that excel at storing and managing big data while providing storage and computing on demand to end users. Gartner estimates that by next year (2022), public clouds will be essential for 90% of data and analytics innovation.

Figure 3: Data is increasingly being stored in Public Clouds (Source: Data Age 2025, sponsored by Seagate with data from IDC Global DataSphere, Nov 2018)

AI/ML Complexity Doubles Every 3.4 Months

Artificial Intelligence (AI) workloads running in the cloud bring together data storage and data analytics to address business problems that would otherwise be impossible to tackle.

Gartner estimates that by 2025, 75% of enterprises will operationalize AI to provide insights and predictions in complex business situations. This proliferation of AI requires that the underlying models be quick, reliable, and accurate. Machine Learning (ML), especially Deep Learning (DL), combines AI algorithms, big training data, and purpose-built heterogeneous compute hardware to handle extremely large and constantly evolving datasets.

While ML has been around for decades, over the last eight to ten years ML model complexity has far outpaced the Moore’s-law-bound advancements possible in a single compute node. Figure 4 depicts a historical compilation of the compute requirements for training AI systems, showing that ML complexity is doubling approximately every 3.4 months, compared with Moore’s law’s doubling of transistor count in ICs every two years!
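The gap compounds quickly. Using the doubling periods cited above, a two-year window yields roughly a 130x growth in ML training compute against Moore's law's 2x:

```python
# Compare compute growth implied by a 3.4-month doubling time (the
# OpenAI trend cited above) with Moore's-law doubling every 24 months,
# over a 2-year window.

months = 24
ml_growth = 2 ** (months / 3.4)      # ML training compute, ~133x
moore_growth = 2 ** (months / 24)    # transistor count, 2x

print(f"ML compute:  ~{ml_growth:.0f}x in {months} months")
print(f"Moore's law:  {moore_growth:.0f}x in {months} months")
```

This two-orders-of-magnitude-per-few-years mismatch is why single-node scaling cannot keep up, motivating the distributed approaches discussed next.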

Not surprisingly, the industry is investing considerable resources and effort in distributing ML workloads over multiple compute units within a server and across multiple servers in a large data center.

Figure 4: ML complexity growing exponentially in the Modern Era far outpacing Moore’s Law (Source: OpenAI) 

800G Ethernet and Compute Express Link (CXL) Enable Distributed ML

Distributing ML workloads across multiple compute nodes is a challenging problem. Running massively parallel distributed training over state-of-the-art interconnects from just a few years ago exposes the connectivity bottleneck: as much as 90% of wall time is spent in communication and synchronization overhead. The industry has responded with advances in interconnect technology, in terms of both increased data rates and reduced latencies, to drive resource disaggregation and move large amounts of data within and across multiple servers.

Ethernet is the dominant interconnect technology to connect various servers across a data center. A typical data center networking topology connects a variety of servers over copper interconnects for North-South data traffic patterns and optical interconnects for East-West data traffic patterns. While existing 200G (8x25G) Ethernet interconnects are based on 25Gbps NRZ signaling rates, upcoming deployments in 2023 are designed for 800G (8x100G) interconnects based on 100Gbps PAM-4 signaling to quadruple the available interconnect bandwidth across servers.
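The lane arithmetic behind that quadrupling is simple: NRZ carries one bit per symbol, PAM-4 carries two. The rates below are nominal and ignore FEC and framing overhead (the actual 100G/lane line rate is 53.125 GBd):

```python
# Nominal per-lane and aggregate rates for the NRZ and PAM-4 Ethernet
# generations discussed above. NRZ = 1 bit/symbol, PAM-4 = 2 bits/symbol.
# Nominal figures; FEC/framing overhead is ignored.

def lane_rate_gbps(baud_gbd: float, bits_per_symbol: int) -> float:
    """Line rate of one lane in Gb/s."""
    return baud_gbd * bits_per_symbol

nrz_lane = lane_rate_gbps(25.0, 1)    # 25 GBd NRZ   -> 25G per lane
pam4_lane = lane_rate_gbps(50.0, 2)   # 50 GBd PAM-4 -> 100G per lane

print(f"8 x {nrz_lane:g}G NRZ lanes    -> {8 * nrz_lane:g}G")    # 200G
print(f"8 x {pam4_lane:g}G PAM-4 lanes -> {8 * pam4_lane:g}G")   # 800G
```

Doubling the bits per symbol and doubling the symbol rate together deliver the 4x jump from 200G to 800G without adding lanes.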

Unlike Ethernet, there has not been a dominant industry standard cache coherent interface for connectivity within the server. Over the last five years, several standards like CCIX, Gen-Z, OpenCAPI, NVLink, and more recently, Compute Express Link™ (CXL™) have outlined cache coherent interconnects for distributing processing of ML workloads. Of these, NVLink is a proprietary standard used largely by Nvidia devices. CXL, on the other hand, has gained wide industry adoption in a short time to emerge as the unified cache coherent server interconnect standard.

CXL defines a scalable, high-bandwidth (16x32Gbps), low-latency (nanosecond-scale), cache-coherent interconnect fabric running over the ubiquitous PCI Express® (PCIe®) interconnect. The CXL protocol, first introduced by Intel in 2019, has now established itself as the industry standard for interconnecting the various processing and memory elements within a server, as well as for enabling disaggregated, composable architectures within a rack.

CXL 1.1 defines multiple device types that implement cxl.io, cxl.cache, and cxl.mem protocols. CXL 2.0, introduced in 2020 by the CXL Consortium, adds capabilities for a switching fabric for fanout and extended support for memory pooling for increased capacity and bandwidth. CXL 3.0, currently in development, will further enable peer-to-peer connectivity and even higher throughput when combined with PCIe 6.0.

Intelligent Cloud Connectivity Solutions

Over the last three years, Astera Labs has introduced a portfolio of CXL, PCIe, and Ethernet connectivity solutions to remove performance bottlenecks created by ML workloads in complex heterogeneous topologies. Astera’s industry-leading products address the challenges of connectivity and resource sharing within and across servers. These solutions, in the form of ICs and boards, are purpose-built for the cloud and offer the industry’s highest performance, broad interoperation, deep diagnostics, and cloud-scale fleet management features.

Connectivity Solutions

  • Aries Smart Retimers: First introduced in 2019 alongside the CXL 1.1 standard, the Aries portfolio of PCIe 4.0/5.0 and CXL 2.0 Smart Retimers overcomes challenging signal integrity issues while delivering sub-10ns latency and robust interoperation.
  • Taurus Smart Cable Module™ (Taurus SCM™): Taurus SCMs enable Ethernet connectivity at 200GbE/400GbE/800GbE over a thin copper cable while providing a robust supply chain necessary for cloud-scale deployments.

Memory Accelerator Solutions

  • Leo CXL Memory Connectivity Platform: The industry’s first CXL SoC solution implementing the cxl.mem protocol, the Leo CXL Memory Connectivity Platform allows a CPU to access and manage CXL-attached DRAM and persistent memory, enabling efficient utilization of centralized memory resources at scale without impacting performance.

Figure 5: CXL 1.1 helps implement CXL.io, CXL.cache, and CXL.mem protocols (Source: CXL Consortium via Venture Beat)

Conclusions

As our appetite for creating and consuming massive amounts of data continues to grow, so too will our need for increased cloud capacity to store and analyze this data. Additionally, the server connectivity backbone for data center infrastructure needs to evolve as complex AI and ML workloads become mainstream in the cloud. Astera Labs and our expanding portfolio of CXL, PCIe, and Ethernet connectivity solutions are essential to unlock the higher bandwidth, lower latencies, and deeper system insights needed to overcome performance bottlenecks holding back data-centric systems.

