What is CXL, and why should you care?

Compute Express Link can share compute and memory among components and devices, potentially leading to more efficient use of data-center resources.

If you purchase a server in the next few months featuring Intel’s Sapphire Rapids generation of Xeon Scalable processors or AMD’s Genoa generation of Epyc processors, it will come with a notable new function called Compute Express Link (CXL), an open interconnect standard you may find useful, especially in future iterations.

CXL is supported by pretty much every hardware vendor. It is built on top of PCI Express and provides coherent memory access between a CPU and a device, such as a hardware accelerator, or between a CPU and memory.

PCIe is meant for point-to-point communications such as SSD to memory, but CXL will eventually support one-to-many communication by transmitting over coherent protocols. So far, CXL is capable of simple point-to-point communication only.

CXL is currently in its 1.1 iteration, and the 2.0 and 3.0 specs have been announced. Because CXL is joined at the hip with PCIe, new versions of CXL depend on new versions of PCIe. There is about a two-year gap between PCIe releases, and then an even longer gap between the release of a new spec and products coming to market. Right now, CXL 1.1 and 2.0 devices exist only as engineering samples for testing.

CXL protocols

There are three protocols that CXL supports:

CXL.io: An enhanced version of a PCIe 5.0 protocol for initialization, device discovery, and connection to the device.

CXL.cache: This protocol defines interactions between a host and a device, allowing attached CXL devices to efficiently cache host memory with extremely low latency using a request-and-response approach.

CXL.mem: This provides a host processor with access to the memory of an attached device, covering both volatile and persistent memory architectures.
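As a rough illustration of how the three protocols combine, the sketch below models them as flags and describes what a device offers based on the protocols it supports. The class name, the two device profiles, and the wording of the descriptions are made up for this example and are not taken from the CXL specification.

```python
from enum import Flag, auto


class CxlProtocol(Flag):
    """The three CXL protocols, modeled as combinable flags."""
    IO = auto()     # CXL.io: initialization, device discovery, connection
    CACHE = auto()  # CXL.cache: device caches host memory
    MEM = auto()    # CXL.mem: host accesses device-attached memory


# Hypothetical device profiles, for illustration only.
MEMORY_MODULE = CxlProtocol.IO | CxlProtocol.MEM
CACHING_ACCELERATOR = CxlProtocol.IO | CxlProtocol.CACHE


def describe(protocols):
    """Return plain-language capabilities implied by a set of protocols."""
    capabilities = []
    if CxlProtocol.IO in protocols:
        capabilities.append("host can discover and connect to the device")
    if CxlProtocol.CACHE in protocols:
        capabilities.append("device can cache host memory")
    if CxlProtocol.MEM in protocols:
        capabilities.append("host can access the device's memory")
    return capabilities


if __name__ == "__main__":
    for name, profile in [("memory module", MEMORY_MODULE),
                          ("caching accelerator", CACHING_ACCELERATOR)]:
        print(name, "->", "; ".join(describe(profile)))
```

In this model, a memory module of the kind described in this article would rely on CXL.io plus CXL.mem, while an accelerator that only needs to cache host memory would rely on CXL.io plus CXL.cache.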

CXL.mem is the big one, starting with CXL 1.1. If a server needs more RAM, a CXL memory module in an empty PCIe 5.0 slot can provide it. There’s slightly lower performance and a little added latency, a small tradeoff for getting more memory into a server without having to buy it. Of course, you do have to buy the CXL module.

CXL 2.0 supports memory pooling, which uses memory of multiple systems rather than just one. Microsoft has said that about 50% of all VMs never touch 50% of their rented memory. CXL 2.0 could find that memory and put it to use. Microsoft said that disaggregation via CXL can achieve a 9-10% reduction in overall need for DRAM.
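To make the idea of untouched memory concrete, here is a small back-of-the-envelope sketch. The VM sizes and usage figures are invented for illustration; they are not Microsoft's data, and the calculation does not attempt to reproduce the 9-10% DRAM reduction cited above. It only shows how one might count memory that is rented but never touched, which is the memory a pool could in principle put to use.

```python
def stranded_memory(vms):
    """Summarize untouched memory for a list of (rented_gb, touched_gb) VMs."""
    total_rented = sum(rented for rented, _ in vms)
    total_untouched = sum(rented - touched for rented, touched in vms)
    # VMs that never touch at least half of the memory they rent.
    half_idle = sum(1 for rented, touched in vms if touched <= rented / 2)
    return {
        "untouched_gb": total_untouched,
        "untouched_share": total_untouched / total_rented,
        "half_idle_vm_share": half_idle / len(vms),
    }


# Hypothetical fleet: (rented GB, GB actually touched) per VM.
fleet = [(64, 20), (32, 30), (128, 60), (16, 15)]

if __name__ == "__main__":
    summary = stranded_memory(fleet)
    print(f"Untouched memory: {summary['untouched_gb']} GB "
          f"({summary['untouched_share']:.0%} of rented)")
    print(f"VMs leaving half their memory idle: "
          f"{summary['half_idle_vm_share']:.0%}")
```

In this made-up fleet, half of the VMs leave at least half of their rented memory idle. Without pooling, that memory stays locked inside individual servers.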

Eventually, CXL is expected to be an all-encompassing cache-coherent interface for connecting any number of CPUs, memory, process accelerators (notably FPGAs and GPUs), and other peripherals.

The CXL 3.0 spec, announced last week at the Flash Memory Summit (FMS), takes that disaggregation even further by allowing other parts of the architecture—processors, storage, networking, and other accelerators—to be pooled and addressed dynamically by multiple hosts and accelerators just like the memory in 2.0.

The 3.0 spec also provides for direct peer-to-peer communications over a switch or even across switch fabric, so two GPUs could theoretically talk to one another without using the network or getting the host CPU and memory involved.

Kurt Lender, co-chair of the CXL marketing work group and a senior ecosystem manager at Intel, said, “It’s going to be basically everywhere. It’s not just IT guys who are embracing it. Everyone’s embracing it. So this is going to become a standard feature in every new server in the next few years.”

So how will the applications that run in enterprise data centers benefit? Lender says most applications don’t need to change because CXL operates at the system level, but they will still get the benefits of CXL functionality. For example, in-memory databases could take advantage of the memory pooling, he said.

Component pooling could help provide the resources needed for AI. With CPUs, GPUs, FPGAs, and network ports all being pooled, entire data centers might be made to behave like a single system.

But let’s not get ahead of ourselves. We’re still waiting for CXL 2.0 products, but demos at the recent FMS show indicate they are getting close.

Copyright © 2022 IDG Communications, Inc.