Xilinx has launched a new FPGA card, the Alveo U50, that it claims can match the performance of a GPU in areas of artificial intelligence (AI) and machine learning.\nThe company claims the card is the industry\u2019s first low-profile adaptable accelerator with PCIe Gen 4 support, which offers double the throughput over PCIe Gen3. It was finalized in 2017, but cards and motherboards to support it have been slow to come to market.\nThe Alveo U50 provides customers with a programmable low-profile and low-power accelerator platform built for scale-out architectures and domain-specific acceleration of any server deployment, on premises, in the cloud, and at the edge.\n\nXilinx claims the Alveo U50 delivers 10 to 20 times improvements in throughput and latency as compared to a CPU. One thing's for sure, it beats the competition on power draw. It has a 75 watt power envelope, which is comparable to a desktop CPU and vastly better than a Xeon or GPU.\nFor accelerated networking and storage workloads, the U50 card helps developers identify and eliminate latency and data movement bottlenecks by moving compute closer to the data.\n Xilinx\n\nXilinx Alveo U50\n\n\nThe Alveo U50 card is the first in the Alveo portfolio to be packaged in a half-height, half-length form factor. It runs the Xilinx UltraScale+ FPGA architecture, features high-bandwidth memory (HBM2), 100 gigabits per second (100 Gbps) networking connectivity, and support for the PCIe Gen 4 and CCIX interconnects. Thanks to the 8GB of HBM2 memory, data transfer speeds can reach 400Gbps. It also supports NVMe-over-Fabric for high-speed SSD transfers.\nThat\u2019s a lot of performance packed into a small card.\nWhat the Xilinx Alveo U50 can do\nXilinx is making some big boasts about Alveo U50's\u00a0capabilities:\n\nDeep learning inference acceleration (speech translation): delivers up to 25x lower latency, 10x higher throughput, and significantly improved power efficiency per node compared to GPU-only for speech translation performance.\nData analytics acceleration (database query): running the TPC-H Query benchmark, Alveo U50 delivers 4x higher throughput per hour and reduced operational costs by 3x compared to in-memory CPU.\nComputational storage acceleration (compression): delivers 20x more compression\/decompression throughput, faster Hadoop and big data analytics, and over 30% lower cost per node compared to CPU-only nodes.\nNetwork acceleration (electronic trading): delivers 20x lower latency and sub-500ns trading time compared to CPU-only latency of 10us.\nFinancial modeling (grid computing): running the Monte Carlo simulation, Alveo U50 delivers 7x greater power efficiency compared to GPU-only performance for a faster time to insight, deterministic latency and reduced operational costs.\n\nThe Alveo U50 is sampling now with OEM system qualifications in process. General availability is slated for fall 2019.