The speed of Meta's Research SuperCluster would dwarf that of the current world's fastest supercomputer. Credit: Gerd Altmann

Facebook's parent company Meta said it is building the world's largest AI supercomputer to power machine learning and natural language processing for its metaverse project.

The new machine, called the AI Research SuperCluster (RSC), will contain 16,000 Nvidia A100 GPUs and 4,000 AMD Epyc Rome 7742 processors. It is made up of 2,000 Nvidia DGX A100 nodes, each with eight GPUs and two Epyc processors. Meta expects to complete construction this year.

RSC is already partially built, with 760 of the DGX A100 systems deployed. Meta researchers have already started using RSC to train large models in natural language processing (NLP) and computer vision, with the goal of eventually training models with trillions of parameters, according to Meta.

"Meta has developed what we believe is the world's fastest supercomputer. We're calling it RSC for AI Research SuperCluster, and it'll be complete later this year. The experiences we're building for the metaverse require enormous compute power (quintillions of operations/second!) and RSC will enable new AI models that can learn from trillions of examples, understand hundreds of languages, and more," said CEO Mark Zuckerberg in an emailed statement.

RSC is expected to hit a peak performance of 5 exaflops at mixed-precision processing (both FP16 and FP32), which would rocket it to the top of the Top500 supercomputer list, whose top-performing system hits 442 petaflops. RSC is being built in partnership with Penguin Computing, a specialist in HPC systems. Meta is not disclosing where the system is located.

"RSC will help Meta's AI researchers build new and better AI models that can learn from trillions of examples; work across hundreds of different languages; seamlessly analyze text, images, and video together; develop new augmented reality tools; and much more," Kevin Lee, a technical program manager, and Shubho Sengupta, a software engineer, both at Meta, wrote in a blog post.

"We hope RSC will help us build entirely new AI systems that can, for example, power real-time voice translations to large groups of people, each speaking a different language, so they can seamlessly collaborate on a research project or play an AR game together," they wrote.

In addition to all of that processing power, RSC also has 175 petabytes of Pure Storage FlashArray, 46 petabytes of cache storage, and 10 petabytes of Pure's object storage equipment.

RSC is estimated to be nine times faster than Meta's previous research cluster, which was made up of 22,000 of Nvidia's older-generation V100 GPUs, and 20 times faster than its current AI systems. Meta does not plan to retire the old system.

The company is focused on building learning models for automated, content-related tasks. It wanted this infrastructure in order to train models with more than a trillion parameters on data sets as large as an exabyte, with the goal of getting its arms around all the content generated on its platforms.

"By doing this, we can help advance research to perform downstream tasks such as identifying harmful content on our platforms as well as research into embodied AI and multimodal AI to help improve user experiences on our family of apps. We believe this is the first time performance, reliability, security, and privacy have been tackled at such a scale," Lee and Sengupta wrote.
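For readers who want to see how the headline numbers fit together, here is a minimal back-of-the-envelope sketch in Python. The node counts and storage figures come from the article; the per-GPU peak of 312 teraflops (Nvidia's published A100 FP16 Tensor Core figure, dense) is an assumption used only for illustration, to show how 16,000 GPUs roughly add up to the quoted 5 exaflops.

```python
# Back-of-the-envelope check of the RSC figures quoted in the article.
# The per-GPU peak (312 TFLOPS, A100 FP16 Tensor Core, dense) is an
# assumption for illustration, not a figure published by Meta.

NODES = 2_000            # Nvidia DGX A100 systems at full build-out
GPUS_PER_NODE = 8        # A100 GPUs per DGX A100 node
CPUS_PER_NODE = 2        # AMD Epyc Rome 7742 processors per node
A100_PEAK_TFLOPS = 312   # assumed mixed-precision peak per GPU

total_gpus = NODES * GPUS_PER_NODE                          # 16,000
total_cpus = NODES * CPUS_PER_NODE                          # 4,000
peak_exaflops = total_gpus * A100_PEAK_TFLOPS / 1_000_000   # ~5 exaflops

# Storage tiers cited in the article, in petabytes
storage_pb = {"FlashArray": 175, "cache": 46, "object": 10}

print(f"GPUs: {total_gpus:,}  CPUs: {total_cpus:,}")
print(f"Aggregate mixed-precision peak: ~{peak_exaflops:.1f} exaflops")
print(f"Total storage: {sum(storage_pb.values())} PB")
```

Run as written, the sketch lands at roughly 5 exaflops of aggregate mixed-precision compute and 231 PB of storage, consistent with the figures Meta has disclosed.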