Our initial look at Intel’s Architecture Day focused on the new Xeons and IPU processors. Now we’ll get into the fine details, as well as look at other upcoming technologies.
Sapphire Rapids
Intel’s upcoming next-generation Xeon is codenamed Sapphire Rapids and promises a radical new design and gains in performance. One of its key differentiators is its modular SoC design. The chip has multiple tiles that appear to the system as a monolithic CPU, and all of the tiles communicate with each other, so every thread has full access to all resources on all tiles.
In a way it’s similar to the chiplet design AMD uses in its Epyc processor. Breaking the monolithic chip up into smaller pieces makes it easier to manufacture.
In addition to faster/wider cores and interconnects, Sapphire Rapids has a Last Level Cache (LLC) of up to 100MB that can be shared across all cores. It also has up to four memory controllers and eight channels of DDR5 memory, and supports next-gen Optane Persistent Memory and/or High Bandwidth Memory (HBM).
Sapphire Rapids also offers Intel Ultra Path Interconnect (UPI) 2.0, a CPU interconnect used for multi-socket communication. UPI 2.0 features four UPI links per processor with 16GT/s of throughput and supports up to eight sockets.
With so much new, high-performance technology in Sapphire Rapids, it should blow the current generation out of the water. Sapphire Rapids-generation Xeon Scalable Processors are due to arrive early next year.
Ponte Vecchio: Molto Bene
Intel is determined to get into the GPU business and is not letting Nvidia’s dominance deter it. The Xe architecture is its third attempt after the disastrous Larrabee and Xeon Phi, and on paper, this one looks like it has a chance.
Like Nvidia and AMD, Intel is making multiple GPUs for different markets. One is for PC clients and aimed squarely at gamers.
Another is Ponte Vecchio, an Xe-based GPU optimized for HPC and AI workloads.
Ponte Vecchio is an insanely complex piece of silicon, perhaps Intel’s biggest ever. Ponte Vecchio processors have more than 100 billion transistors (the new Xeon Scalable reportedly has 300 million) and use five different process nodes and multiple foundries to make their specialized tiles.
All told, a Ponte Vecchio SoC has 47 active tiles, including compute tiles, an HPC-oriented specialized cache called Rambo, HBM, Xe Link, and tiles for EMIB, a specialized high-speed interconnect. It uses a 3D packaging architecture called Foveros and specialized multi-tile packaging. What’s interesting is that some of the tiles are made by TSMC while others are made by Intel.
For the first time, Intel revealed initial performance data. It claims Ponte Vecchio silicon delivers 45 TFLOPS of FP32 throughput, which is vital for AI training, along with more than 5TBps of memory fabric bandwidth and more than 2TBps of connectivity bandwidth. By contrast, Nvidia’s new Ampere architecture offers peak FP32 performance of 19.5 TFLOPS.
Looks like Nvidia CEO Jensen Huang poked the bear once too often.
Like Sapphire Rapids, Ponte Vecchio will be released next year.