I’ve long felt Japan has been severely overlooked in recent years because of its two “lost decades” and China overshadowing it, and supercomputing is no exception.
In 2011, Fujitsu launched the K computer at the RIKEN Advanced Institute for Computational Science campus in Kobe, Japan. Calling it a computer really is a misnomer, though, as is the case with any supercomputer these days. When I think “computer,” I think of the 3-foot-tall black tower a few feet from me making the room warm. In the case of K, it’s rows and rows of cabinets stuffed with rack-mounted servers in a space the size of a basketball court.
With its distributed memory architecture, K had 88,128 eight-core SPARC64 VIIIfx processors in 864 cabinets. Fujitsu was a licensee of Sun Microsystems’ (later Oracle’s) SPARC processor and did some impressive work on the processor on its own. When it launched in 2011, K was ranked the world’s fastest supercomputer on the TOP500 list, at a computation speed of over 8 petaflops. It has since been surpassed by supercomputers from the U.S. and China.
Fujitsu sets a new course with an ARMv8-A processor
Supercomputers have a shelf life of three to five years, so it’s time for a replacement. With the demise of the SPARC processor, Fujitsu has decided to chart a new course. Last week at the Hot Chips conference, it announced publication of specifications for the A64FX CPU, an ARMv8-A processor that will be used in the post-K computer, which Fujitsu and RIKEN hope will be 100 times faster than K.
A64FX will be the first CPU to adopt the Scalable Vector Extension (SVE), an extension of the ARMv8-A instruction set architecture for supercomputers. Fujitsu worked with ARM, which is owned by SoftBank, another Japanese firm, to develop the A64FX.
The CPUs will be directly connected by the proprietary Tofu (Torus Fusion) interconnect developed for the K computer.
Tofu is designed to improve parallel performance using a six-dimensional mesh/torus topology, providing scalability to more than 100,000 nodes and full-duplex links with a peak bandwidth of 10 GB/sec.
Each processor can deliver peak double-precision (64-bit) floating-point performance of over 2.7 TFLOPS, or twice that for single precision (32-bit). Traditional supercomputing workloads require double precision, while artificial intelligence (AI) can get by with single precision, so the A64FX should be well suited to both.
It’s a beast of a chip, too, with two 512-bit SIMD pipes per core (comparable to a Xeon or Epyc processor), support for HBM2 memory, and 48 cores per chip plus four assistant cores, all connected by the Tofu interconnect. Memory bandwidth is expected to top 1 TB/sec.
All told, Fujitsu expects the chip to be 2.5 times faster for HPC and AI than the previous-generation SPARC chip it made, the SPARC64 XIfx.
Improved power efficiency
At the same time, Fujitsu is aiming for better power efficiency through an Energy Monitor and Energy Analyzer, which monitor performance at the chip level so adjustments can be made accordingly. The “Power Knob” can change the hardware configuration as needed to manage power use.
The big question is whether other customers will be able to buy it. Fujitsu said it would contribute to the ARM ecosystem, but will anyone outside Japan be able to buy a post-K computer for themselves? Japan likes to keep its best toys for itself. We’ll see. The post-K computer is expected to launch around 2021.
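For readers who want to sanity-check the per-chip figures quoted above, here is a rough back-of-the-envelope sketch in Python. The core count, pipe count and SIMD width come from the specifications cited in this article. The clock speed is purely my assumption, since no clock is given here, and so is the idea that each SIMD lane completes one fused multiply-add (two floating-point operations) per cycle. The function name `peak_flops` is made up for illustration.

```python
# Back-of-the-envelope peak FLOPS estimate for a SIMD CPU.
# Values marked "from the article" are quoted in the text above.
# Values marked "ASSUMPTION" are illustrative guesses, not published specs.

def peak_flops(cores, pipes_per_core, simd_bits, element_bits,
               clock_hz, flops_per_lane=2):
    """Theoretical peak floating-point operations per second.

    flops_per_lane=2 is an ASSUMPTION: one fused multiply-add
    (a multiply plus an add) per lane per cycle.
    """
    lanes = simd_bits // element_bits  # elements processed per pipe per cycle
    return cores * pipes_per_core * lanes * flops_per_lane * clock_hz

CORES = 48                        # compute cores per chip (from the article)
PIPES_PER_CORE = 2                # SIMD pipes per core (from the article)
SIMD_BITS = 512                   # SIMD width in bits (from the article)
ASSUMED_CLOCK_HZ = 1_800_000_000  # ASSUMPTION: hypothetical 1.8 GHz clock

dp = peak_flops(CORES, PIPES_PER_CORE, SIMD_BITS, 64, ASSUMED_CLOCK_HZ)
sp = peak_flops(CORES, PIPES_PER_CORE, SIMD_BITS, 32, ASSUMED_CLOCK_HZ)

print(f"double precision: {dp / 1e12:.2f} TFLOPS")
print(f"single precision: {sp / 1e12:.2f} TFLOPS")
```

Under those assumptions the double-precision estimate lands a little above 2.7 TFLOPS, and the single-precision number is exactly double because twice as many 32-bit values fit in a 512-bit register. Change the assumed clock and the totals scale linearly, so treat this as a plausibility check rather than a spec.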