SambaNova Systems, maker of dedicated AI hardware and software systems, has launched a new AI chip, the SN40L, that will be used in the company's full-stack large language model (LLM) platform, the SambaNova Suite.

First introduced in March, the SambaNova Suite uses custom processors and operating systems for AI inference and training. It's designed to be an alternative to power-hungry and expensive GPUs.

Upgrading the hardware this soon after launch calls for a big jump in performance, and there is one. The SN40L serves up to a 5 trillion parameter LLM, with 256K+ sequence lengths possible on a single system node, according to the vendor.

Each SN40L processing unit is made up of 51 billion transistors (102 billion total per package), a significant increase over the 43 billion transistors in the previous SN30 product. The SN40L also uses 64 GB of HBM, which is new to the SambaNova line and offers more than 3x greater memory bandwidth to speed data in and out of the processing cores. It has 768 GB of DDR5 per processing unit (1.5 TB total), versus 512 GB (1.0 TB) in the SN30.

SambaNova's processor differs from Nvidia's GPUs in that it is built around a reconfigurable dataflow unit (RDU), which can be reconfigured on demand, almost like an FPGA. This is helpful when enterprises start dealing with multimodal AI, where they shift between different inputs and outputs.

On the software side, SambaNova is offering what it calls a turnkey solution for generative AI. SambaNova's full AI stack includes pre-trained, open-source models such as Meta's Llama 2 LLM, which organizations can adapt with their own content to build their own internal LLMs.
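The per-package figures above follow directly from the per-unit specs, since each SN40L package pairs two processing units. A minimal sketch of that arithmetic (variable names are illustrative, not SambaNova's):

```python
# Per-package SN40L totals derived from the per-unit figures in the article.
# Each package contains two processing units.
UNITS_PER_PACKAGE = 2

transistors_per_unit = 51e9  # 51 billion transistors per processing unit
hbm_gb_per_unit = 64         # GB of HBM per processing unit
ddr5_gb_per_unit = 768       # GB of DDR5 per processing unit

transistors_per_package = transistors_per_unit * UNITS_PER_PACKAGE  # 102 billion
hbm_gb_per_package = hbm_gb_per_unit * UNITS_PER_PACKAGE            # 128 GB
ddr5_gb_per_package = ddr5_gb_per_unit * UNITS_PER_PACKAGE          # 1536 GB ≈ 1.5 TB

print(f"Transistors: {transistors_per_package / 1e9:.0f}B")
print(f"HBM:  {hbm_gb_per_package} GB")
print(f"DDR5: {ddr5_gb_per_package} GB (~{ddr5_gb_per_package / 1024:.1f} TB)")
```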
It also includes the company's SambaFlow software, which automatically analyzes and optimizes processing based on the needs of each particular task.

Dan Olds, chief research officer at Intersect360 Research, said this is a major upgrade in terms of both hardware and, just as importantly, the surrounding software stack. He notes that the SN40L's 5 trillion parameter limit is nearly three times the estimated 1.7 trillion parameter size of GPT-4.

"The larger memory, plus the addition of HBM, are key factors in driving the performance of this new processor. With larger memory spaces, customers can get more of their models into main memory, which means much faster processing. Adding HBM to the architecture allows the system to move data between main memory and the cache-like HBM in much larger chunks, which also speeds processing," said Olds.

The ability to run much larger models in relatively small systems and to run multiple models simultaneously with high performance, plus the integration of open-source LLMs to help customers get their own generative AI projects off the ground quickly, marks a big step forward for SambaNova, Olds said.

"It gives them hardware that can truly compete with GPU-based systems on large models and a suite of software that should take a lot of the mystery (and time) out of building a custom LLM for end users," he said.
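Olds's point about fitting models into main memory comes down to simple arithmetic: a model's weight footprint is roughly its parameter count times the bytes per parameter at the chosen precision. A back-of-envelope sketch (the function and precision choices are illustrative assumptions, and it ignores activations, optimizer state, and KV cache; serving a model at the trillion-parameter scale would also lean on the DDR5 tier and the memory hierarchy, not HBM alone):

```python
def weight_footprint_gb(num_params: float, bytes_per_param: float) -> float:
    """Rough memory needed for model weights alone, in GB."""
    return num_params * bytes_per_param / 1e9

# Weight footprint of a 5-trillion-parameter model at common precisions.
params = 5e12
for precision, nbytes in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{precision}: {weight_footprint_gb(params, nbytes) / 1e3:.1f} TB")
```

At fp16, 5 trillion parameters works out to about 10 TB of weights, which is why the combination of HBM for bandwidth and a large DDR5 tier for capacity is central to SambaNova's single-node claim.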