As demand for generative AI grows, cloud service providers such as Microsoft, Google, and AWS, along with large language model (LLM) providers such as OpenAI, have all reportedly considered developing their own custom chips for AI workloads.

Speculation that some of these companies, notably OpenAI and Microsoft, have been working to develop their own custom chips for handling generative AI workloads due to chip shortages has dominated headlines for the last few weeks.

While OpenAI is rumored to be looking to acquire a firm to further its chip-design plans, Microsoft is reportedly working with AMD to produce a custom chip, code-named Athena.

Google and AWS have both already developed their own chips for AI workloads: Google's Tensor Processing Units (TPUs) and AWS' Trainium and Inferentia chips.

But what factors are driving these companies to make their own chips? The answer, according to analysts and experts, lies in the cost of processing generative AI queries and the efficiency of currently available chips, mainly graphics processing units (GPUs). Nvidia's A100 and H100 GPUs currently dominate the AI chip market.

"GPUs are probably not the most efficient processor for generative AI workloads and custom silicon might help their cause," said Nina Turner, research manager at IDC.

GPUs are general-purpose devices that happen to be hyper-efficient at matrix inversion, the essential math of AI, noted Dan Hutcheson, vice chairman of TechInsights.

"They are very expensive to run.
I would think these companies are going after a silicon processor architecture that's optimized for their workloads, which would attack the cost issues," Hutcheson said.

Using custom silicon, according to Turner, may allow companies such as Microsoft and OpenAI to cut back on power consumption and improve compute interconnect or memory access, thereby lowering the cost of queries.

OpenAI spends approximately $694,444 per day, or about 36 cents per query, to operate ChatGPT, according to a report from research firm SemiAnalysis.

"AI workloads don't exclusively require GPUs," Turner said, adding that though GPUs are great for parallel processing, there are other architectures and accelerators better suited to such AI-based operations.

Other advantages of custom silicon include control over access to chips and the ability to design elements specifically for LLMs to improve query speed, Turner said.

Developing custom chips is not easy

Some analysts also likened the move to design custom silicon to Apple's strategy of producing chips for its devices. Just as Apple switched from general-purpose processors to custom silicon to improve the performance of its devices, generative AI service providers are looking to specialize their chip architectures, said Glenn O'Donnell, research director at Forrester.

"Despite Nvidia's GPUs being so wildly popular right now, they too are general-purpose devices.
If you really want to make things scream, you need a chip optimized for that particular function such as image processing or specialized generative AI," O'Donnell explained, adding that custom chips could be the answer in such situations.

However, experts said that developing custom chips is unlikely to be easy for any company.

"Several challenges, such as high investment, long design and development lifecycle, complex supply chain issues, talent scarcity, enough volume to justify the expenditure and lack of understanding of the whole process, are impediments to developing custom chips," said Gaurav Gupta, vice president and analyst at Gartner.

For any company starting the process from scratch, it might take at least two to two and a half years, O'Donnell said, adding that the scarcity of chip-design talent is a major factor behind delays.

O'Donnell's perspective is backed by examples of large technology companies acquiring startups, or partnering with firms that have expertise in the space, to develop their own custom chips. AWS acquired Israeli startup Annapurna Labs in 2015 to develop custom chips for its offerings.
Google, on the other hand, partners with Broadcom to make its AI chips.

Chip shortage might not be the main issue for OpenAI or Microsoft

While OpenAI is reportedly looking to acquire a startup to make a custom chip that supports its AI workloads, experts believe the plan might be less about chip shortages and more about supporting inference workloads for LLMs, as Microsoft keeps adding AI features to apps and signing up customers for its generative AI services.

"The obvious point is that they have some requirement nobody is serving, and I reckon it might be an inference part that's cheaper to buy and cheaper to run than a big GPU, or even the top Sapphire Rapids CPUs, without making them beholden to either AWS or Google," according to Omdia principal analyst Alexander Harrowell. He added that he was basing his opinion on CEO Sam Altman's comments that GPT-4 is unlikely to scale further and would instead need enhancing. Scaling an LLM requires more compute power than inferencing a model does.
Inferencing is the process of using a trained LLM to generate predictions or results.

Further, analysts said that acquiring a large chip designer might not be a sound decision for OpenAI, as it would cost around $100 million to design the chips and get them ready for production.

"While OpenAI can try and raise money from the market for the effort, the deal with Microsoft earlier this year essentially led to selling an option over half the company for $10 billion, of which some unspecified proportion is in non-cash Azure credits — not the move of a company that's rolling in cash," Harrowell said.

Instead, the ChatGPT maker could look at acquiring startups that have AI accelerators, Turner said, adding that such a move would be more economically sensible.

To support inferencing workloads, potential acquisition targets could include Silicon Valley firms such as Groq, Esperanto Technologies, Tenstorrent and Neureality, Harrowell said, adding that SambaNova could also be a possible target if OpenAI is willing to discard Nvidia GPUs and move on-premises from a cloud-only approach.