After months of speculation that Microsoft was developing its own semiconductors, the company at its annual Ignite conference last Wednesday took the covers off two new custom chips, dubbed the Maia AI Accelerator and the Azure Cobalt CPU, which target generative AI and cloud computing workloads, respectively.
The new Maia 100 AI Accelerator, according to Microsoft, will power some of the company's heaviest internal AI workloads running on Azure, including OpenAI’s model training and inferencing workloads.
Sam Altman, CEO of Microsoft-backed OpenAI, said in a news release that the custom Maia chip has paved the way for the AI company to train more capable models at a lower cost for end customers.
Analysts agreed with that assessment. “Microsoft is creating their own AI processors to improve the performance per watt and performance per dollar versus Nvidia’s offerings,” said Dylan Patel, chief analyst at semiconductor research and consulting firm Semianalysis. The reduction in cost will ultimately be passed on to customers subscribing to Azure’s AI and generative AI offerings, he said.
The Azure Cobalt 100 CPU, built on the Arm architecture, is also an attempt by Microsoft to make its infrastructure more energy efficient compared with commercial AMD and Intel CPUs, according to Patel.
The Arm architecture of the Cobalt 100 CPU allows Microsoft to generate more computing power for each unit of energy consumed, the company said, adding that the chips will be used across its data centres.
“We’re making the most efficient use of the transistors on the silicon. Multiply those efficiency gains in servers across all our data centers, it adds up to a pretty big number,” Wes McCullough, corporate vice president of hardware product development at Microsoft, said in a news release.
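McCullough's point about fleet-scale compounding can be made concrete with a back-of-the-envelope calculation. Every figure below is invented for illustration; none is a Microsoft number.

```python
# Hypothetical illustration: a modest per-server efficiency gain compounds
# across a large fleet. All numbers here are assumptions, not figures
# from Microsoft.
watts_saved_per_server = 50      # assumed efficiency gain per server
servers = 1_000_000              # assumed fleet size
hours_per_year = 24 * 365        # 8,760 hours

# Watt-hours saved per year, converted to megawatt-hours
mwh_saved = watts_saved_per_server * servers * hours_per_year / 1e6
print(f"{mwh_saved:,.0f} MWh saved per year")  # → 438,000 MWh
```

Even a 50-watt saving per server, under these assumed figures, reaches hundreds of thousands of megawatt-hours a year at fleet scale, which is the "pretty big number" McCullough alludes to.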
Microsoft is announcing the news at a time when public cloud spending is expected to grow significantly.
End-user spending on public cloud services is forecast to grow 20.4 per cent to total US$678.8 billion in 2024, up from US$563.6 billion in 2023, according to a report from Gartner.
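Gartner's forecast figures are internally consistent; a quick sanity check of the implied growth rate:

```python
# Gartner's public cloud spending forecast, as quoted above (US$ billions)
spend_2023 = 563.6
spend_2024 = 678.8

# Implied year-over-year growth rate
growth_pct = (spend_2024 - spend_2023) / spend_2023 * 100
print(f"{growth_pct:.1f}%")  # → 20.4%
```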
New way to cool the new Maia 100 Accelerator chips
Microsoft had to design new data centre racks to house the Maia 100 Accelerator chips. The racks are wider than existing ones, leaving ample space for both power and networking cables, the company said. Because intensive AI and generative AI workloads run the chips hot, Microsoft also had to design a separate liquid cooling solution, distinct from its existing air-cooling methods, to manage their temperature.
To implement liquid cooling, Microsoft has developed a “sidekick” that sits next to the Maia 100 chips’ rack. These sidekicks, according to Microsoft, work a bit like a radiator in a car.
“Cold liquid flows from the sidekick to cold plates that are attached to the surface of Maia 100 chips. Each plate has channels through which liquid is circulated to absorb and transport heat. That flows to the sidekick, which removes heat from the liquid and sends it back to the rack to absorb more heat, and so on,” a company spokesperson said.
Economics, sustainability key drivers of custom chips
Economics, not chip shortages, is the key driver of custom chips for large cloud service providers such as Microsoft, AWS, and Google, according to analysts.
“Microsoft’s decision to develop custom silicon, from the point of view of economics, allows it to integrate its offerings and enables the company to continue to optimise silicon for its services while also increasing margin and having better control of costs and availability,” said Daniel Newman, CEO of The Futurum Group.
These same reasons, according to Newman, led AWS to develop its own custom chips. While AWS pairs its Inferentia inference chips with the Trainium machine learning accelerator, Google has been developing iterations of its Tensor Processing Units (TPUs).
“The Cobalt CPU is all about Microsoft offering cloud optimised silicon and being able to offer Arm based instances to Azure customers much the way AWS is with EC2,” Newman said.
Additionally, analysts believe that the new chips provide a window of opportunity for Microsoft to build its own AI accelerator software frameworks as demand for AI or generative AI grows further.
“Building accelerators for AI workloads will be a way to improve performance while using less power than other chips such as graphics processing units (GPUs). Increasing performance while being energy efficient will continue to be more important for vendors and enterprises as [they] attempt to meet sustainability goals and benefit from the potential of AI,” Newman said.
Custom chips to give Nvidia, AMD and Intel a run for their money
Microsoft’s new custom chips are not powerful enough to replace Nvidia’s GPUs for developing large language models. But they are well suited for inferencing, that is, running trained models in operational AI workloads, and as they roll out they will reduce the company’s need for chips from Nvidia, AMD and Intel, analysts said, adding that custom chips from AWS and Google will also challenge chipmakers in the future.
“Intel, NVIDIA and AMD are all seeing the rise of Arm based instances and should see them as a competitive threat in certain instances,” Newman said.
The migration of workloads from x86 chips to Arm isn't yet plug and play, since software is often written for a specific chip architecture, but it has become less of a sticking point as developers make progress running more and more workloads on Arm, Newman said.
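The architecture dependence Newman describes can be sketched in a few lines: code that dispatches to an x86-only fast path must grow an Arm branch, or fall back to slower portable code, before it runs well on Arm instances. The backend names below are purely illustrative placeholders, not real libraries.

```python
import platform

def pick_backend() -> str:
    """Choose a compute backend from the host CPU architecture.

    Backend names are illustrative placeholders, not real libraries.
    """
    machine = platform.machine().lower()
    if machine in ("x86_64", "amd64"):
        return "x86 fast path (e.g. an AVX2 build)"
    if machine in ("aarch64", "arm64"):
        return "Arm fast path (e.g. a NEON/SVE build)"
    # Anything else falls back to portable, unoptimised code
    return "portable fallback"

print(pick_backend())
```

Software with only the first branch effectively penalises Arm hosts, which is why porting effort, rather than hardware capability, has historically gated x86-to-Arm migrations.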
Analysts say that with cloud service providers using custom silicon at varying levels, the data centre market will see a “more meaningful shift” to Arm in the coming years despite x86 chips currently dominating market share by a substantial margin.
Among all chipmakers, Newman believes that Nvidia will be the least impacted, at least in the near term, as demand for its GPUs is set to remain elevated.
However, in some use cases the cloud service providers’ custom chips may end up in a symbiotic relationship with Nvidia’s offerings, especially its Grace Hopper chips, which are targeted at developing and training large language models.
Microsoft’s new custom chips are expected to start rolling out to its data centres early next year. Since Microsoft does not plan to sell the chips to third parties, they will not run into the restrictions imposed by the Biden administration on tech exports to China.