CapitalAI Daily

    Microsoft Debuts Maia 200 AI Accelerator, Promising Cheaper AI Tokens for GPT-5.2 and Copilot

By Henry Kanapi | January 27, 2026 | 2 Min Read

    Microsoft is unveiling Maia 200, a new in-house AI inference accelerator designed to sharply reduce the cost of running large language models at scale.

In a new announcement, the company says Maia 200 is engineered specifically for inference, the phase in which a trained model generates responses to new inputs in real time.

    Inference is a growing cost center for hyperscalers as usage of tools like Copilot and frontier models accelerates.

Maia 200 is built on TSMC’s 3-nanometer process and features native FP8 and FP4 tensor cores, alongside a redesigned memory system with 216GB of HBM3e delivering 7 terabytes per second of bandwidth, plus 272MB of on-chip SRAM to keep the chip fully utilized during token generation.
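To see why those memory figures matter, a rough back-of-envelope sketch helps: token generation is typically memory-bandwidth-bound, since every generated token requires streaming the model's weights from HBM. The bandwidth figure below comes from Microsoft's announcement; the model sizes and the single-request, no-batching assumption are illustrative and not Microsoft's methodology.

```python
# Illustrative roofline arithmetic: HBM bandwidth caps single-stream
# token generation speed. 7 TB/s is from the Maia 200 announcement;
# the 70B-parameter model is a hypothetical example.

HBM_BANDWIDTH_BPS = 7e12  # 7 terabytes per second


def max_tokens_per_sec(params_billions: float, bytes_per_param: float) -> float:
    """Upper bound on tokens/sec for one request with no batching:
    each token requires reading every weight from HBM once, so the
    ceiling is bandwidth divided by model size in bytes."""
    model_bytes = params_billions * 1e9 * bytes_per_param
    return HBM_BANDWIDTH_BPS / model_bytes


# A hypothetical 70B-parameter model: FP8 uses 1 byte per parameter,
# FP4 uses half a byte, doubling the bandwidth-bound ceiling.
fp8_ceiling = max_tokens_per_sec(70, 1.0)  # 7e12 / 70e9 = 100 tok/s
fp4_ceiling = max_tokens_per_sec(70, 0.5)  # 7e12 / 35e9 = 200 tok/s
print(f"FP8 ceiling: {fp8_ceiling:.0f} tok/s, FP4 ceiling: {fp4_ceiling:.0f} tok/s")
```

This is also why native FP4 support matters beyond raw FLOPS: halving bytes per parameter doubles the bandwidth-bound throughput ceiling on the same memory system.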

    Microsoft claims the chip delivers three times the FP4 performance of Amazon’s third-generation Trainium accelerator and FP8 performance that exceeds Google’s seventh-generation TPU, positioning it as the most performant first-party silicon deployed by any hyperscaler.

The company also says Maia 200 improves efficiency, delivering 30% better performance per dollar than the latest-generation hardware currently deployed across Microsoft’s AI infrastructure.

    The accelerator will be deployed as part of Microsoft’s heterogeneous AI stack, serving multiple models across Azure, including OpenAI’s latest GPT-5.2 models, with the goal of lowering inference costs for Microsoft Foundry, Microsoft 365 Copilot and enterprise customers.

    Says Microsoft CEO Satya Nadella in a new LinkedIn post,

    “Our newest AI accelerator, Maia 200, is now online in Azure… It joins our broader portfolio of CPUs, GPUs, and custom accelerators, giving customers more options to run advanced AI workloads faster and more cost-effectively on Azure.”

