Microsoft's New AI Models Challenge OpenAI and Google on Price and Speed

Microsoft AI announced three new foundational models on Thursday that generate text, voice, and images, marking the company's most aggressive push yet to build its own artificial intelligence stack while remaining tied to OpenAI. The release reveals a calculated strategy: compete directly with rival AI labs on cost and performance, even as Microsoft maintains its multibillion-dollar partnership with the ChatGPT maker. The three models, developed by Microsoft's MAI (Microsoft AI) Superintelligence team, are now available on Microsoft Foundry, the company's AI model platform, with pricing that undercuts Google and OpenAI offerings.

What Are Microsoft's Three New AI Models?

The three models address different AI tasks that businesses and developers rely on daily. MAI-Transcribe-1 converts speech into text across 25 languages and operates 2.5 times faster than Microsoft's previous Azure Fast offering, according to the company's announcement. MAI-Voice-1 generates audio, allowing users to create 60 seconds of speech in just one second and build custom voices for personalized applications. MAI-Image-2, an image-generating model, was initially released on March 19 on MAI Playground, Microsoft's new large language model testing software, before becoming more widely available.

These models represent the output of Microsoft's MAI Superintelligence team, an AI research division formed and announced in November 2025 and led by Mustafa Suleyman, CEO of Microsoft AI. The team's focus reflects a philosophy that Suleyman outlined in the company's announcement.

"At Microsoft AI, we're building Humanist AI. We have a distinct view when creating our AI models, putting humans at the center, optimizing for how people actually communicate, training for practical use," stated Mustafa Suleyman, CEO of Microsoft AI.

How Is Microsoft Pricing the New Models?

  • Transcription Costs: MAI-Transcribe-1 starts at $0.36 per hour of audio processed, making it accessible for businesses that need to convert large volumes of recorded content into searchable text.
  • Voice Generation Pricing: MAI-Voice-1 begins at $22 per 1 million characters, allowing developers to generate custom audio at scale without prohibitive expenses.
  • Image Generation Pricing: MAI-Image-2 starts at $5 for 1 million tokens of text input and $33 for 1 million tokens of image output, positioning it as a cost-competitive option for visual content creation.
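To make the listed rates concrete, the short Python sketch below estimates a monthly bill from them. It is an illustration only: the function names and the sample workload are hypothetical, the rates are the "starts at" figures quoted above, and it assumes simple linear billing with no tiers, minimums, or discounts, which Microsoft's actual billing may not follow.

```python
# Starting prices in USD as quoted in the article; real bills may differ.
TRANSCRIBE_USD_PER_AUDIO_HOUR = 0.36         # MAI-Transcribe-1
VOICE_USD_PER_MILLION_CHARS = 22.0           # MAI-Voice-1
IMAGE_USD_PER_MILLION_INPUT_TOKENS = 5.0     # MAI-Image-2, text input
IMAGE_USD_PER_MILLION_OUTPUT_TOKENS = 33.0   # MAI-Image-2, image output


def transcription_cost(audio_hours: float) -> float:
    """Cost of transcribing the given number of audio hours."""
    return audio_hours * TRANSCRIBE_USD_PER_AUDIO_HOUR


def voice_cost(characters: int) -> float:
    """Cost of generating speech for the given number of characters."""
    return characters / 1_000_000 * VOICE_USD_PER_MILLION_CHARS


def image_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of image generation: text-input tokens plus image-output tokens."""
    input_part = input_tokens / 1_000_000 * IMAGE_USD_PER_MILLION_INPUT_TOKENS
    output_part = output_tokens / 1_000_000 * IMAGE_USD_PER_MILLION_OUTPUT_TOKENS
    return input_part + output_part


# Hypothetical monthly workload, chosen only for illustration.
total = (
    transcription_cost(1_000)             # 1,000 hours of recorded audio
    + voice_cost(5_000_000)               # 5 million characters of speech
    + image_cost(2_000_000, 10_000_000)   # 2M input and 10M output tokens
)
print(f"Estimated total: ${total:.2f}")   # prints: Estimated total: $810.00
```

Under these assumptions, the sample workload splits into $360 for transcription, $110 for voice, and $340 for images, so image output tokens are the largest per-unit cost driver among the listed rates.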

Microsoft explicitly marketed these models as cheaper alternatives to those offered by Google and OpenAI, a critical selling point in an increasingly crowded market where pricing has become a key differentiator. The company is betting that developers and enterprises will choose its models not just for performance but for affordability.

Why Is Microsoft Pursuing Its Own Models While Partnering With OpenAI?

The apparent contradiction between building independent AI models and maintaining a deep partnership with OpenAI reflects Microsoft's broader strategy of controlling multiple layers of its AI infrastructure. Much like the company produces its own chips while also purchasing from external suppliers, Microsoft is developing proprietary models while remaining invested in OpenAI's research and products.

Suleyman reaffirmed Microsoft's commitment to OpenAI in an interview with VentureBeat, emphasizing that the partnership remains central to the company's AI strategy. However, he also revealed to The Verge that a recent renegotiation of the partnership agreement gave Microsoft greater freedom to pursue its own superintelligence research initiatives. This renegotiation appears to have been the catalyst for the MAI Superintelligence team's formation and the subsequent release of these three models.

Microsoft has invested more than $13 billion into OpenAI and hosts its models across various Microsoft products through a multiyear agreement. The new models do not replace this relationship but rather complement it, allowing Microsoft to offer customers choices and reduce dependency on any single AI provider. This hedging strategy protects Microsoft's position in a rapidly evolving AI landscape where technological advantages can shift quickly.

The timing of these releases underscores the intensity of AI competition in 2026. With OpenAI, Google, Anthropic, and other labs racing to build more capable and efficient models, Microsoft's decision to accelerate its own research signals confidence in its technical capabilities and a willingness to compete on multiple fronts simultaneously. The MAI Superintelligence team's work suggests that Microsoft believes it can differentiate on cost, speed, and practical usability, even as it maintains its partnership with one of the industry's most advanced AI research organizations.