Original: Swyx · 03/06/2026
Summary
Microsoft announced seven new MAI models at Build, including MAI-Thinking-1, showcasing advancements in AI technology and transparency in model development.
Key Insights
“Microsoft used Build to position itself as both an AI platform company and a frontier-model lab.” - Overview of Microsoft’s strategy during the Build event.
“The flagship reasoning model MAI-Thinking-1 was presented as Microsoft’s first reasoning model, built with clean data lineage.” - Description of the MAI-Thinking-1 model.
“Microsoft claims 97% on AIME 2025 and 53% on SWE-Bench Pro.” - Performance metrics for the MAI-Thinking-1 model.
Full Article
Today was a big day, not least because we caught up on the state of GitHub vs Agents and recorded a special pod with No Priors and Satya Nadella at MS Build. Satya and Mustafa announced 7 new MAI models.
This is an impressive lineup, especially considering that the Microsoft-Inflection deal that set up MAI only happened 2 years ago, and that these are all from-scratch pretrains. MAI today is by no means an unqualified frontier lab, but it is a good tier 2 neolab with obvious incentives to support domain-specific finetunes (as opposed to the frontier labs, which have ~all killed finetuning).
The star of the show was the 100+ page MAI tech report, which the research community is giving glowing reviews.
You can catch up on all the rest of the announcement in the excellent Verge recap, and the tweet summaries below.
AI News for 6/1/2026-6/2/2026. We checked 12 subreddits, 544 Twitters and no further Discords. AINews website lets you search all past issues. As a reminder, AINews is now a section of Latent Space.
You can opt in/out of email frequencies!
AI Twitter Recap
Top Story: Microsoft Build recap, and new MAI model technical details
What happened
Microsoft used Build to position itself as both an AI platform company and a frontier-model lab, pairing broad product launches with unusually detailed disclosures about its new MAI model family.
Microsoft AI announced seven new MAI models spanning reasoning, code, image, speech transcription, and voice, led by MAI-Thinking-1, MAI-Code-1-Flash, MAI-Image-2.5, MAI-Transcribe-1.5, and MAI-Voice-2, according to @MicrosoftAI and @mustafasuleyman.
The flagship reasoning model MAI-Thinking-1 was presented as Microsoft’s first reasoning model, built with clean data lineage and zero distillation from third-party models, in posts from @mustafasuleyman, @baseten, @tuhinone, and @HannaHajishirzi.
Microsoft released a 109-page technical report for MAI-Thinking-1, which drew strong positive reactions from technically oriented readers for its level of transparency, including @eliebakouch, @ethanCaballero, @nrehiew_, @yacinelearning, and @stochasticchasm.
Microsoft also emphasized local AI and agent-native Windows: Build messaging highlighted secure execution layers for agents, a new Surface RTX Spark Dev Box, Windows AI access to the broader Windows GPU install base, and concept hardware such as Project Solara/Scout, summarized by @yusuf_i_mehdi, @TheTuringPost, and @kimmonismus.
Build also included a major GitHub Copilot app push as the desktop home for agent-native software development, with canvases, cross-device continuity, and tighter GitHub agent workflows, from @pierceboggan and @lukehoban, with reactions from @techgirl1908.
Microsoft introduced Web IQ, a new grounding/search API stack for AI agents, claiming the APIs already power nearly all AI agents and chatbots in the industry today, including Copilot and ChatGPT, via @JordiRib1.
Satya Nadella framed Build as an ecosystem moment rather than a single-product launch, while Mustafa Suleyman
framed it as the output of Microsoft’s internal hill-climbing machine, in @satyanadella, @mustafasuleyman, and a reaction from @nrehiew_.
MAI model family: disclosed facts and technical details
MAI-Thinking-1
Microsoft described MAI-Thinking-1 as a 35B active parameter MoE with a 256K context window, in @mustafasuleyman.
A separate summary from @scaling01 says the model is a 1T@35B parameter model, pre-trained on 30T tokens, and trained using 8192 GB200 GPUs; this appears to be a reading of the technical report rather than Microsoft marketing copy.
@kimmonismus similarly summarized it as a mid-size MoE, but with 45B active params, which conflicts with Mustafa’s own 35B active figure; the more authoritative figure in the tweet set is the official 35B active number.
Microsoft claims 97% on AIME 2025 and 53% on SWE-Bench Pro, with blind human raters on Surge preferring it overall to Sonnet 4.6, from @mustafasuleyman and @asadovsky.
Microsoft says the model is optimized on MAIA 200, with 30% better performance per dollar and a 1.4x performance-per-watt gain versus GB200 when running MAI models end-to-end, per @mustafasuleyman.
Microsoft and partners repeatedly stressed no third-party distillation, clean data lineage, and enterprise-controlled fine-tuning with 100% eyes-off post-training data through Baseten, in @baseten, @tuhinone, and @MicrosoftAI.
MAI-Code-1-Flash
Microsoft introduced MAI-Code-1-Flash as a fast coding model for VS Code and GitHub Copilot CLI, first announced by @pierceboggan and later highlighted by @mariorod1.
Official Microsoft messaging via @mustafasuleyman says Code-1-Flash achieves 51% on SWE-Bench Pro despite having just 5B parameters, positioning it near Haiku-class size/cost.
A competing summary from @scaling01 describes it as a 137B parameter MoE with 256K context, trained on 10T+ tokens, and stronger and more efficient than Claude 4.5 Haiku.
That likely indicates 5B active parameters rather than total parameters; the tweets do not fully reconcile this distinction, but together they imply a small active footprint within a much larger MoE.
Availability at launch was highlighted as GitHub Copilot / VS Code-first, per @scaling01 and @mariorod1.
MAI-Image-2.5
Microsoft launched MAI-Image-2.5 and a Flash variant, claiming both reached #2 on leaderboards, with @mustafasuleyman saying they surpass Nano Banana 2 on image editing.
Independent leaderboard accounts supported the high ranking: @arena reported #2 in Image Edit Arena with a score of 1401, +10 points over Nano Banana 2, Grok Imagine, and ChatGPT Image Latest HF.
@arena further said MAI-Image-2.5 advances the Pareto frontier, meaning no model at its price tier scores higher on that benchmark.
Distribution partners quickly followed, including @OpenRouter and @fal.
MAI-Transcribe-1.5
@ArtificialAnlys reported MAI-Transcribe-1.5 as an unusually strong speed/accuracy point on the STT frontier: ~276x realtime, 2.4% AA-WER, #3 overall on its leaderboard.
The model supports 43 languages, including English, French, Arabic, Japanese, and Chinese, and supports keyword biasing for rarer terms such as names and medical terminology, per @ArtificialAnlys.
Pricing was reported as 18,000 to 18, in @harvey, @hwchase17, and @nikogrupen
W&B relaunched Weave as agent-first observability with integrations across common harnesses and automated detection of failure modes, in @wandb and @neutralino1.
Prime-RL integrated Mooncake Store with vLLM for cross-node prefix / KV cache reuse, pitched as key for agentic rollouts, in @m_sirovatka.
Together detailed serving optimizations for MiniMax-M3, citing 81-125% throughput improvements via KV-block-major sparse attention, paged decode, optimized index scoring, and multimodal preprocessing, in @togethercompute.
MiniMax itself highlighted 1M context, native multimodality, desktop-computer operation, and MSA reducing attention’s share of decode time from ~30% to ~5%, in
@MiniMax_AI.
Ecosystem, hardware, and industrial capacity
Westmag emerged from stealth to build American robot actuators and drone motors, with 11M raised in a round led by a16z with participation from Founders Fund, Lux, NFDG, Menlo and others, in @boxcardavid, @packyM, and @oyhsu.
PyTorch noted NVIDIA adoption of OpenMDW-1.1, a permissive AI-model licensing framework, across four open-model families, in @PyTorch.
Martin Scorsese publicly demonstrated narrow, preproduction use of FLUX for storyboarding with Black Forest Labs, framed as exploratory and complementary to hand-drawn work rather than generative replacement, in @robrombach and @TheRundownAI.
AI Reddit Recap
/r/LocalLlama + /r/localLLM Recap
1. NVIDIA Nemotron 3 Ultra and RTX Spark Specs
Related Articles
Microsoft's new MAI models
Simon Willison · explanation · 82% similar
[AINews] Tasteful Tokenmaxxing
Swyx · explanation · 78% similar
[AINews] Reve 2 and Ideogram 4: Layouts in Imagegen
Swyx · explanation · 77% similar
Originally published at https://www.latent.space/p/ainews-microsoft-build-mai-thinking.