Original: Swyx · 11/03/2026

Summary

Yann LeCun’s AMI Labs has launched with $1.03B in seed funding to develop AI models that understand the physical world, marking a significant milestone in AI development.

Key Insights

“AMI aims to build AI models that understand the physical world.” — Describing the mission of AMI Labs as articulated by Yann LeCun.
“This is LeCun finally getting the capital and team to prove his long-argued alternative to LLM-centric AI.” — Supportive view on the significance of AMI Labs’ funding and vision.
“Intelligent agents need hierarchical representations to understand the world.” — LeCun’s critique of pure autoregressive LLMs and the need for grounded understanding.

Topics


Full Article

AI News for 3/9/2026-3/10/2026. We checked 12 subreddits and 544 Twitters; no more Discords (see below). Estimated reading time saved (at 200wpm): 2649 minutes. AINews' website lets you search all past issues. As a reminder, AINews is now a section of Latent Space. You can opt in/out of email frequencies!

Most days, the AINews op-ed is human-written, while the sections below are human-curated selections from multiple LLM generations. However, some days there's one clear big story, and in those cases we've been developing a new methodology that reports the big story in more detail. Today is one of those days: GPT 5.4 won today's battle in describing AI Twitter coverage of AMI. So read on; feedback welcome. Ed.

AI Twitter Recap

Top Story: Yann LeCun's AMI Labs launches with a $1.03B seed to build world models around JEPA

What happened

Yann LeCun formally unveiled Advanced Machine Intelligence (AMI Labs), a new startup focused on building "real intelligence into the real world," with an unusually large $1.03B seed round (also cited as $890M) at a reported $3.5B pre-money valuation, described as one of the largest seed rounds ever and likely the largest for a European company. The announcement came directly from LeCun, who said the company had completed one of the largest seeds ever and was hiring @ylecun, and from CEO Alex Lebrun, who framed the mission as a long-term scientific endeavor to build systems that truly understand the real world @lxbrun. Multiple press reports converged on the same core facts: AMI aims to build AI models that understand the physical world, reflecting LeCun's long-running view that human-level AI will come from world modeling rather than scaling language prediction alone @TechCrunch @WIRED @business @Reuters @ZeffMax.
The founding and senior team includes LeCun; Alex Lebrun as CEO @lxbrun; Saining Xie as cofounder/CSO @sainingxie; Laurent Solly as COO @laurentsolly; Pascale Fung as Co-Founder and Chief Research & Innovation Officer @pascalefung; plus a wave of prominent founding researchers joining to work specifically on world models, representation learning, pretraining, scaling, and video @sanghyunwoo1219 @jihanyang13 @duchao0726 @zhouxy2017 @jingli9111.

Facts vs. opinions

Facts reported across tweets and coverage:

- Funding size: $1.03B seed / $890M @ylecun @lxbrun @laurentsolly.
- Valuation: $3.5B pre-money was reported by commentators and news summaries @iScienceLuvr @ZeffMax.
- Company thesis: build AI models that can understand the physical/real world, not just language @TechCrunch @WIRED @Reuters.
- LeCun's positioning: world models have been his public thesis for years; AMI is the vehicle to test it at startup scale @ZeffMax @WIRED.
- Official language from AMI leaders: "real intelligence into the real world," human-centered, "perceives, learns, reasons and acts" @Brian_Bo_Li @pascalefung.
- Hiring/opening locations: Paris explicitly mentioned by Pascale Fung @pascalefung; observers also noted Zürich among the locations @giffmana.
- Europe/France angle: French media and political figures framed it as a major European/French AI milestone @BFMTV @France24_fr @EmmanuelMacron @NicolasDufourcq.

Opinions and interpretation:

- Supportive view: this is LeCun finally getting the capital and team to prove his long-argued alternative to LLM-centric AI @teortaxesTex.
- Bullish technical view: world models will be a huge leap forward, especially for embodiment/robotics, and AMI's open-research posture is attractive @mervenoyann @ziv_ravid.
- Architecture-war framing: some commentators explicitly cast AMI as a bet that the industry is building on the wrong foundation by over-indexing on autoregressive language models @LiorOnAI.
- Skeptical/neutral view: the key question is not whether world models sound compelling, but whether JEPA-style methods can scale into economically useful systems faster than LLM-centric agents are already commercializing.
This skepticism is more implicit than explicit in the tweet set, but appears through "gets a chance to prove his vision"-style comments @teortaxesTex. Meta-commentary: AMI is not being framed internally as a conventional lab @sainingxie, which suggests an attempt to differentiate from the standard frontier-lab pattern of API-first model scaling.

Technical details: JEPA, world models, and why this is different from next-token LMs

AMI's public narrative is aligned with LeCun's JEPA/world-model agenda. The explicit technical details in the tweets are sparse, but the discussion strongly points to the following stack of ideas:

- World models: latent predictive models of environment dynamics that learn compact state representations and predict future states/outcomes rather than raw sensory streams.
- JEPA: Joint Embedding Predictive Architecture, introduced by LeCun in 2022, highlighted in commentary as a method that learns abstract representations and predicts in a compressed latent space rather than trying to reconstruct every pixel/token @LiorOnAI.
- Motivation for JEPA over generative modeling: real-world sensor streams contain lots of unpredictable or irrelevant entropy; raw-pixel/video prediction is inefficient because it spends modeling capacity on noise; predicting latent abstractions may better support planning, controllability, and invariance.
- Action-conditioned world models: commentary noted the key extension that models should predict the consequences of actions, enabling planning before acting @LiorOnAI.
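To make the action-conditioned latent-prediction idea concrete, here is a toy numpy sketch: encode the current and next observations into compact latents, predict the next latent from the current latent plus an action, and score the prediction in latent space rather than pixel space. All shapes, names, and the linear "networks" are invented for this illustration; a real system would use deep encoders and learned weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (illustrative only)
obs_dim, act_dim, latent_dim = 32, 4, 8
W_enc = rng.normal(size=(obs_dim, latent_dim)) * 0.1              # shared encoder
W_pred = rng.normal(size=(latent_dim + act_dim, latent_dim)) * 0.1  # predictor

def encode(x):
    # Map a raw observation into a compact latent representation
    return np.tanh(x @ W_enc)

def predict_next_latent(z, action):
    # Action-conditioned prediction of the *next latent*, not next pixels
    return np.concatenate([z, action]) @ W_pred

obs_t = rng.normal(size=obs_dim)    # observation at time t
action = rng.normal(size=act_dim)   # action taken at time t
obs_t1 = rng.normal(size=obs_dim)   # observation at time t+1

z_pred = predict_next_latent(encode(obs_t), action)
z_target = encode(obs_t1)           # target is an embedding, not raw data

# JEPA-style objective: distance in latent space (no pixel reconstruction)
loss = float(np.mean((z_pred - z_target) ** 2))
print(f"latent prediction loss: {loss:.4f}")
```

The key design point the commentary emphasizes is the last two lines: the loss never touches raw pixels, so capacity is not spent modeling unpredictable sensor noise.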
That is closer to model-based RL/control than to passive sequence modeling.

Target domains repeatedly implied: robotics/embodied AI @mervenoyann; healthcare and lower-hallucination systems @kimmonismus; industrial process control and safety-critical environments @LiorOnAI; and, more generally, systems that must track persistent state, causality, and action outcomes in the physical world.

This is broadly consistent with LeCun's longstanding critique of pure autoregressive LLMs: text prediction alone is not sufficient for grounded understanding; the world is only partially predictable; intelligent agents need hierarchical representations and planning in latent space; and data from vision/video/embodiment should dominate long-run AI progress.

Team composition as a technical signal

The founding roster is itself a technical clue. Several hires emphasize world models @sanghyunwoo1219 @zhouxy2017; pretraining, scaling, video, and representation learning @jingli9111; plus a cluster of vision-heavy researchers, noted by supporters as a team of visionaries @mervenoyann. That suggests AMI is likely to emphasize vision/video/self-supervised representation learning, not just append world-model language to an otherwise standard LLM stack.

Open research posture

Several supportive reactions specifically mentioned hope for open releases and open research @mervenoyann. That matters because JEPA/world-model work has historically had stronger academic than product traction; openness would help AMI recruit and shape a research ecosystem.
But at launch this is still aspiration rather than demonstrated practice.

Different opinions in the reaction set

1) Strongly supportive: LeCun finally gets to run the experiment

A sizable share of reactions are essentially relief that LeCun now has a dedicated startup and capital base to validate his worldview:

- "Yann gets a chance to prove his vision" @teortaxesTex
- very bullish: world models will be a huge leap forward @mervenoyann
- super bullish on AMI Labs because of team quality and open-research ambition @ziv_ravid
- understanding the real world is key to building advanced AI systems @duchao0726

This camp sees AMI as an overdue counterweight to the current industry equilibrium around autoregressive LMs + RLHF + tool use.

2) Architecture-war framing: LLMs predict words; AMI wants models of reality

This view was most explicitly articulated by @LiorOnAI: language models operate over words/tokens; reality is continuous, sensorimotor, and partly unpredictable; generative models overfit to reconstruction; JEPA predicts meaningful abstractions instead. This is the clearest pro-AMI technical argument in the tweet set. It treats hallucination, brittleness, and lack of grounded planning as symptoms of the wrong training objective, not just insufficient scale.

3) Pragmatic neutral: compelling thesis, but now it has to ship

Some reactions are celebratory but not credulous: "gets a chance to prove his vision" @teortaxesTex; "burning question: PyTorch or JAX shop?" @giffmana. The latter is not just joking infrastructure chatter; it reflects a real question about how AMI will operationalize research. A startup attempting novel world-model training at scale must choose an ecosystem optimized either for fast research iteration and broad hiring familiarity (PyTorch) or for an aggressive large-scale functional-programming style and SPMD compiler stacks (JAX).

4) Broader simulation/world-model enthusiasm outside AMI

The AMI launch also landed in a broader discourse where "simulation is the next frontier" was already in the air. Percy Liang argued that the next big opportunity is to "put society into a docker container" via simulation models that can predict what happens in hypothetical real-world scenarios @percyliang. That isn't about AMI directly, but it reinforces why LeCun's thesis currently resonates: many researchers increasingly think progress requires moving from token imitation to model-based prediction of environments and interactions.

Context: why this matters now

AMI matters because it is a high-profile, well-capitalized attempt to reopen a question many in industry had tacitly declared settled: is next-token prediction the central path to advanced intelligence, or just a useful but ultimately narrow substrate?

Why the timing is notable

The launch comes when LLMs and coding agents are commercially successful, multimodal systems are improving fast, robotics/autonomy/world-model language is resurging, and there is growing awareness that benchmark gains in text/code may not directly translate to physical-world competence. This matters especially because frontier AI discourse lately has been dominated by agents/harnesses/tool use, reasoning RL, coding automation, and inference infrastructure. AMI is an explicit bet that the next frontier is grounded representation learning and predictive modeling of the real world, not just better wrappers around text models.

Why LeCun is uniquely positioned

LeCun has spent years publicly arguing that human and animal intelligence is learned from observation and action in the world, that language is too low-bandwidth and derivative to be the main training signal, and that systems need latent-variable world models and planning. His influence made him one of the most visible skeptics of "LLMs alone get us to AGI." AMI is therefore not just another startup; it is the most direct institutionalization so far of the anti-token-maximalist view from one of the field's most prominent figures.

Europe/France implications

Political and institutional reactions in France/Europe were unusually strong: Macron celebrated it as a new page for AI and "la France des chercheurs, des bâtisseurs" ("the France of researchers and builders") @EmmanuelMacron, and Bpifrance's Nicolas Dufourcq highlighted French pride in backing a company that could revolutionize global AI @NicolasDufourcq. So AMI is also being positioned as a European strategic AI champion, not merely a research startup.

All relevant AMI/world-model tweets and what each adds

- @TechCrunch: headline confirmation of the $1.03B raise and world-model framing.
- @BFMTV: French-language mainstream framing of the raise as historic.
- @WIRED: contextualizes LeCun's long-running thesis that physical-world mastery, not language alone, is the route to human-level AI.
- @business: Bloomberg confirmation of the funding magnitude.
- @iScienceLuvr: adds the $3.5B pre-money valuation figure.
- @sainingxie: AMI is "not a conventional lab," and Xie joins as cofounder/CSO.
- @lxbrun: CEO announcement; mission is a long-term scientific effort toward real-world understanding.
- @ZeffMax: concise summary that AMI is LeCun betting big on world models after years of advocacy.
- @teortaxesTex: "gets a chance to prove his vision."
- @Brian_Bo_Li: "real intelligence into the real world" slogan.
- @sanghyunwoo1219: joined from day one specifically to work on world models.
- @laurentsolly: COO announcement; repeats funding and "next AI frontier" models.
- @mavenlin: enthusiasm from another team member, signaling depth of the founding bench.
- @crystalsssup: notes Saining Xie's presence as a signal of AMI's seriousness.
- @ylecun: official unveiling; one of the largest seeds ever, likely the largest for a European company.
- @jihanyang13: founding-team join announcement.
- @giffmana: asks whether AMI becomes a PyTorch or JAX shop.
- @France24_fr: French media framing as a paradigm shift.
- @TheRundownAI: short summary of going beyond language models to build world models.
- @pascalefung: Fung joins as CRIO; emphasizes human-centered AI that "perceives, learns, reasons, acts."
- @EmmanuelMacron: political endorsement and national strategic framing.
- @franceinter: media amplification around LeCun's broader claims about jobs and AI transformation.
- @mervenoyann: bullish on world models as a leap forward for embodied research; likes the open stance.
- @kimmonismus: adds the healthcare/Nabla commercialization angle and hallucination-risk framing.
- @pascalefung: hiring for the Paris team.
- @zhouxy2017: founding member working on world models.
- @Reuters: calls AMI an "alternative AI approach."
- @NVIDIAAI and related Thinking Machines/NVIDIA posts: not about AMI; omitted from focus.
- @chris_j_paxton: notes the absence of the Bay Area among listed locations; suggests geographic differentiation.
- @giffmana: clarifies Zürich is one of the locations.
- @lilianweng: "building technologies for better human-AI collaboration on next-gen hardware at scale"; indirect but clearly tied to joining/working with the AMI orbit.
- @Yuchenj_UW: juxtaposes LeCun's world-model startup and Meta's Moltbook acquisition, highlighting the contrast between long-horizon foundational bets and near-term agent/social-product bets.
- @LiorOnAI: the most explicit technical gloss on JEPA and why latent-space predictive modeling may matter.
- @sainingxie: appreciation reply; minor, but confirms continued engagement.
- @NandoDF @DrJimFan @denisyarats: peer congratulations; low-information but signal broad respect.

Bottom line

AMI Labs is the strongest institutional challenge yet to the idea that scaling autoregressive language models is the sole or dominant route to AGI.
The hard facts are unusually concrete ($1.03B seed, $3.5B pre-money, an elite vision/world-model-heavy team, France/Europe strategic backing), while the technical promise remains largely thesis-level for now: JEPA-style latent predictive world models that learn from real-world sensor data and support planning/action without reconstructing every bit of noise. Supporters view it as the overdue next paradigm; neutrals see a high-stakes test of whether LeCun's critique of LLMs can finally cash out in products and benchmarks; skeptics, even when not stated bluntly, will judge it on whether world models can outcompete rapidly improving LLM agents before the market closes around the current stack.

Other Topics

Agents, coding workflows, and the builder-vs-reviewer shift

A broad theme across the timeline is that coding agents are changing software org structure: implementation is no longer the bottleneck; review, architecture, and product judgment are @renilzac @clairevo @dexhorthy. Multiple reactions converged on the framing that engineers increasingly become either builders with product taste or reviewers with systems thinking @radek__w @ZhitaoLi224653.

Agent harnesses emerged as a major practical concept: Agent = Model + Harness, with filesystems, memory, browsers, routing, orchestration, and sandboxes all part of the real product surface @Vtrivedy10 @techczech @AstasiaMyers @omarsar0.

Tooling updates reflected that trend:

- VS Code Agent Hooks for policy enforcement and workflow guidance @code
- GitHub/Figma MCP closes design-to-code loops @github
- LangGraph deploy and LangGraph 1.1 simplify productionization @LangChain @sydneyrunkle
- Together MCP server and Together GPU Clusters add infra for agent-driven app building and scale @togethercompute
- Ollama scheduled prompts in Claude Code add simple automation loops @ollama

Product reactions were split between enthusiasm and caution:

- Perplexity Computer replacing routine knowledge work and marketing tasks was cited as a strong founder use case @GabbbarSingh @AravSrinivas
- Several posts warned against optimizing for "% AI-written code" or abandoning code comprehension entirely @karrisaarinen @dexhorthy
- UX matters as much as raw capability: Claude Code/Hermes/OpenClaw users repeatedly noted trust, feedback loops, memory, and interface presentation as key to perceived competence @StudioYorktown @sudoingX @cz_binance

Benchmarks, evals, and reliability research

Cameron Wolfe posted a practical stats thread on making LLM evals more reliable: treat model scores as sample means, estimate the standard error as std / sqrt(n), and report 95% confidence intervals as x̄ ± 1.96·SE instead of raw mean-only metrics @cwolferesearch.

New benchmark work focused on grounding and human validity:

- Opposite-Narrator Contradictions for sycophancy @LechMazur
- OfficeQA Pro: enterprise grounded reasoning remains hard, with frontier agents still <50% @kristahopsalong @DbrxMosaicAI
- SWE-bench Verified appears overstated relative to maintainer reality: maintainers would merge only about half of agent PRs that pass the grader @whitfill_parker @joel_bkr
- AuditBench introduces 56 LLMs with implanted hidden behaviors for alignment-auditing evaluation @abhayesian
- CodeClash probes long-horizon coding/planning; top models still fare poorly in sustained agentic adversarial settings @OfirPress

Interpretability of reasoning traces continues to be contested: one paper summary claimed 97%+ of thinking steps are decorative and CoT monitoring is unreliable @shi_weiyan.

Models, infrastructure, and training systems

Megatron Core MoE drew strong attention as an open framework for large-scale MoE training, with a claim of 1233 TFLOPS/GPU for DeepSeek-V3-685B @EthanHe_42 @eliebakouch.
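Returning to the evals thread above, the recommended confidence-interval recipe fits in a few lines of Python. The per-question pass/fail scores below are placeholder values invented for the sketch, not real benchmark data:

```python
import math

# Hypothetical per-question pass/fail scores for one model on one benchmark
# (placeholder data, not real results).
scores = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1]

n = len(scores)
mean = sum(scores) / n                                # benchmark score = sample mean
var = sum((s - mean) ** 2 for s in scores) / (n - 1)  # sample variance (Bessel)
se = math.sqrt(var) / math.sqrt(n)                    # standard error = std / sqrt(n)

# 95% confidence interval: mean +/- 1.96 * SE
ci_low, ci_high = mean - 1.96 * se, mean + 1.96 * se
print(f"score = {mean:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}], n = {n}")
```

The point of reporting the interval rather than the bare mean is visible even in this toy: with only 20 questions, the CI spans tens of percentage points, so small leaderboard gaps between models are indistinguishable from noise.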
Commentary suggested DeepSeek-style MoE training efficiency is becoming commoditized @teortaxesTex.

Gemini Embedding 2 launched as Google's first fully multimodal embedding model:

- a single embedding space for text, images, video, audio, and docs
- 8,192-token text inputs
- 100+ languages
- output dims of 3072 / 1536 / 768 via MRL
- up to 6 images, 120s of video, or 6-page PDFs per request @OfficialLoganK @_philschmid @googleaidevs

Hugging Face Storage Buckets launched as S3-like mutable storage built on Xet deduplication, starting at $8/TB/month, positioned for checkpoints, logs, traces, eval outputs, and agent artifacts @victormustar @huggingface @Wauplin.

Other notable model/system releases:

- RWKV-7 G1e in 13B/7B/3B/1B sizes @BlinkDL_AI
- Hume TADA open-source TTS model: zero content hallucinations across 1,000+ test samples, 5x faster than comparable LLM-TTS, and 2,048 tokens ≈ 700s of audio @hume_ai
- Phi-4-reasoning-vision-15B highlighted as a compact open multimodal model @dl_weekly
- Baseten/Harvard prefix-caching collaboration for inference efficiency @chutes_ai

Autonomous research, AlphaGo lineage, and recursive improvement

The strongest meta-theme outside AMI was automated ML research:

- Karpathy's "autoresearch" concept (overnight experiment loops with code edits, short training runs, and metric-based keep/discard logic) was widely discussed @NerdyRodent @_philschmid
- Yuchen Jin ran a Claude-driven "chief scientist" loop for 11+ hours and 568 experiments on 8 GPUs, observing a progression from broad exploration to focused refinement to heavy validation @Yuchenj_UW
- Karpathy hinted at AgentHub, a "GitHub for agents," as the next layer for multi-agent research collaboration @karpathy @Yuchenj_UW

AlphaGo's 10-year anniversary triggered many reflections:

- Demis Hassabis argued AlphaGo's search-and-planning ideas remain central to AGI and science @demishassabis
- Google/DeepMind linked AlphaGo to AlphaEvolve and broader compute/science optimization @Google @GoogleDeepMind
- Noam Brown-style framing holds that current reasoning models follow the AlphaGo recipe: imitation, inference-time search, then RL @polynoamial

Recursive self-improvement discourse remained active:

- Schmidhuber resurfaced his long-running meta-learning/RSI work @SchmidhuberAI
- Commentary on unsupervised RLVR suggested naive recursive improvement currently hits ceilings @teortaxesTex

Capability milestones, applications, and deployment

One of the most striking capability claims: a possible AI-assisted resolution of a FrontierMath open problem, first from users claiming GPT-5.4 Pro solved it and later from observers noting this could be the first FrontierMath open problem solved by AI, if validated @spicey_lemonade @kevinweil @GregHBurnham @AcerFur.

Google reported a prospective clinical study of AMIE in urgent-care workflows: blinded evaluation found similar differential-diagnosis and management-plan quality overall versus PCPs, but PCPs outperformed on practicality and cost-effectiveness (p=0.003, p=0.004) @iScienceLuvr.

Google Sheets with Gemini reached 70.48% on SpreadsheetBench, described as near human-expert ability @GoogleAI.

The Google Workspace/Gemini rollout expanded across Docs, Sheets, Slides, and Drive, with claims of Sheets tasks running 9x faster, AI-generated slide layouts, and Drive-level cross-document answers @Google @sundarpichai.

Microsoft reported health as the #1 topic for Copilot mobile users in 2025, based on analysis of 500k+ conversations @mustafasuleyman.

Sharon Zhou claimed superhuman performance on AI kernel optimization in production settings, suggesting automatic GPU-porting/optimization may soon be practical @realSharonZhou.

AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap (Read more)
