    CapitalAI Daily

    Nvidia Primed To Control Next Phase of AI Inference After Groq Deal, According to Investor Gavin Baker

    By Henry Kanapi, December 28, 2025

    The CIO of Atreides Management believes the AI race is shifting away from training models and toward how fast, cheaply, and reliably those models can run in real products.

    In a new post on X, Gavin Baker says Nvidia’s $20 billion Groq deal is less about acquiring talent and more about locking up the economics of AI inference, the process of running a trained AI model to make predictions on new, unseen data.

    Baker says inference is breaking into two distinct phases: prefill, where a model processes a prompt or context, and decode, where it generates tokens in real time for users.
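The two phases can be illustrated with a toy sketch. This is not how any Nvidia or Groq system is implemented; `toy_next_token` is a hypothetical stand-in for a model forward pass, and the "cache" is a plain list rather than real key/value tensors. The point is only the shape of the work: prefill handles the whole prompt in one pass, while decode is a loop in which each step depends on the previous token.

```python
from typing import List, Tuple


def toy_next_token(cache: List[int]) -> int:
    """Hypothetical stand-in for a model forward pass over the cached context."""
    return (sum(cache) + len(cache)) % 50


def prefill(prompt: List[int]) -> Tuple[List[int], int]:
    """Process the entire prompt in one pass and return (cache, first new token)."""
    cache = list(prompt)  # a real model would store per-layer key/value tensors
    return cache, toy_next_token(cache)


def decode(cache: List[int], first_token: int, max_new_tokens: int) -> List[int]:
    """Generate tokens one at a time; every step waits on the one before it."""
    if max_new_tokens <= 0:
        return []
    output = [first_token]
    cache.append(first_token)
    while len(output) < max_new_tokens:
        token = toy_next_token(cache)
        output.append(token)
        cache.append(token)
    return output


def generate(prompt: List[int], max_new_tokens: int) -> List[int]:
    """Run prefill once, then decode sequentially."""
    cache, first_token = prefill(prompt)
    return decode(cache, first_token, max_new_tokens)


if __name__ == "__main__":
    print(generate([1, 2, 3], 4))
```

Because prefill sees the whole prompt at once, it can be parallelized across prompt tokens, whereas the decode loop is inherently sequential, which is why per-step latency matters so much there.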

    Baker explains that Nvidia’s roadmap now covers both ends of that process. He says that the upcoming Rubin CPX chips are optimized for prefill, using large amounts of memory to handle massive context windows, while Groq’s SRAM architecture is designed for real-time reasoning. According to Baker, the SRAM design sacrifices memory capacity but delivers ultra-low latency, making it ideal for applications where delays break the user experience, such as voice assistants, live translation or agentic AI workflows.

    “The Groq-derived ‘Rubin SRAM’ is optimized for ultra-low latency agentic reasoning inference workloads as a result of SRAM’s extremely high memory bandwidth at the cost of lower memory capacity. In the latter case, either CPX or the normal Rubin will likely be used for prefill.”
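The link between memory bandwidth and decode speed in the quote above can be made concrete with a common back-of-envelope bound. It assumes single-stream decode is limited purely by memory bandwidth and that every generated token requires reading all model weights once; real systems deviate from this through batching, caching and compute limits. The function name and the numbers below are illustrative placeholders, not vendor specifications or figures from Baker's post.

```python
def decode_tokens_per_second_bound(weight_bytes: float,
                                   bandwidth_bytes_per_s: float) -> float:
    """Rough upper bound on single-stream decode speed.

    Assumption: each token requires streaming all weights from memory once,
    so tokens/s <= bandwidth / weight size.
    """
    if weight_bytes <= 0 or bandwidth_bytes_per_s <= 0:
        raise ValueError("both arguments must be positive")
    return bandwidth_bytes_per_s / weight_bytes


if __name__ == "__main__":
    weights = 10e9           # placeholder: 10 GB of model weights
    slower_memory = 1e12     # placeholder: 1 TB/s memory bandwidth
    faster_memory = 10e12    # placeholder: 10 TB/s memory bandwidth
    print(decode_tokens_per_second_bound(weights, slower_memory))
    print(decode_tokens_per_second_bound(weights, faster_memory))
```

Under these assumptions, a tenfold increase in bandwidth raises the bound tenfold, but only if the weights fit in the faster memory, which is the capacity trade-off the quote describes.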

    He adds that it has long been known that SRAM systems can generate far more tokens per second than GPUs or most ASICs. But until recently, it was unclear whether customers would pay more per token for that speed.

    “It is now abundantly clear from Cerebras and Groq’s recent results that users are willing to pay for speed.”

    According to Baker, the market’s response dramatically strengthens Nvidia’s position. With multiple Rubin variants and tightly integrated networking, the investor says it is becoming increasingly difficult for competing custom chips to justify their existence.

    “Increases my confidence that all ASICs except TPU, AI5 and Trainium will eventually be canceled. Good luck competing with the three Rubin variants and multiple associated networking chips.”

    Disclaimer: Opinions expressed at CapitalAI Daily are not investment advice. Investors should do their own due diligence before making any decisions involving securities, cryptocurrencies, or digital assets. Your transfers and trades are at your own risk, and any losses you may incur are your responsibility. CapitalAI Daily does not recommend the buying or selling of any assets, nor is CapitalAI Daily an investment advisor. See our Editorial Standards and Terms of Use.
