
    Nvidia Primed To Control Next Phase of AI Inference After Groq Deal, According to Investor Gavin Baker

By Henry Kanapi · December 28, 2025

    The CIO of Atreides Management believes the AI race is shifting away from training models and toward how fast, cheaply, and reliably those models can run in real products.

    In a new post on X, Gavin Baker says Nvidia’s $20 billion Groq deal is less about acquiring talent and more about locking up the economics of AI inference, the process of running a trained AI model to make predictions on new, unseen data.

    Baker says inference is breaking into two distinct phases: prefill, where a model processes a prompt or context, and decode, where it generates tokens in real time for users.
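The two phases can be illustrated with a toy sketch. This is not how any production serving stack is written; the function names and the list standing in for a KV cache are illustrative assumptions, meant only to show why prefill is a single compute-heavy pass while decode is a sequential, latency-sensitive loop.

```python
# Toy illustration of the two inference phases (not a real serving stack).

def prefill(prompt_tokens):
    """Process the whole prompt in one pass, building cached context.
    Compute-bound: cost grows with prompt length."""
    kv_cache = [f"state({t})" for t in prompt_tokens]  # stand-in for a KV cache
    return kv_cache

def decode(kv_cache, n_new_tokens):
    """Generate tokens one at a time, each step reading the full cache.
    Bandwidth-bound: per-token latency is what the user feels."""
    output = []
    for i in range(n_new_tokens):
        token = f"tok{i}"  # a real model would sample from logits here
        kv_cache.append(f"state({token})")
        output.append(token)
    return output

cache = prefill(["The", "AI", "race"])
print(decode(cache, 3))  # → ['tok0', 'tok1', 'tok2']
```

Because decode repeatedly re-reads the cached state, memory bandwidth rather than raw compute sets its speed, which is the property SRAM-heavy designs like Groq's target.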

Baker explains that Nvidia’s roadmap now covers both ends of that process. He says the upcoming Rubin CPX chips are optimized for prefill, using large amounts of memory to handle massive context windows, while Groq’s SRAM architecture is designed for real-time reasoning. According to Baker, that design sacrifices capacity but delivers ultra-low latency, making it ideal for applications where delays break the user experience, such as voice assistants, live translation, or agentic AI workflows.

    “The Groq-derived ‘Rubin SRAM’ is optimized for ultra-low latency agentic reasoning inference workloads as a result of SRAM’s extremely high memory bandwidth at the cost of lower memory capacity. In the latter case, either CPX or the normal Rubin will likely be used for prefill.”

He adds that SRAM-based systems have long been known to produce far higher tokens-per-second than GPUs or most ASICs. Until recently, however, it was unclear whether customers would pay more per token for that speed.

    “It is now abundantly clear from Cerebras and Groq’s recent results that users are willing to pay for speed.”

    According to Baker, the market’s response dramatically strengthens Nvidia’s position. With multiple Rubin variants and tightly integrated networking, the investor says it has now become increasingly difficult for competing custom chips to justify their existence.

    “Increases my confidence that all ASICs except TPU, AI5 and Trainium will eventually be canceled. Good luck competing with the three Rubin variants and multiple associated networking chips.”


