VentureBeat

by VentureBeat

32 episodes
Updated Daily
Accepts Guests
Has Sponsors

Podcast Overview

AI gets real here. On “Beyond the Pilot,” top business execs share what actually happens after the AI proof of concept — from infrastructure and org design to wins, failures, and ROI. Not theory, but deep dives into how they scaled AI that works.

Language

🇺🇲

Publishing Since

9/10/2025

Recent Episodes

June 24, 2026

Small Models, Massive Wins: The New Shopify AI Formula

Shopify's distillation pipeline cuts production AI costs by up to 30x, and in some cases the smaller model outperforms the frontier model on the narrow task. That's not a trade-off. That's a win on accuracy, latency, and cost simultaneously.

Farhan Thawar, VP & Head of Engineering at Shopify, runs AI across one of the largest commerce platforms on earth. In this episode, he breaks down the exact infrastructure decisions Shopify made to avoid being locked into any single model provider, and why 29% of enterprise AI projects die from token costs, not model failure.

Shopify built an internal LLM proxy that routes tokens across every major provider, enabling automatic failover when any one goes down. On top of that, their Universal Distillation Platform (UDP) lets any R&D team distill a frontier model (Opus 4, GPT-5+) down to a fine-tuned open source model (Qwen and others) for a specific subtask in roughly a day, with evals baked in. Results range from 2x to 30x cheaper, faster, and more accurate than calling the frontier API for everything. Shopify currently runs roughly half a dozen of these distilled models in production, with more being added.

Farhan also details River, Shopify's internal agentic substrate: a Slack agent that works only in public channels, queries their data warehouse, reads their PM system, and improves its own answers when engineers jump in to correct it.

Plus: how Shopify governs AI-generated code at scale, why they killed their token leaderboard, how UCP is positioning Shopify's catalog for agentic commerce, and what a two-to-three year horizon looks like when agents start holding spending budgets and buying autonomously.

🎙️ GUEST: Farhan Thawar | VP & Head of Engineering, Shopify
🎙️ HOST: Sam Witteveen | VentureBeat

If you enjoy these conversations, you need to be in Menlo Park this July. VB Transform 2026 is VentureBeat's flagship enterprise AI event, built entirely around one question: How do you orchestrate AI autonomy at scale? July 14–15, Hotel Nia. Real projects, proprietary research, no fluff. 50% off for listeners with code BEYONDTHEPILOT: https://bit.ly/4fK4F6z

CHAPTERS
00:00 Intro: Infrastructure-first philosophy at Shopify
02:00 Episode overview: token economy, distillation, and the cost crisis killing AI projects
03:00 Toby's "AI reflexivity" mandate and what it actually means for engineers
05:20 Ecosystem strategy: when to build vs. leave room for third-party developers
06:20 Developer tooling stack: GitHub Copilot (2021), Claude Code, Cursor, Codex, and Shopify's own River
08:00 LLM proxy architecture: bulk token purchasing, multi-provider failover, and usage reporting
09:00 Model agnosticism: why Shopify lets engineers choose their own harness
10:00 AI adoption beyond R&D: finance, HR, sales, and the Qwik internal deployment platform
11:20 Code governance: who owns AI-generated code going to production
13:00 Token maxing, leaderboards, and the shift from AI reflexivity to AI leverage
14:20 Circuit breakers: how Shopify catches runaway token spend without hard limits
15:20 LLM proxy deep dive: uptime, insights, and cross-team learning from usage data
16:40 River: Shopify's agentic substrate, public-channel-only design, and emergent HITL behavior
18:40 Model distillation explained: teacher/student models, narrow tasks, and the trade-offs
21:00 Universal Distillation Platform (UDP): how any team submits a distillation job in ~one day
22:20 Tangle: open source pipeline visualization for distillation workflows
22:40 Who uses UDP today, and Farhan's vision for auto-selecting the distillation target model
24:00 Evals in practice: golden datasets, Toloka for data generation, threshold-based deployment
26:20 Sim Gym: simulating A/B tests for small merchants without enough traffic
27:40 Pulse: async AI insights on store performance and conversion
28:20 GPU infrastructure trade-offs: when running your own inference makes sense at scale
29:20 Frontier vs. distilled model split in production, and why dev tokens stay frontier
30:40 Should Shopify train its own coding model? Why Farhan says not yet
31:40 UCP protocol, agentic commerce, and how Shopify's catalog surfaces in every LLM
35:00 Early signals: agentic commerce growth rate and the shift away from SEO
36:00 What developers and entrepreneurs should build for now given multi-channel uncertainty
37:20 River as a hive-mind agent, and what "truly agentic" actually means
38:40 Two-to-three year forecast: agents with spending budgets, autonomous purchasing, proactive outreach
40:00 Project Glasswing, model access loss (Fable), and the case for multi-provider architecture
42:00 Why every company should have a backup plan, and how Shopify built theirs years ago
43:20 Wrap-up

Subscribe to VentureBeat: https://www.youtube.com/@VentureBeat
Apple Podcasts: https://podcasts.apple.com/us/podcast/venturebeat/id1839285239
Spotify: https://open.spotify.com/show/4Zti73yb4hmiTNa7pEYls4
Website: https://venturebeat.com
LinkedIn: https://www.linkedin.com/company/venturebeat
Newsletter: https://venturebeat.com/newsletters

#EnterpriseAI #AIAgents #LLM #MLOps #AgenticAI

Learn more about your ad choices. Visit megaphone.fm/adchoices

June 10, 2026

Weaponize Tokenmaxing: MassMutual’s ROI Engine

MassMutual's CIO negotiated seat-based (unlimited) licenses before token costs exploded, and projects only a 20–30% spend increase when that changes. He also rebuilt a COBOL mainframe app into a working web prototype in 7 days, work that used to take a 15-person SI team 90 days. That's not a pilot. That's a new build-vs-buy equation.

Sears Merritt, CIO at MassMutual, runs AI inside one of America's most regulated legacy environments. In this conversation, he breaks down the architecture decisions, cost structures, and security posture behind real production deployments, not roadmaps.

On infrastructure: MassMutual routes all agentic tool calls through centralized API gateways with identity and access controls, using Amazon Bedrock as a proxy layer. That multi-harness design preserves model optionality while enforcing FinOps discipline.

On model selection: a trust score rubric drives every model decision, balancing cost against user experience. In their IT contact center, that rubric led them to choose the more expensive model after users said the quality gap was worth two extra seconds of inference time.

Productivity results are concrete: a 30% boost in developer output across the SDLC, call resolution time dropping from 10 minutes to under 1 minute for specific call types, and cost per interaction falling from dollars to cents.

On the security side, MassMutual is embedding AI into its SDLC for vulnerability scanning and compressing cyber response cycles from days to hours, building agentic tier-one and tier-two capabilities to match the accelerated threat landscape that frontier models like Mythos have exposed.

🎙️ GUEST: Sears Merritt | Head of Enterprise Technology & Experience, MassMutual
🎙️ HOSTS: Matt Marshall | VentureBeat, Sam Witteveen | VentureBeat

00:00 Intro & COBOL Modernization Preview
00:01:15 Guest Introduction: Sears Merritt, MassMutual CIO
00:02:00 Multi-Vendor Strategy & Avoiding Lock-In
00:02:30 How MassMutual Evaluates AI Tools (Cost vs. Experience Rubric)
00:03:15 12-Month Contracts and Switching Optionality
00:03:30 AI Standards Cycle: MCP, A2A, and the Early Internet Analogy
00:05:15 Measuring Developer Productivity: 30% SDLC Boost
00:06:15 Contact Center Results: 10 Minutes to 1 Minute, Dollars to Cents
00:07:00 Managing Token Cost Explosion
00:07:30 Seat-Based vs. Consumption Licensing Decision
00:08:45 Token Maxing While the All-You-Can-Eat Window Is Open
00:09:45 Building FinOps Infrastructure for Model Routing and Optimization
00:11:15 Outcome-First Model Selection: When to Pay for Opus vs. a Cheaper LLM
00:13:15 Trust Score Framework: How MassMutual Picks the Right Model
00:15:00 Sponsor: OutShift by Cisco
00:15:30 Claude, OpenAI Codex, and Multi-Harness Agentic Architecture
00:16:30 API Gateway Design: Identity, Access, and FinOps Controls
00:17:30 What the Usage Analytics Revealed (And What Merritt Was Afraid to Find)
00:18:15 Projected Token Cost Increase: 20–30% Off Unlimited Plan
00:19:45 Power Law Usage: Top 10% Consuming 80% of Tokens
00:20:45 COBOL Mainframe Modernization: The 7-Day Prototype Workflow
00:22:30 The Full AI-Assisted COBOL Migration Playbook
00:24:00 Implications for IBM and Mainframe-as-a-Service Providers
00:25:15 Open Source Models, DeepSeek, and the Cost Efficiency Question
00:27:45 Chinese Models in a Regulated Environment: Evaluation Criteria
00:29:30 Agentic Security: Identity Management and the Evolving Threat Landscape
00:30:00 How Frontier Models Changed the Threat Velocity (Not the Threat Types)
00:31:15 Fighting AI With AI: Agentic Tier-1 and Tier-2 Cyber Capabilities
00:31:30 Project Glasswing and the CISO Community Response
00:32:30 Embedding AI Into the SDLC for Security Scanning
00:34:00 Closing: When Will Agentic Standards Consolidate? Advice for Builders

#EnterpriseAI #AIAgents #LLMInfrastructure #AIDeployment #MLOps

May 27, 2026

Building a 30% Better AI: The Taste Graph Moat

Pinterest's open-source AI stack costs 90% less than frontier models, and their custom-trained recommender outperforms off-the-shelf alternatives by 30% in accuracy. Pinterest CTO Matt Madrigal breaks down exactly how they did it, and what enterprise AI teams can actually replicate.

Madrigal walks through the full architecture behind Navigator 1, Pinterest's conversational shopping assistant built on Qwen 3 VL, and the specific decision to rip out its native vision encoder and replace it with PinCLIP, Pinterest's proprietary multimodal embedding layer. That swap alone closes a 20x inference latency gap and makes the economics work at 620 million monthly active users. This is the clearest public explanation yet of how a scaled platform operationalizes the "core vs. context" principle for model selection: open-source and custom-built where it touches the user, frontier models where speed-to-prototype matters more than cost.

The conversation also covers the Taste Graph, Pinterest's knowledge graph across hundreds of billions of pins and 15 billion boards, and how post-training on that proprietary data lets a smaller, fit-for-purpose model beat a larger frontier model on production metrics. Madrigal details their eval framework: gold set benchmarks, product-level evals tied to engagement and merchant click outcomes, and a structured A/B test pipeline that runs from engineer PRs through to live user signal.

On the organizational side: how Pinterest manages a "default yes" multi-IDE policy (Cursor, Windsurf, Claude Code, Codex) without collapsing security posture, how they segment sandbox environments between ML engineers with Taste Graph access and general application developers, and why Madrigal measures AI coding ROI in token usage and experimentation velocity, not lines of code.

🎙️ GUEST: Matt Madrigal | CTO, Pinterest
🎙️ HOSTS: Matt Marshall | VentureBeat, Sam Witteveen | VentureBeat

00:00 Show Intro and Guest
01:17 Open Source Cost Breakdown
02:20 Pinterest Multimodal Roots
02:37 PinCLIP and Embeddings
05:46 Core vs Context Models
07:43 Navigator 1 Assistant Stack
11:52 Benchmarking and Evals
13:29 Accuracy from Proprietary Data
17:16 Taste Graph Explained
18:29 Taste Graph in Training
22:22 Fighting AI Slop
25:16 Developer Tools and Velocity
27:57 Tool Choice and Governance
28:56 Security Sandboxes and CICD
30:57 Wrap Up

#EnterpriseAI #OpenSourceAI #AIInfrastructure #LLM #MachineLearning

Frequently asked questions

What is VentureBeat?

VentureBeat's podcast feed carries “Beyond the Pilot,” a show in which top business execs share what actually happens after the AI proof of concept, from infrastructure and org design to wins, failures, and ROI. The focus is not theory but deep dives into how they scaled AI that works.

How often does this podcast release new episodes?

This podcast updates daily.

Where can I listen to this podcast?

This podcast is available on 4 platforms including Apple Podcasts, Spotify, and more. You can also use the RSS feed directly.

Does this podcast accept guests?

Yes, this podcast regularly features guests.
