Latent Space: The AI Engineer Podcast

swyx + Alessio
Available episodes

5 of 158
  • ⚡️ Ship AI recap: Agents, Workflows, and Python — w/ Vercel CTO Malte Ubl
    In this conversation with Malte Ubl, CTO of Vercel (http://x.com/cramforce), we explore how the company is pioneering the infrastructure for AI-powered development through their comprehensive suite of tools including workflows, AI SDK, and the newly announced agent ecosystem. Malte shares insights into Vercel's philosophy of "dogfooding" - never shipping abstractions they haven't battle-tested themselves - which led to extracting their AI SDK from v0 and building production agents that handle everything from anomaly detection to lead qualification. The discussion dives deep into Vercel's new Workflow Development Kit, which brings durable execution patterns to serverless functions, allowing developers to write code that can pause, resume, and wait indefinitely without cost. Malte explains how this enables complex agent orchestration with human-in-the-loop approvals through simple webhook patterns, making it dramatically easier to build reliable AI applications. We explore Vercel's strategic approach to AI agents, including their DevOps agent that automatically investigates production anomalies by querying observability data and analyzing logs - solving the recall-precision problem that plagues traditional alerting systems. Malte candidly discusses where agents excel today (meeting notes, UI changes, lead qualification) versus where they fall short, emphasizing the importance of finding the "sweet spot" by asking employees what they hate most about their jobs. The conversation also covers Vercel's significant investment in Python support, bringing zero-config deployment to Flask and FastAPI applications, and their vision for security in an AI-coded world where developers "cannot be trusted." Malte shares his perspective on how CTOs must transform their companies for the AI era while staying true to their core competencies, and why maintaining strong IC (individual contributor) career paths is crucial as AI changes the nature of software development. 
    What was launched at Ship AI 2025:

    AI SDK 6.0 & Agent Architecture
      • Agent Abstraction Philosophy: AI SDK 6 introduces an agent abstraction where you can "define once, deploy everywhere". How does this differ from existing agent frameworks like LangChain or AutoGPT? What specific pain points did you observe in production that led to this design?
      • Human-in-the-Loop at Scale: The tool approval system with needsApproval: true gates actions until human confirmation. How do you envision this working at scale for companies with thousands of agent executions? What's the queue management and escalation strategy?
      • Type Safety Across Models: AI SDK 6 promises "end-to-end type safety across models and UI". Given that different LLMs have varying capabilities and output formats, how do you maintain type guarantees when swapping between providers like OpenAI, Anthropic, or Mistral?

    Workflow Development Kit (WDK)
      • Durability as Code: The "use workflow" primitive makes any TypeScript function durable with automatic retries, progress persistence, and observability. What's happening under the hood? Are you using event sourcing, checkpoint/restart, or a different pattern?
      • Infrastructure Provisioning: Vercel automatically detects when a function is durable and dynamically provisions infrastructure in real-time. What signals are you detecting in the code, and how do you determine the optimal infrastructure configuration (queue sizes, retry policies, timeout values)?

    Vercel Agent (beta)
      • Code Review Validation: The Agent reviews code and proposes "validated patches". What does "validated" mean in this context? Are you running automated tests, static analysis, or something more sophisticated?
      • AI Investigations: Vercel Agent automatically opens AI investigations when it detects performance or error spikes using real production data. What data sources does it have access to? How does it distinguish between normal variance and actual anomalies?

    Python Support
      • For the first time, Vercel now supports Python backends natively.

    Marketplace & Agent Ecosystem
      • Agent Network Effects: The Marketplace now offers agents like CodeRabbit, Corridor, Sourcery, and integrations with Autonoma, Braintrust, Browser Use. How do you ensure these third-party agents can't access sensitive customer data? What's the security model?

    "An Agent on Every Desk" Program
      • Vercel launched a new program to help companies identify high-value use cases and build their first production AI agents. It provides consultations, reference templates, and hands-on support to go from idea to deployed agent.
      • Two open-source agent templates were shared: a Lead Qualification Agent (built with Next.js, Vercel AI SDK, Workflows, Slack) that scrapes lead data and prioritizes prospects, and a Data Analyst Agent that links Slack to SQL for natural-language data queries. By seeding these templates and guides, Vercel is strategically lowering the barrier for organizations to adopt agents internally.
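
    The durable-execution idea discussed in this episode (code that can pause, resume, and wait for a human approval) can be illustrated with a small checkpoint-and-replay sketch. This is not Vercel's Workflow Development Kit or AI SDK API, and the episode notes do not say which pattern Vercel uses internally. Every name below (DurableRun, step, wait_for_approval, qualify_lead, invoke) is hypothetical. The sketch assumes that each completed step's result is saved to a log, and that a resumed invocation re-runs the function from the top while reading finished steps back from that log.

```python
import json
from typing import Any, Callable


class Suspended(Exception):
    """Raised when the workflow must pause until an external event arrives."""


class DurableRun:
    """Hypothetical helper: replays finished steps from a log instead of re-running them."""

    def __init__(self, log: dict[str, Any]):
        self.log = log                   # step name -> JSON-serialisable result
        self.executed: list[str] = []    # steps actually executed in this invocation

    def step(self, name: str, fn: Callable[[], Any]) -> Any:
        if name in self.log:             # finished in an earlier invocation: replay
            return self.log[name]
        result = fn()
        self.log[name] = result          # checkpoint the result before moving on
        self.executed.append(name)
        return result

    def wait_for_approval(self, name: str) -> bool:
        if name not in self.log:         # no decision recorded yet: pause here
            raise Suspended(name)
        return bool(self.log[name])


def qualify_lead(run: DurableRun, email: str) -> str:
    """Toy workflow: research a lead, draft a message, wait for approval, send."""
    profile = run.step("research", lambda: {"email": email, "score": 87})
    draft = run.step("draft", lambda: f"Hi {profile['email']}, ...")
    if not run.wait_for_approval("approve_send"):
        return "rejected"
    return run.step("send", lambda: f"sent: {draft}")


def invoke(log_json: str, email: str) -> tuple[str, str, list[str]]:
    """One short-lived invocation: load the log, run the workflow, return the new log."""
    run = DurableRun(json.loads(log_json))
    try:
        status = qualify_lead(run, email)
    except Suspended as pause:
        status = f"suspended at {pause}"
    return status, json.dumps(run.log), run.executed


if __name__ == "__main__":
    status, log, ran = invoke("{}", "a@example.com")
    print(status, ran)                   # pauses before sending

    state = json.loads(log)
    state["approve_send"] = True         # e.g. written by a webhook handler
    print(invoke(json.dumps(state), "a@example.com")[0])
```

    Nothing runs while the workflow is suspended: the only state is the serialised log, which is why waiting can be cheap in this model. The trade-off is that the workflow body must be deterministic outside of step calls, since it is re-executed from the top on every resume.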
    --------  
  • The Agents Economy Backbone - with Emily Glassberg Sands, Head of Data & AI at Stripe
    Emily Glassberg Sands is the Head of Data & AI at Stripe where she leads the organization’s efforts to build financial infrastructure for the internet & leverage AI to power Stripe’s products. Stripe processes about $1.4 trillion in payments annually (~1.3% of global GDP), making it an exciting opportunity to apply AI & ML at scale. In this episode, Emily shares insights into how Stripe is using AI to solve complex problems like fraud detection, optimizing checkout experiences, & enabling new business models for AI companies. Emily also shares her economist perspective on market efficiency & how Stripe’s focus on building economic infrastructure for AI is driving growth across the ecosystem.

    We discuss:
      • Stripe’s domain-specific foundation model and “payments embeddings” that run inline on the charge path to detect sophisticated card-testing at scale (improved detection rates at large users from ~59% to ~97%).
      • The launch of the Agentic Commerce Protocol (ACP) with OpenAI, creating a shared standard for how businesses can expose products to AI agents, which is used by Walmart and Sam’s Club.
      • How Stripe is helping AI companies manage new fraud vectors, such as free trial and refund abuse, and the importance of real-time, outcome-based billing.
      • The impact of AI on Stripe’s internal operations, including the use of LLMs for code generation, merchant understanding, and internal tooling.
      • Why many AI companies are going global from day one, and how Stripe’s Link network (200M+ consumers) concentrates AI demand.
      • Whether we're in an AI bubble, why GDP hasn't reflected AI productivity gains yet, and how agentic commerce could expand consumption by removing time constraints for high-income consumers.
      • Emily’s perspective on the changing social contract around AI, the importance of deep thinking, and the role of brand and design in AI-driven products.

    Where to find Emily Sands
      • X: https://x.com/emilygsands
      • LinkedIn: https://www.linkedin.com/in/egsands/

    Where to find Shawn Wang
      • X: https://x.com/swyx
      • LinkedIn: https://www.linkedin.com/in/shawnswyxwang/

    Where to find Alessio Fanelli
      • X: https://x.com/FanaHOVA
      • LinkedIn: https://www.linkedin.com/in/fanahova/

    Where to find Latent Space
      • X: https://x.com/latentspacepod
      • Substack: https://www.latent.space/

    Chapters
      00:00:00 Introduction and Emily's Role at Stripe
      00:09:55 AI Business Models and Fraud Challenges
      00:13:49 Extending Radar for AI Economy
      00:16:42 Payment Innovation: Token Billing and Stablecoins
      00:23:09 Agentic Commerce Protocol Launch
      00:29:40 Good Bots vs Bad Bots in AI
      00:40:31 Designing the Agents Commerce Protocol
      00:49:32 Internal AI Adoption at Stripe
      01:04:53 Data Discovery and Text-to-SQL Challenges
      01:21:00 AI Economy Analysis: Bubble or Boom?
    --------  
  • Why RL Won — Kyle Corbitt, OpenPipe (acq. CoreWeave)
    In this deep dive with Kyle Corbitt, co-founder and CEO of OpenPipe (recently acquired by CoreWeave), we explore the evolution of fine-tuning in the age of AI agents and the critical shift from supervised fine-tuning to reinforcement learning. Kyle shares his journey from leading YC's Startup School to building OpenPipe, initially focused on distilling expensive GPT-4 workflows into smaller, cheaper models before pivoting to RL-based agent training as frontier model prices plummeted. The conversation reveals why 90% of AI projects remain stuck in proof-of-concept purgatory - not due to capability limitations, but reliability issues that Kyle believes can be solved through continuous learning from real-world experience. He discusses the breakthrough of RULER (Relative Universal LLM-Elicited Rewards), which uses LLMs as judges to rank agent behaviors relatively rather than absolutely, making RL training accessible without complex reward engineering. Kyle candidly assesses the challenges of building realistic training environments for agents, explaining why GRPO (despite its advantages) may be a dead end due to its requirement for perfectly reproducible parallel rollouts. He shares insights on why LoRAs remain underrated for production deployments, why GEPA and prompt optimization haven't lived up to the hype in his testing, and why the hardest part of deploying agents isn't the AI - it's sandboxing real-world systems with all their bugs and edge cases intact. The discussion also covers OpenPipe's acquisition by CoreWeave, the launch of their serverless reinforcement learning platform, and Kyle's vision for a future where every deployed agent continuously learns from production experience. He predicts that solving the reliability problem through continuous RL could unlock 10x more AI inference demand from projects currently stuck in development, fundamentally changing how we think about agent deployment and maintenance.
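
    The description above says RULER has an LLM judge rank agent behaviors relatively rather than absolutely. One reason a relative signal can be enough is that group-normalised policy-gradient methods such as GRPO only use how each rollout compares with the others attempted for the same task. The sketch below illustrates that point and is not OpenPipe's implementation: toy_judge is a hypothetical stand-in for an LLM judge call, and the mean/standard-deviation normalisation is an assumption about the training setup.

```python
from statistics import mean, pstdev
from typing import Callable, Sequence

Judge = Callable[[Sequence[str]], list[float]]


def relative_advantages(scores: Sequence[float]) -> list[float]:
    """Centre and scale judge scores within one group of rollouts for the same task."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:                       # every rollout judged equal: no learning signal
        return [0.0] * len(scores)
    return [(s - mu) / sigma for s in scores]


def score_group(rollouts: Sequence[str], judge: Judge) -> list[float]:
    """Ask the judge to score the whole group at once, then keep only relative information."""
    return relative_advantages(judge(rollouts))


def toy_judge(rollouts: Sequence[str]) -> list[float]:
    """Hypothetical stand-in for an LLM judge: counts occurrences of 'ok' in each rollout."""
    return [float(r.count("ok")) for r in rollouts]


if __name__ == "__main__":
    group = ["fail", "ok", "ok ok"]
    print(score_group(group, toy_judge))

    # A judge that is uniformly more generous produces the same advantages,
    # because only differences within the group survive the normalisation.
    generous: Judge = lambda rs: [s + 10.0 for s in toy_judge(rs)]
    print(score_group(group, generous))
```

    Because a constant offset in the judge's scores cancels out, the judge does not need a calibrated absolute scale; it only needs to order the rollouts in a group sensibly.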
    Key Topics:
      • The rise and fall of fine-tuning as a business model
      • Why 90% of AI projects never reach production
      • RULER: Making RL accessible through relative ranking
      • The environment problem: Why sandboxing is harder than training
      • GRPO vs PPO and the future of RL algorithms
      • LoRAs: The underrated deployment optimization
      • Why GEPA and prompt optimization disappointed in practice
      • Building world models as synthetic training environments
      • The $500B Stargate bet and OpenAI's potential crypto play
      • Continuous learning as the path to reliable agents

    References
      • https://www.linkedin.com/in/kcorbitt/
      • Aug 2023: https://openpipe.ai/blog/from-prompts-to-models
      • Dec 2023: https://openpipe.ai/blog/mistral-7b-fine-tune-optimized
      • Jan 2024: https://openpipe.ai/blog/s-lora
      • May 2024: https://openpipe.ai/blog/the-ten-commandments-of-fine-tuning-in-prod https://www.youtube.com/watch?v=-hYqt8M9u_M
      • Oct 2024: https://openpipe.ai/blog/announcing-dpo-support
      • AIE NYC 2025, Finetuning 500m agents: https://www.youtube.com/watch?v=zM9RYqCcioM&t=919s
      • AIEWF 2025, How to train your agent (ART-E): https://www.youtube.com/watch?v=gEDl9C8s_-4&t=216s
      • Sept 2025, Acquisition: https://openpipe.ai/blog/openpipe-coreweave
      • W&B Serverless RL: https://openpipe.ai/blog/serverless-rl?refresh=1760042248153
    --------  
  • DevDay 2025: Apps SDK, Agent Kit, MCP, Codex and why Prompting is More Important than Ever
    At OpenAI DevDay, we sit down with Sherwin Wu and Christina Huang from the OpenAI Platform Team to discuss the launch of AgentKit - a comprehensive suite of tools for building, deploying, and optimizing AI agents. Christina walks us through the live demo she performed on stage, building a customer support agent in just 8 minutes using the visual Agent Builder, while Sherwin shares insights on how OpenAI is inverting the traditional website-chatbot paradigm by embedding apps directly within ChatGPT through the new Apps SDK. The conversation explores how OpenAI is tackling the challenges developers face when taking agents to production - from writing and optimizing prompts to building evaluation pipelines. They discuss the decision to adopt Anthropic's MCP (Model Context Protocol) for tool connectivity, the importance of visual workflows for complex agent systems, and how features like human-in-the-loop approvals and automated prompt optimization are making agent development more accessible to a broader range of developers. Sherwin and Christina also reveal how OpenAI is dogfooding these tools internally, with their own customer support at openai.com already powered by AgentKit, and share candid insights about the evolution from plugins to GPTs to this new agent platform. They discuss the surprising persistence of prompting as a critical skill (contrary to predictions from two years ago), the challenges of serving custom fine-tuned models at scale, and why they believe visual agent builders are essential as workflows grow to span dozens of nodes.

    Guests:
      • Sherwin Wu: Head of Engineering, OpenAI Platform https://www.linkedin.com/in/sherwinwu1/ https://x.com/sherwinwu?lang=en
      • Christina Huang: Platform Experience, OpenAI https://x.com/christinaahuang https://www.linkedin.com/in/christinaahuang/

    Thanks very much to Lindsay and Shaokyi for helping us set up this great deepdive into the new DevDay launches!
    Key Topics:
      • AgentKit launch: Agent SDK, Builder, Evals, and deployment tools
      • Apps SDK and the inversion of the app-chatbot paradigm
      • Adopting MCP protocol for universal tool connectivity
      • Visual agent building vs code-first approaches
      • Human-in-the-loop workflows and approval systems
      • Automated prompt optimization and "zero-gradient fine-tuning"
      • Service Health Dashboard and achieving five nines reliability
      • ChatKit as an embeddable, evergreen chat interface
      • The evolution from plugins to GPTs to agent platforms
      • Internal dogfooding with Codex and agent-powered support
    --------  

About Latent Space: The AI Engineer Podcast

The podcast by and for AI Engineers! In 2024, over 2 million readers and listeners came to Latent Space to hear about news, papers and interviews in Software 3.0. We cover Foundation Models changing every domain in Code Generation, Multimodality, AI Agents, GPU Infra and more, directly from the founders, builders, and thinkers involved in pushing the cutting edge. We strive to give you everything from the definitive take on the Current Thing down to the first introduction to the tech you'll be using in the next 3 months! We bring you breaking news and exclusive interviews from OpenAI, Anthropic, Gemini, Meta (Soumith Chintala), Sierra (Bret Taylor), tiny (George Hotz), Databricks/MosaicML (Jon Frankle), Modular (Chris Lattner), Answer.ai (Jeremy Howard), et al. Full show notes always on https://latent.space