Latent Space: The AI Engineer Podcast

Episodios disponibles

5 de 143

AI is Eating Search
ChatGPT handles 2.5B prompts/day and is on track to match Google's daily searches by end of 2026. AI agents don't browse like us—they crave queryable, chunkable data for tools like ChatGPT & Perplexity. A new industry is being born, some are calling it AI SEO, others GEO, but what is clear is that it drives amazing results. Businesses are seeing 2-4x higher conversion from visitors coming from AI compared to traditional search. Robert McCloy is the co-founder of Scrunch AI (https://scrunchai.com/), a fast growing company that helps brands and businesses re-write their content on the fly based on what agents are looking for. Chapters 00:00:00 Intro & Guest Introduction 00:01:30 The Genesis of Scrunch AI & AI Search Impact 00:06:02 AI Search Engines vs. Traditional SEO 00:06:28 Monitoring Prompts & The AI Search Stack 00:08:26 AI Training Data, Crawlers, and Content Strategy 00:12:33 AI Browsers and the Future of Web Consumption 00:16:06 Technical Mechanisms of AI Search & SEO Relevance 00:28:44 Personalization, Agent Experience, and Customer Journeys 00:30:44 Prompt Clusters, User Intent, and B2B Buying Patterns 00:36:06 Optimization Tactics: Prompt Injection, Content, and Pitfalls 00:40:37 Technical Content Delivery: JavaScript, Programmatic SEO, and LMS.txt 00:47:31 Case Studies & Conversion Optimization 00:51:36 Market Share & Platform Trends in AI Search 00:55:10 Wrap-Up & Future of AI-Driven Web
--------
56:21
--------
56:21
The Future of Notebooks - with Akshay Agrawal of Marimo
Akshay Agrawal joins us to talk about Marimo and their vision for the future of Python notebooks, and how it’s the perfect canvas for AI-driven data analysis. 0:00 Introduction 0:46 Overview of Marimo and Its Features 2:33 Origin Story and Motivation Behind Marimo 4:26 Demo: Classical Machine Learning with MNIST in Marimo 6:52 Notebook Compatibility and Conversion from Jupyter 7:42 Demo: Interactive Notebook with Custom UI and Layout 10:08 AI-Native Utilities and Code Generation with Language Models 11:36 Dependency Management and Integration with UV Package Manager 13:00 Demo: Data Annotation Workflow Using a PS5 Controller 15:51 Starting from Scratch: Blank Canvas AI Use Cases 18:27 Context Formatting for AI Code Generation 19:54 Chat Interface and Local/Remote Model Support 21:01 WebAssembly Support and MoLab Cloud-Hosted Notebooks 23:21 Future Plans and Breaking Out of Old Notebook Habits 25:40 Running Marimo Notebooks as Scripts or Data Apps 26:44 Exploring AI Agents and Community Contributions 26:56 Call to Action: How to Get Started and Contribute
--------
--------
Cline: the open source coding agent that doesn't cut costs
Saoud Rizwan and Pash from Cline joined us to talk about why fast apply models got bitter lesson'd, how they pioneered the plan + act paradigm for coding, and why non-technical people use IDEs to do marketing and generate slides. Full writeup: https://www.latent.space/p/cline X: https://x.com/latentspacepod Chapters: 00:00 - Introductions 01:35 - Plan and Act Paradigm 05:37 - Model Evaluation and Early Development of Cline 08:14 - Use Cases of Cline Beyond Coding 09:09 - Why Cline is a VS Code Extension and Not a Fork 12:07 - Economic Value of Programming Agents 16:07 - Early Adoption for MCPs 19:35 - Local vs Remote MCP Servers 22:10 - Anthropic's Role in MCP Registry 22:49 - Most Popular MCPs and Their Use Cases 25:26 - Challenges and Future of MCP Monetization 27:32 - Security and Trust Issues with MCPs 28:56 - Alternative History Without MCP 29:43 - Market Positioning of Coding Agents and IDE Integration Matrix 32:57 - Visibility and Autonomy in Coding Agents 35:21 - Evolving Definition of Complexity in Programming Tasks 38:16 - Forks of Cline and Open Source Regrets 40:07 - Simplicity vs Complexity in Agent Design 46:33 - How Fast Apply Got Bitter Lesson'd 49:12 - Cline's Business Model and Bring-Your-Own-API-Key Approach 54:18 - Integration with OpenRouter and Enterprise Infrastructure 55:32 - Impact of Declining Model Costs 57:48 - Background Agents and Multi-Agent Systems 1:00:42 - Vision and Multi-Modalities 1:01:07 - State of Context Engineering 1:07:37 - Memory Systems in Coding Agents 1:10:14 - Standardizing Rules Files Across Agent Tools 1:11:16 - Cline's Personality and Anthropomorphization 1:12:55 - Hiring at Cline and Team Culture Chapters 00:00:00 Introduction and Guest Intros 00:00:29 What is Klein? Product Overview 00:01:42 Plan and Act Paradigm 00:05:22 Model Evolution and Building Klein 00:07:40 Beyond Coding: Klein as a General Agent 00:09:12 Why Focus on VS Code Extension? 00:11:26 The Future of Programming and Agentic Paradigm 00:12:34 Economic Value: Programming vs. Other Use Cases 00:16:04 MCP Ecosystem: Growth and Marketplace 00:21:30 Security, Discoverability, and Trust in MCPs 00:22:55 Popular MCPs and Workflow Automation 00:25:30 Monetization and Payments for MCPs 00:37:53 Competition, Forks, and Open Source Philosophy 00:40:39 RAG, Fast Apply, and Agentic Simplicity 00:50:11 Business Model and Enterprise Adoption 00:57:04 Background Agents, Multi-Agent Systems, and CLI 01:00:41 Context Engineering and Memory 01:12:39 Team, Culture, and Closing Thoughts
--------
1:15:43
--------
1:15:43
Personalized AI Language Education — with Andrew Hsu, Speak
Speak (https://speak.com) may not be very well known to native English speakers, but they have come from a slow start in 2016 to emerge as one of the favorite partners of OpenAI, with their Startup Fund leading and joining their Series B and C as one of the new AI-native unicorns, noting that “Speak has the potential to revolutionize not just language learning, but education broadly”. Today we speak with Speak’s CTO, Andrew Hsu, on the journey of building the “3rd generation” of language learning software (with Rosetta Stone being Gen 1, and Duolingo being Gen 2). Speak’s premise is that speech and language models can now do what was previously only possible with human tutors—provide fluent, responsive, and adaptive instruction—and this belief has shaped its product and company strategy since its early days. https://www.linkedin.com/in/adhsu/ https://speak.com One of the most interesting strategic decisions discussed in the episode is Speak’s early focus on South Korea. While counterintuitive for a San Francisco-based startup, the decision was influenced by a combination of market opportunity and founder proximity via a Korean first employee. South Korea’s intense demand for English fluency and a highly competitive education market made it a proving ground for a deeply AI-native product. By succeeding in a market saturated with human-based education solutions, Speak validated its model and built strong product-market fit before expanding to other Asian markets and eventually, globally. The arrival of Whisper and GPT-based LLMs in 2022 marked a turning point for Speak. Suddenly, capabilities that were once theoretical—real-time feedback, semantic understanding, conversational memory—became technically feasible. Speak didn’t pivot, but rather evolved into its second phase: from a supplemental practice tool to a full-featured language tutor. This transition required significant engineering work, including building custom ASR models, managing latency, and integrating real-time APIs for interactive lessons. It also unlocked the possibility of developing voice-first, immersive roleplay experiences and a roadmap to real-time conversational fluency. To scale globally and support many languages, Speak is investing heavily in AI-generated curriculum and content. Instead of manually scripting all lessons, they are building agents and pipelines that can scaffold curriculum, generate lesson content, and adapt pedagogically to the learner. This ties into one of Speak’s most ambitious goals: creating a knowledge graph that captures what a learner knows and can do in a target language, and then adapting the course path accordingly. This level-adjusting tutor model aims to personalize learning at scale and could eventually be applied beyond language learning to any educational domain. Finally, the conversation touches on the broader implications of AI-powered education and the slow real-world adoption of transformative AI technologies. Despite the capabilities of GPT-4 and others, most people’s daily lives haven’t changed dramatically. Speak sees itself as part of the generation of startups that will translate AI’s raw power into tangible consumer value. The company is also a testament to long-term conviction—founded in 2016, it weathered years of slow growth before AI caught up to its vision. Now, with over $50M ARR, a growing B2B arm, and plans to expand across languages and learning domains, Speak represents what AI-native education could look like in the next decade. Chapters 00:00:00 Introductions & Thiel Fellowship Origins 00:02:13 Genesis of Speak: Early Vision & Market Focus 00:03:44 Building the Product: Iterations and Lessons Learned 00:10:59 AI’s Role in Language Learning 00:13:49 Scaling Globally & B2B Expansion 00:16:30 Why Korea? Localizing for Success 00:19:08 Content Creation, The Speak Method, and Engineering Culture 00:23:31 The Impact of Whisper and LLM Advances 00:29:08 AI-Generated Content & Measuring Fluency 00:35:30 Personalization, Dialects, and Pronunciation 00:39:38 Immersive Learning, Multimodality, and Real-Time Voice 00:50:02 Engineering Challenges & Company Culture 00:53:20 Beyond Languages: B2B, Knowledge Graphs, and Broader Learning 00:57:32 Fun Stories, Lessons, and Reflections 01:02:03 Final Thoughts: The Future of AI Learning & Slow Takeoff
--------
1:04:09
--------
1:04:09
AI Video Is Eating The World — Olivia and Justine Moore, a16z
When the first video diffusion models started emerging, they were little more than just “moving pictures” - still frames extended a few seconds in either direction in time. There was a ton of excitement about OpenAI’s Sora on release through 2024, but so far only Sora-lite has been widely released. Meanwhile, other good videogen models like Genmo Mochi, Pika, MiniMax T2V, Tencent Hunyuan Video, and Kuaishou’s Kling have emerged, but the reigning king this year seems to be Google’s Veo 3, which for the first time has added native audio generation into their model capabilities, eliminating the need for a whole class of lipsynching tooling and SFX editing. The rise of Veo 3 unlocks a whole new category of AI Video creators that many of our audience may not have been exposed to, but is undeniably effective and important particularly in the “kids” and “brainrot” segments of the global consumer internet platforms like Tiktok, YouTube and Instagram. By far the best documentarians of these trends for laypeople are Olivia and Justine Moore, both partners at a16z, who not only collate the best examples from all over the web, but dabble in video creation themselves to put theory into practice. We’ve been thinking of dabbling in AI brainrot on a secondary channel for Latent Space, so we wanted to get the braindump from the Moore twins on how to make a Latent Space Brainrot channel. Jump on in! Chapters 00:00:00 Introductions & Guest Welcome 00:00:49 The Rise of Generative Media 00:02:24 AI Video Trends: Italian Brain Rot & Viral Characters 00:05:00 Following Trends & Creating AI Content 00:07:17 Hands-On with AI Video Creation 00:18:36 Monetization & Business of AI Content 00:23:34 Platforms, Models, and the Creator Stack 00:37:22 Native Content vs. Clipping & Going Viral 00:41:52 Prompt Theory & Meta-Trends in AI Creativity 00:47:42 Professional, Commercial, and Platform-Specific AI Video 00:48:57 Wrap-Up & Final Thoughts
--------
49:27
--------
49:27

Más podcasts de Economía y empresa

Podcasts a la moda de Economía y empresa

Acerca de Latent Space: The AI Engineer Podcast

The podcast by and for AI Engineers! In 2024, over 2 million readers and listeners came to Latent Space to hear about news, papers and interviews in Software 3.0. We cover Foundation Models changing every domain in Code Generation, Multimodality, AI Agents, GPU Infra and more, directly from the founders, builders, and thinkers involved in pushing the cutting edge. Striving to give you both the definitive take on the Current Thing down to the first introduction to the tech you'll be using in the next 3 months! We break news and exclusive interviews from OpenAI, Anthropic, Gemini, Meta (Soumith Chintala), Sierra (Bret Taylor), tiny (George Hotz), Databricks/MosaicML (Jon Frankle), Modular (Chris Lattner), Answer.ai (Jeremy Howard), et al. Full show notes always on https://latent.space

Sitio web del podcast

Economía y empresa Tecnología Empresa

Escucha Latent Space: The AI Engineer Podcast, Hágale como quiera y muchos más podcasts de todo el mundo con la aplicación de radio.net

Descarga la app gratuita: radio.net

Añadir radios y podcasts a favoritos
Transmisión por Wi-Fi y Bluetooth
Carplay & Android Auto compatible
Muchas otras funciones de la app

Abrir app

Descarga la app gratuita: radio.net

Añadir radios y podcasts a favoritos
Transmisión por Wi-Fi y Bluetooth
Carplay & Android Auto compatible
Muchas otras funciones de la app

Latent Space: The AI Engineer Podcast

Escanea el código,
Descarga la app,
Escucha.