#rag

81 episodes

#3751: Source-Restricted vs. Open Retrieval: How to Lock Down Your LLM

When should an LLM be locked to specific documents, and when should it search the web? A practical framework for grounding decisions.

rag, ai-safety, legal-technology

#3120: What Makes Agentic Search Tools Like Exa Actually Work?

Why swapping Google for Exa transformed our show's accuracy — and what agentic search does differently.

ai-agents, rag, search

#2705: Your Brain Isn't a Hard Drive — What Actually Fits

Long-term memory isn't storage — it's a generative model. Here's where the brain/computer analogy actually holds up.

neuroscience, rag, generative-ai

#2682: Live Retrieval vs. RAG: What an Agent Actually Does

Does every AI conversation create a tiny vector store? We unpack the real tradeoffs between live document fetching and pre-indexed RAG.

rag, ai-agents, vector-databases

#2676: Vector Database Schema Design for AI Memory Layers

Stop dumping vectors blindly. Design metadata schemas and namespaces for retrieval that actually works at scale.

vector-databases, rag, ai-memory

#2673: The Embedding Coupling Problem: Editing Vector Stores

Can you edit or delete individual chunks in Pinecone? And can you actually back up a vector index? Yes—but with critical caveats.

vector-databases, rag, ai-agents

#2664: Can You Trust an LLM's Raw Knowledge?

Why pre-trained knowledge isn't reliable for facts — and what actually makes models useful.

large-language-models, fine-tuning, rag

#2639: The Hidden Layer That Makes Search Work

Why your search results miss the mark — and how cross-encoders fix it.

rag, search, information-retrieval

#2638: How to Build Disposable AI Agents at Runtime

Create ephemeral AI agents that answer questions about specific items, then vanish. No persistent configuration needed.

ai-agents, context-window, rag

#2469: Embedding Model Deprecation: RAG's Silent Killer

When OpenAI retires an embedding model, your RAG pipeline breaks silently. Here’s how to fix it.

rag, model-context-protocol, vector-databases

#2466: The Hidden Trap of Embedding Model Lock-In

What happens when your vector database works great — until your embedding model gets deprecated and your vectors become useless.

rag, open-source, embedding-models

#2315: How to Update AI Models Without Starting Over

Exploring the challenge of updating AI models with new knowledge without costly full retraining.

ai-training, fine-tuning, rag

#2228: Tuning RAG: When Retrieval Helps vs. Hurts

How do you prevent retrieval from suppressing a model's reasoning? We diagnose our own pipeline's four control levers and multi-source fusion strat...

rag, ai-agents, prompt-engineering

#2214: The Three Failure Modes of AI News Systems

When a conflict changes hourly, AI systems built for yesterday's information fail. Here's how to architect pipelines that actually keep up.

large-language-models, ai-inference, rag

#2213: When Ground Truth Moves Hourly

How do you rigorously evaluate whether Tavily or Exa retrieves better results for breaking news? A formal benchmark beats the vibe check.

rag, benchmarks, hallucinations

#2208: Building Memory for AI Characters That Actually Evolve

How do AI hosts develop real consistency across episodes? Corn and Herman explore retrieval-augmented memory systems that let AI characters genuine...

ai-memory, rag, conversational-ai

#2204: Memory Without RAG: The Real Architecture

mem0, Letta, Zep, and LangMem solve agent memory differently than RAG. Here's what's actually happening under the hood.

ai-agents, ai-memory, rag

#2203: Knowledge Without Tools: Why MCPs Aren't Just for Execution

MCPs can be pure knowledge providers with zero tools. Here's why that matters for agents querying government data and authoritative sources.

model-context-protocol, knowledge-graphs, rag

#2181: When RAG Becomes an Agent

RAG in chatbots is simple retrieval. RAG in agents is a multi-step decision loop. Here's what actually changes.

rag, ai-agents, ai-orchestration

#2133: Engineering Geopolitical Personas: Beyond Caricatures

How to build LLMs that simulate state actors with strategic fidelity, not just surface mimicry.

ai-agents, prompt-engineering, rag

#2129: Shifting Left on Hallucinations

Stop hoping your AI doesn't lie. We explore the shift to deterministic guardrails, specialized judge models, and the tools making agents reliable.

ai-agents, hallucinations, rag

#2125: Why Agentic Chunking Beats One-Shot Generation

A single prompt can't write a 30-minute script. Here’s the agentic chunking method that fixes coherence.

ai-agents, prompt-engineering, rag

#2069: The Vibe Coding Trap: Why Your Agent Skills Keep Breaking

Stop guessing at the agentskills.io spec. Learn the exact YAML fields, directory structure, and authoring patterns to make Claude Code skills that ...

ai-agents, prompt-engineering, rag

#2057: How Agents Break Through the LLM Output Ceiling

The output window is the new bottleneck: why massive context doesn't solve long-form generation.

ai-agents, context-window, rag