AI

Artificial intelligence, machine learning, and everything LLM

871 episodes · Page 11 of 44

#2251: Agent-to-Agent Protocols: What Actually Needs Standardizing

When autonomous agents call other agents, what does a working protocol actually require? Exploring session handling, state management, security, an...

ai-agents, api-integration, security

#2250: Where AI Safety Researchers Actually Work

Vendor labs, independent research orgs, government agencies—the AI safety field is messier and more diverse than most people realize. A map of wher...

ai-safety, ai-alignment, anthropic

#2249: Building Custom Benchmarks for Agentic Systems

Public benchmarks fail for agentic systems. Learn how to build evaluation frameworks that actually predict production behavior.

ai-agents, benchmarks, ai-inference

#2246: Constitutional AI: Anthropic's Theory of Safe Scaling

How Anthropic's Constitutional AI replaces human raters with AI self-critique guided by explicit principles—and what it assumes about the future of...

anthropic, ai-safety, ai-alignment

#2243: What Enterprise AI Pricing Actually Negotiates

Enterprise customers rarely get the deep discounts they expect from AI APIs. What they actually negotiate for—and why the ramp-up requirement exist...

large-language-models, ai-inference, enterprise-hardware

#2242: AI as Your Ideation Blind Spot Spotter

How to use AI not to answer questions you already know to ask, but to surface possibilities your expertise has made invisible to you.

prompt-engineering, large-language-models, ai-agents

#2241: When More Frameworks Make Worse Decisions

Benjamin Franklin's 250-year-old pro/con list still dominates how we decide—but research shows it's riddled with bias. We map five frameworks that ...

human-factors, productivity, ai-reasoning

#2239: How AI Benchmarks Became Broken (And What's Replacing Them)

The tests we use to measure AI progress are contaminated, saturated, and gamed. Here's what's actually working.

benchmarks, training-data, ai-reasoning

#2233: Who Actually Wants AI to Slow Down?

Daniel argues AI development should slow down for expertise and stability. But who in the industry actually shares this philosophy beyond the obvio...

ai-safety, ai-alignment, large-language-models

#2228: Tuning RAG: When Retrieval Helps vs. Hurts

How do you prevent retrieval from suppressing a model's reasoning? We diagnose our own pipeline's four control levers and multi-source fusion strat...

rag, ai-agents, prompt-engineering

#2224: Why AI Can't Crack the Voynich Manuscript

A fifteenth-century text has defeated cryptanalysts, linguists, and AI models alike. What does its resistance tell us about language, encoding, and...

cryptography, linguistics, ai-reasoning

#2221: What Podcasts Should You Actually Listen To?

Two AI hosts curate 12 podcasts for curious minds—and ask whether an AI can actually have taste in the first place.

conversational-ai, content-provenance, ai-memory

#2219: Spec-Driven Life: How AI Planning Beats Project Paralysis

What makes AI agents reliably productive? A structured spec that externalizes memory and chunks work into manageable pieces. Can the same framework...

claude-code, prompt-engineering, productivity

#2214: Real-Time News at War Speed: Building AI Pipelines for Breaking Conflict

When a conflict changes hourly, AI systems built for yesterday's information fail. Here's how to architect pipelines that actually keep up.

large-language-models, ai-inference, rag

#2213: Grading the News: Benchmarking RAG Search Tools

How do you rigorously evaluate whether Tavily or Exa retrieves better results for breaking news? A formal benchmark beats the vibe check.

rag, benchmarks, hallucinations

#2208: Building Memory for AI Characters That Actually Evolve

How do AI hosts develop real consistency across episodes? Corn and Herman explore retrieval-augmented memory systems that let AI characters genuine...

ai-memory, rag, conversational-ai

#2207: Specs First, Code Second: Inside Agentic AI's New Era

As AI coding agents evolve from autocomplete to autonomous cloud workers, the bottleneck has shifted—now it's about how clearly you specify what ne...

ai-agents, prompt-engineering, software-development

#2206: What Actually Works in AI Memory

Most AI memory systems are just vector databases with similarity search. We break down what mem0, Zep, and Letta are actually doing—and why benchma...

ai-memory, vector-databases, knowledge-graphs

#2205: When AI Coding Agents Forget: Five Approaches to Context Rot

As coding agents handle longer sessions, they accumulate noise and lose crucial information. Five competing frameworks are solving this differently...

ai-agents, context-window, ai-memory

#2204: Memory Without RAG: The Real Architecture

mem0, Letta, Zep, and LangMem solve agent memory differently from RAG. Here's what's actually happening under the hood.

ai-agents, ai-memory, rag