AI Applications

ai-memory, rag, conversational-ai

#2208: Building Memory for AI Characters That Actually Evolve

How do AI hosts develop real consistency across episodes? Corn and Herman explore retrieval-augmented memory systems that let AI characters genuine...

ai-agents, prompt-engineering, software-development

#2207: Specs First, Code Second: Inside Agentic AI's New Era

As AI coding agents evolve from autocomplete to autonomous cloud workers, the bottleneck has shifted—now it's about how clearly you specify what ne...

ai-agents, context-window, ai-memory

#2205: When AI Coding Agents Forget: Five Approaches to Context Rot

As coding agents handle longer sessions, they accumulate noise and lose crucial information. Five competing frameworks are solving this differently...

model-context-protocol, knowledge-graphs, rag

#2203: Knowledge Without Tools: Why MCPs Aren't Just for Execution

MCPs can be pure knowledge providers with zero tools. Here's why that matters for agents querying government data and authoritative sources.

ai-agents, ai-alignment, ai-safety

#2194: Game Theory for Multi-Agent AI: Design Better, Fail Less

Nash equilibrium, mechanism design, and why your AI agents are playing prisoner's dilemma whether you know it or not.

prompt-engineering, speech-recognition, text-to-speech

#2192: How We Built a Podcast Pipeline

Hilbert reveals the complete technical architecture behind 2,000+ episodes—from voice memos to GPU-powered TTS, with Claude models, LangGraph workf...

ai-agents, prompt-engineering, ai-reasoning

#2191: Making Multi-Agent AI Actually Work

Research from Google DeepMind, Stanford, and Anthropic reveals most multi-agent systems waste tokens and amplify errors. Single agents with better ...

ai-agents, ai-reasoning, ai-safety

#2189: Scaling Multi-Agent Systems: The 45% Threshold

A landmark Google DeepMind study reveals that adding more AI agents often degrades performance, wastes tokens, and amplifies errors—unless your sin...

ai-agents, ai-safety, human-in-the-loop-ai

#2185: Taking AI Agents From Demo to Production

Sixty-two percent of companies are experimenting with AI agents, but only 23% are scaling them—and 40% of projects will be canceled by 2027. The ga...

ai-agents, agent-cost-optimization, ai-inference

#2184: The Economics of Running AI Agents

Production AI agents can cost $500K/month before optimization. Learn model routing, prompt caching, and token budgeting to cut costs 40-85% without...

ai-agents, ai-reasoning, human-computer-interaction

#2182: Can You Actually Review an AI Agent's Plan?

Most AI agents have plans the way you have a plan while half-asleep—something's happening, but you can't see it. We map the five major planning pat...

ai-agents, fault-tolerance, ai-inference

#2179: Building Cost-Resilient AI Agents

Failed API calls in agent loops aren't just technical problems—they're direct budget drains. Here's how checkpointing, retry strategies, and cachin...

ai-agents, benchmarks, ai-safety

#2178: How to Actually Evaluate AI Agents

Frontier models score 80% on one agent benchmark and 45% on another. The difference isn't the model—it's contamination, scaffolding, and how the te...

prompt-engineering, reasoning-models, ai-reasoning

#2175: Let Your AI Argue With Itself

What happens when you let multiple AI personas debate each other instead of asking one model one question? A deep dive into synthetic perspective e...

ai-agents, prompt-engineering, ai-orchestration

#2174: Role-Playing as Orchestration

How a role-playing protocol from NeurIPS 2023 became one of AI's most underrated agent frameworks—and what happens when you scale it to a million a...

ai-agents, knowledge-graphs, ai-reasoning

#2173: Inside MiroFish's Agent Simulation Architecture

MiroFish generates thousands of AI agents with distinct personalities to predict social dynamics. But research reveals a critical flaw: LLM agents ...

ai-agents, large-language-models, military-strategy

#2171: How IQT Labs Built a Wargaming LLM (Then Archived It)

A deep code review of Snowglobe, IQT Labs' open-source LLM wargaming system that ran real national security simulations before being archived. What...

ai-agents, ai-safety, prompt-engineering

#2170: Pricing Agentic AI When Nothing's Predictable

How do you charge fixed prices for systems that operate in fundamental uncertainty? Consultants are discovering frameworks that work—but they requi...

ai-agents, ai-safety, software-development

#2169: How Enterprises Are Rethinking Agent Frameworks

Twelve major agentic AI frameworks exist—yet many serious developers avoid them entirely. What patterns emerge in real enterprise adoption?

ai-agents, ai-orchestration, software-development

#2168: What Serious Agentic AI Developers Actually Need to Know

Python, TypeScript, LangGraph, and the frameworks reshaping how agents work. A technical map of the skills and concepts that separate prototypes fr...

ai-agents, model-context-protocol, distributed-systems

#2167: Sync vs. Async: Architecting Agents for Scale

Why most enterprise AI agents fail in production has less to do with models and more to do with whether they're built synchronously or asynchronously.

ai-agents, software-development, api-integration

#2166: Code vs. Canvas: How Developers Pick Their Tools

LangGraph or Flowise? The honest answer isn't obvious. Developers gain speed and integrations with visual builders—but lose version control, testin...

ai-agents, ai-orchestration, prompt-engineering

#2165: Strip Your Agent to Bash

The frameworks matter less than you think. What separates a working agent from a failing one is the harness—the orchestration, memory, and tool des...