#ai-agents

344 episodes · Page 3 of 15

#2460: Shopping in a Fragmented Market

The real challenges of building an AI agent that navigates Hebrew e-commerce, geographic shipping quirks, and whitelist curation.

ai-agents · local-ai · browser-automation

Apr 26

#2459: Drizzle vs Prisma: Which ORM Wins for AI-Native Backends?

Comparing Drizzle and Prisma for AI-native backends, MCP servers, and the future of agent-centric development.

ai-agents · software-development · open-source

Apr 26

#2458: Can Graph Databases Go Mainstream?

Graph databases are powerful but niche. Will they ever power mainstream CRMs and ERPs?

graph-databases · ai-agents · vector-databases

Apr 26

#2453: Escaping the AI Doom Loop in Hiring

What if job matching were built on desire, not desperation? How one signal outperforms 100 applications.

ai-agents · human-computer-interaction · productivity

Apr 26

#2441: When One Sentence Beats Four Clicks

What happens when you ditch the admin panel and let AI agents manage your systems directly?

ai-agents · model-context-protocol · api-integration

Apr 26

#2440: Build Your Own CRM With AI Agents

Off-the-shelf CRMs are built for sales teams, not solo operators. Here's why building your own with AI might be smarter.

ai-agents · diy · small-language-models

Apr 26

#2439: AI Collapses the Framework Decision

Why Airtable fails for multi-user tools, and how AI builders are changing the framework decision for small businesses.

software-development · productivity · ai-agents

Apr 25

#2404: What Tool-Calling Benchmarks Miss About Production Failures

BFCL, tau-bench, and Nexus each reveal different failure modes. None of them test what actually kills production agents.

ai-agents · benchmarks · hallucinations

Apr 25

#2403: Choosing Your LLM Eval Framework

An architectural shootout of four major LLM evaluation harnesses — where each shines and where each breaks down.

large-language-models · ai-agents · benchmarks

Apr 21

#2361: The Cognitive Load of Logs

How Claude Code transforms Linux system administration by automating log analysis and proactive maintenance.

ai-agents · automation · claude-code

Apr 20

#2340: How AI Models Track a Ship Seizure’s Ripple Effects

When the US seized an Iranian cargo ship, three AI models reshuffled their predictions overnight. Here’s what they saw—and where they disagreed.

geopolitical-strategy · maritime-exploration · ai-agents

Apr 19

#2336: How ADRs Solve AI's Institutional Memory Problem

Architectural Decision Records (ADRs) aren’t just documentation—they’re a way to give AI coding assistants the context they lack.

software-development · ai-agents · legacy-systems

Apr 19

#2328: How to Spot a Real AI Hackathon

Discover how to identify worthwhile AI hackathons, build meaningful connections, and maximize your impact in virtual communities.

ai-agents · distributed-systems · productivity

Apr 18

#2297: How to Scrape Geo-Restricted Israeli Sites with MCP Tools

Learn how to bypass advanced bot-protection on Israeli websites using MCP tools, residential IPs, and tunneling techniques.

ai-agents · geo-blocking · network-security

Apr 17

#2274: Weekend Projects Gone Wild: Evaluating AI Startup Pitches

From fridge tax agents to guilt-scheduled cron jobs, we evaluate ten AI-driven startup ideas that could exist—but probably shouldn’t.

ai-agents · voice-cloning · smart-home

Apr 16

#2254: How to Test an AI Pipeline Change

When you tweak one part of a complex AI agent system, how do you know if it actually improved anything? The answer lies in engineering checkpoints.

ai-agents · ai-inference · ai-training

Apr 16

#2253: Why AI Agents Get Three Steps, Not Infinity

Why do AI agents get exactly three rounds of tool use? It's a critical guardrail against infinite loops and runaway costs, not a limit on intellige...

ai-agents · ai-safety · automation

Apr 16

#2251: Agent-to-Agent Protocols: What Actually Needs Standardizing

When autonomous agents call other agents, what does a working protocol actually require? Exploring session handling, state management, security, an...

ai-agents · api-integration · security

Apr 16

#2249: Building Custom Benchmarks for Agentic Systems

Public benchmarks fail for agentic systems. Learn how to build evaluation frameworks that actually predict production behavior.

ai-agents · benchmarks · ai-inference

Apr 16

#2242: AI as Your Ideation Blind Spot Spotter

How to use AI not to answer questions you already know to ask, but to surface possibilities your expertise has made invisible to you.

prompt-engineering · large-language-models · ai-agents

Apr 15

#2228: Tuning RAG: When Retrieval Helps vs. Hurts

How do you prevent retrieval from suppressing a model's reasoning? We diagnose our own pipeline's four control levers and multi-source fusion strat...

rag · ai-agents · prompt-engineering

Apr 13

#2207: Specs First, Code Second: Inside Agentic AI's New Era

As AI coding agents evolve from autocomplete to autonomous cloud workers, the bottleneck has shifted—now it's about how clearly you specify what ne...

ai-agents · prompt-engineering · software-development

Apr 13

#2205: When AI Coding Agents Forget: Five Approaches to Context Rot

As coding agents handle longer sessions, they accumulate noise and lose crucial information. Five competing frameworks are solving this differently...

ai-agents · context-window · ai-memory

Apr 13

#2204: Memory Without RAG: The Real Architecture

mem0, Letta, Zep, and LangMem solve agent memory differently than RAG. Here's what's actually happening under the hood.

ai-agents · ai-memory · rag