#large-language-models

140 episodes

#3816: How to Stop AI Scripts From Falling Apart

Why long-form AI generation breaks down and how hierarchical memory fixes it.

large-language-models context-window ai-reasoning

#3814: The Day We Lost Our Minds: What Temperature Does to an AI

A two-host autopsy of the day the podcast's AI hosts briefly lost coherence due to excessive sampling temperature, and what it reveals about how language models actually work.

large-language-models ai-reasoning hallucinations

#3767: How LLMs Actually Learn: Stages or Slurry?

Do large language models learn grammar first, then facts? The honest answer is messier and more fascinating.

large-language-models ai-training emergent-abilities

#3664: Build Your Own Language Dictionary: Beyond Standard Definitions

Ditch standard dictionaries and build your own curated vocabulary from real encounters with native speakers.

linguistics large-language-models knowledge-management

#3596: Why an AI Model Kept Calling Itself Sonnet 4.6

When a Chinese model insists it's "Sonnet 4.6," is it theft, sloppy training, or something stranger?

large-language-models fine-tuning training-data

#3595: How DeepSeek Feels More Open Than Western AI

Why Chinese AI models sometimes feel less censored on American political topics than American models do.

large-language-models ai-ethics cultural-bias

#3553: Can AI Review Your Lease in Israel?

Can AI actually understand Israeli tenant law? We explore the tools, the gaps, and how to build your own.

tenant-rights large-language-models legal-technology

#3424: Catching Up on AI Without the Firehose

Four curated sources that filter AI noise into signal — Import AI, The Batch, Stanford HAI, and a podcast.

large-language-models ai-ethics ai-history

#3406: LoRA Isn’t Just for Image Generation

LoRA lets you fine-tune an LLM’s behavior with a 50MB file. Here’s how it works and why it matters.

large-language-models fine-tuning low-rank-adaptation

#3283: Fine-Tuning DeepSeek for One Podcast

Can a purpose-specific fine-tune fix a model's stubborn writing tics? We explore the practical engineering behind it.

fine-tuning large-language-models ai-training

#3278: How to Get Early AI Model Access as a Solo Developer

How a solo developer spending $300/month can get early access to new AI models before the press release.

large-language-models prompt-engineering api-integration

#3271: LLMs as Parsers, Not Calculators

Stop letting LLMs do math. Use them to parse messy text, then let deterministic code handle the numbers.

large-language-models prompt-engineering model-context-protocol

#3171: How to Break an LLM's Bad Verbal Habits

Blacklists fail and regex inverts meaning. Here's what actually works to clean up AI writing tics.

large-language-models prompt-engineering fine-tuning

#3157: Opus 4.8: What Actually Changed Under the Hood

Anthropic dropped Opus 4.8 with no fanfare. New training data, faster inference, and smarter refusals — here's what changed.

large-language-models fine-tuning model-collapse

#3127: Crafting AI Characters That Feel Alive

Move beyond system prompts with structured character bibles that give AI personalities real inner lives.

large-language-models ai-agents generative-ai

#2672: When a Startup Claims to Break the Quadratic Wall

A startup claims linear attention scaling at 12M tokens, beating GPT-5.5 on retrieval benchmarks.

large-language-models context-window benchmarks

#2664: Can You Trust an LLM's Raw Knowledge?

Why pre-trained knowledge isn't reliable for facts — and what actually makes models useful.

large-language-models fine-tuning rag

#2651: AI Training Itself: Student, Teacher, and Grader

Can models generate their own training data and judge their own outputs? The promise and pitfalls of fully AI-led pipelines.

large-language-models ai-training model-collapse

#2650: How to Catch an LLM's Bad Writing Habits

A practical guide to analyzing podcast transcripts for repetitive language and dialogue patterns — from Python word counts to embedding clustering.

large-language-models prompt-engineering fine-tuning

#2622: How Transformers Actually Work: Attention, Tokens, and Context

How one architectural change unlocked chatbots, image generation, and protein folding — explained without the jargon.

transformers large-language-models gpu-acceleration

#2488: Hybrid Pipelines for Entity Resolution

Classic NLP pipelines vs. lightweight LLMs for handling Hezbollah’s half-dozen spellings.

large-language-models iran israel

#2464: Batch APIs: The 50% Discount You're Probably Misusing

Batch inference APIs offer 50% off — but only for the right workloads. Here's when they actually make sense.

large-language-models ai-inference gpu-acceleration

#2461: How Claude Code's Conversation Compaction Actually Works

The three-tier system, what survives, what dies, and why you shouldn't rely on auto-compact.

large-language-models ai-agents prompt-engineering

#2426: Why DeepSeek V4's Prose Feels More Vivid Than Claude or GPT

A million-token context window at 2% the KV-cache cost — and prose that actually breathes. Here's what makes V4 different.

large-language-models open-source-ai fine-tuning