#ai-training

31 episodes

#3767: How LLMs Actually Learn: Stages or Slurry?

Do large language models learn grammar first, then facts? The honest answer is messier and more fascinating.

large-language-models ai-training emergent-abilities

#3283: Fine-Tuning DeepSeek for One Podcast

Can a purpose-specific fine-tune fix a model's stubborn writing tics? We explore the practical engineering behind it.

fine-tuning large-language-models ai-training

#2665: Partner Certs vs Personal Certs: What Actually Matters

Solo operators face structural barriers in vendor partner programs. Here's how personal and partner certifications actually differ.

anthropic cloud-computing ai-training

#2651: AI Training Itself: Student, Teacher, and Grader

Can models generate their own training data and judge their own outputs? The promise and pitfalls of fully AI-led pipelines.

large-language-models ai-training model-collapse

#2559: The Smartest Path to Python for AI

A practical guide to the best courses and platforms for learning Python, specifically for machine learning.

software-development ai-training python-for-ai

#2431: The 3 Markets in an AI Trench Coat

GPUs, LPUs, and ASICs: why the best hardware for AI depends entirely on what you're trying to do.

gpu-acceleration ai-inference ai-training

#2408: How Backpropagation Actually Unlocks Neural Networks

How error signals flow backward through networks to make learning possible — and why "it's just calculus" misses the point.

transformers ai-training ai-history

#2377: Is Geopolitical Neutrality a Sustainable AI Strategy?

How DeepSeek carved a niche with efficiency, neutrality, and innovative dialogue handling — and what it means for AI's future.

ai-training ai-models geopolitical-strategy

#2368: The Multi-Stage Pipeline Behind Netflix's Recommendations

Unpacking the multi-stage AI pipeline behind Netflix, Spotify, and Amazon’s "you might also like" suggestions—from candidate generation to real-tim...

ai-models data-storage ai-training

#2355: Why Open-Weight Models Are Winning

Discover how Cogito v2.1 leverages process supervision and MoE architecture to redefine reasoning efficiency in open-weight AI models.

large-language-models open-source ai-training

#2315: How to Update AI Models Without Starting Over

Exploring the challenge of updating AI models with new knowledge without costly full retraining.

ai-training fine-tuning rag

#2313: When AI Optimizes the Wrong Thing

Discover how AI systems learn to optimize for rewards—and why they sometimes get it dangerously wrong.

ai-training ai-alignment ai-ethics

#2307: Inside Frontier LLM Training: Stages, Costs, and Checkpoints

Discover the multi-stage process of training frontier large language models, from pretraining to post-training, and why checkpoints are the key to ...

large-language-models ai-training fine-tuning

#2287: Is AI Code Generation the Future of Low-Code?

Exploring the rise of AI code generation and its potential to reshape the low-code movement.

software-development ai-training future-of-work

#2272: The AI Transcription Sweet Spot

Does higher-quality audio make AI transcription worse? New research reveals a surprising "sweet spot" for bitrate, challenging a core assumption of...

speech-recognition audio-processing ai-training

#2254: How to Test an AI Pipeline Change

When you tweak one part of a complex AI agent system, how do you know if it actually improved anything? The answer lies in engineering checkpoints.

ai-agents ai-inference ai-training

#2196: The Invisible Workforce Behind AI

Annotation is the invisible foundation of AI—and a $17B industry by 2030. Here's what dataset curators actually need to know about the tools, platf...

training-data ai-training fine-tuning

#2188: Is Emergence Real or Just Bad Metrics?

The debate over whether AI models exhibit genuine emergent abilities or just appear to because of how we measure them—and why it matters for safety...

emergent-abilities ai-training interpretability

#2187: Why Claude Writes Like a Person (and Gemini Doesn't)

Claude produces prose that sounds human. Gemini reads like Wikipedia. The difference isn't capability—it's how they were trained to think about wri...

large-language-models fine-tuning ai-training

#2092: Why AI Thinks You're American (Even When You're Not)

Even when we tell Gemini we're in Jerusalem, it defaults to US-centric assumptions. We explore the root causes of this persistent AI bias.

cultural-bias ai-ethics ai-training

#2064: Why GPT-5 Is Stuck: The Data Wall Explained

The "bigger is better" era of AI is over. Here's why the industry hit a data wall and shifted to a new scaling law.

large-language-models ai-training data-storage

#2063: That $500M Chatbot Is Just a Base Model

That polite chatbot? It started as a raw, chaotic autocomplete engine costing half a billion dollars to build.

large-language-models gpu-acceleration ai-training

#2016: Andrej Karpathy: The Bob Ross of Deep Learning

Why the most influential AI mind prefers a blank text file to proprietary black boxes.

ai-training open-source-ai ai-reasoning

#1882: The Hidden Human Labor Behind AI

AI isn't free—it costs billions for humans to label data. See why annotation is the real engine behind models like Gemini.

ai-training data-integrity supply-chain