Daniel sent us this one, and it's got layers. His son Ezra is ten months old, already has a YouTube Kids profile, and Daniel found himself watching something called Piggy Visits the Doctor. Low-budget Hebrew animation, a family of pigs in a waiting room. And instead of just thinking "this is for toddlers," his brain went to "how was this made, and could I make my own?"
Which is exactly the right question to ask right now. Because on one hand you've got this explosion of generative video models — Sora, Kling, Runway Gen-3, Pika two-point-oh — and the demos make it look like you can just describe a cartoon and it materializes. But then you watch something like Piggy Visits the Doctor and realize, wait, that wasn't made by typing a sentence into a box. Somebody built that. Somebody animated those pigs frame by frame. So what's actually happening?
The gap between "AI can make a clip" and "AI can make an episode" is where this whole conversation lives. And Daniel's not asking about Pixar. He's asking about the stuff that's actually within reach.
He specifically said he wants to understand the Piggy tier, not the Toy Story tier. And that's smart, because the Piggy tier is where indie creators actually operate. It's also where AI is genuinely useful right now — not as a magic wand, but as a tool in a pipeline that still requires human storytelling, human voice acting, human decisions about what a scene should feel like.
He also mentioned his own characters — Herman, Pobellberry, and Coron — and that he'd love to make an animated series with them someday. Which is not hypothetical for him. He's looking at YouTube Kids and thinking, could I be in there?
The answer is yes. But the distance between "yes, conceptually" and "here's your finished seven-minute episode" is a lot of very specific work. So let's map that distance.
Before we do — the thing that struck me about his prompt is that he installed YouTube Kids preemptively. Ezra's ten months old. He's not watching yet. But Daniel's already in there, scouting the terrain, watching Hebrew pig cartoons, and somewhere in that process the parent brain and the creator brain collided.
That's the moment we're capturing. The moment where you realize the content your kid will consume is made by real people with real tools, and you start wondering if you could be one of those people.
Let's break down what it actually takes to make a kids' animation in twenty twenty-six. From the Piggy tier to the Koala Brothers tier. And let's be honest about where AI helps, where it doesn't, and what you'd actually need to learn on a Monday morning if you wanted to start.
I want to start with a number that always resets my brain on this topic. Toy Story, nineteen ninety-five. Eight hundred thousand render hours. A hundred-machine render farm at Pixar. A single frame took anywhere from forty-five minutes to thirty hours to render, depending on complexity.
I remember reading that as a kid and just not being able to process it. A whole room of computers running for months to make a movie about toys that come to life.
Today, a single NVIDIA RTX forty-ninety — a consumer GPU you can buy for about sixteen hundred dollars — can render a similar-quality frame in under a second using Unreal Engine five. The rendering bottleneck is gone. What took Pixar a hundred machines and months of compute now happens in real time on a card that fits in a desktop tower.
If rendering isn't the bottleneck anymore, what is?
The bottleneck shifted from compute to creativity. Specifically, to pre-production and consistency. Script, storyboard, character design, voice direction, and then the actual frame-by-frame decisions about how a character moves, how they express emotion, how a scene is composed. The computer can draw the frame instantly now. Deciding what to draw is still a human job.
That's where we need to talk about the three tiers Daniel's actually asking about. Because "animation" is not one thing.
Right, so tier one is what everyone pictures when they hear "animation" — Pixar, Disney, DreamWorks. Budgets in the hundred-plus-million range, teams of hundreds, render farms that still exist but now they're doing things like simulating every individual strand of fur. That's not what Daniel's asking about.
Tier three is Piggy Visits the Doctor. The thing he actually watched. Low-budget, probably made by one to three people, limited animation. And tier two sits right in the middle — that's The Koala Brothers, or Bluey, or any of those eleven-minute TV episodes that have actual production value but aren't trying to be Toy Story.
Here's why these tiers matter. When you say "I want to make an animated series," you're not just picking an art style. You're picking a production reality. Budget, timeline, team size, skill requirements — they're completely different across these three tiers.
Let's define them. Tier one, theatrical — not our focus today, but it's useful as the ceiling. A Pixar-quality film in twenty twenty-six still takes four to five years and a team of two hundred to three hundred people. The rendering is faster, but the creative decisions — story, character design, shot composition, lighting — those haven't been automated. They've just gotten more ambitious.
Tier two is where The Koala Brothers lives. Stop-motion or digital two-D animation, eleven-minute episodes, a crew of eight to twelve people. You've got a director, two or three animators, a couple of background artists, a rigger, a compositor, a sound designer, at least one voice actor. An episode takes six to eight weeks, and the budget is somewhere between three hundred thousand and five hundred thousand dollars per episode.
Then tier three. Piggy Visits the Doctor. This is probably made in Adobe Animate or Toon Boom Harmony, running at twelve frames per second on twos — or even eight frames per second on twos — meaning a new drawing every two frames of playback. Simple rigged characters, reused backgrounds, maybe a team of one to three people. A seven-minute episode at that quality level takes two to four weeks for a solo creator working full-time, and the cost is somewhere between two thousand and ten thousand dollars total.
That two-to-four-week timeline for a solo creator — that's the number I want to sit with, because it's what makes Daniel's question answerable. A motivated person with the right tools can produce a seven-minute kids' animation in about a month. That's not theory. That's what small studios and independent creators are doing right now on YouTube Kids.
Where does AI fit into this? Because that's the real question. It's not replacing the pipeline. It's compressing parts of it.
That's exactly what we need to dig into next. Because you can't understand where AI helps until you understand what it's replacing in the traditional pipeline.
The traditional pipeline for a seven-minute episode — let's walk through it stage by stage, because each one has a different relationship to AI. Pre-production comes first. Script, storyboard, animatic. This is where you figure out what actually happens in your episode. You write the dialogue, you sketch out maybe sixty to eighty panels showing key moments, and then you build an animatic — which is basically those storyboard panels timed out to a scratch voice track, like a slideshow with sound. That's your blueprint.
That scratch voice track is where you first hear whether your pig has a personality or just reads lines.
Pre-production for a seven-minute Piggy-tier episode — maybe three to five days if you're moving fast. Then you hit production. This is where the actual animation happens, and it breaks down into layout, keyframe animation, in-betweening, coloring, and compositing. Layout is camera placement and rough character positioning within each scene. Keyframe animation is drawing the important poses — the start of a movement, the end of a movement, the emotional beats. In-betweening fills the frames between those key poses. Coloring is self-explanatory. Compositing layers everything together — characters over backgrounds, foreground elements, any effects.
At the Piggy tier, you're not doing full twenty-four frames per second. You're running at twelve frames per second on twos — so six new drawings per second. Sometimes even eight frames per second on twos, which is four drawings per second. That's why it looks a little choppy. It's not a bug, it's a budget.
And that's the core trade-off in tier three animation. Every drawing you skip is time you save. A seven-minute episode at twelve frames per second on twos is about two thousand five hundred individual drawings. At eight frames per second on twos, you're down to about seventeen hundred. That's the difference between a three-week timeline and a two-week one for a solo animator.
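If you want to sanity-check that arithmetic, here it is as a tiny script. These are just the numbers we quoted, nothing fancier:

```python
# Drawing counts for a tier-three episode: playback fps, drawings held "on twos".
SECONDS = 7 * 60  # a seven-minute episode

def drawings(playback_fps, hold=2):
    # A new drawing every `hold` frames of playback.
    return SECONDS * playback_fps // hold

print(drawings(12))  # 12 fps on twos: 6 drawings/sec -> 2520 drawings
print(drawings(8))   # 8 fps on twos:  4 drawings/sec -> 1680 drawings
```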
Piggy Visits the Doctor — let's reverse-engineer what we saw. Simple vector rigs, meaning the characters aren't redrawn from scratch every frame. You build a digital puppet once, with movable joints, and then you pose it. Adobe Animate or Toon Boom Harmony. The backgrounds are probably reused across scenes — that waiting room set appears multiple times, maybe with minor variations. The lip sync is basic, maybe three or four mouth shapes mapped to the Hebrew dialogue.
That rigging approach is why a solo creator can actually finish an episode. If you had to hand-draw every frame from scratch, you'd be looking at months, not weeks. The rig is the productivity multiplier. You invest a day or two building a character rig — defining the skeleton, the joint constraints, the facial expression library — and then you can animate that character for the entire series.
This is also why you see so many YouTube Kids shows where characters have, let's say, limited limb articulation. A pig that mostly stands in place and gestures with one arm is a rig that took four hours to build. A pig that does backflips is a completely different engineering problem.
Post-production comes last. Sound design — footsteps, door creaks, ambient waiting-room murmur. Voiceover — either recorded by the creator themselves or hired out. Final render, which at this quality level is basically real-time. You hit export and wait maybe ten minutes for a seven-minute episode.
The full timeline for a solo creator at the Piggy tier: three to five days pre-production, two to three weeks production, two to three days post-production. That's your two to four weeks. And the cost — if you're doing everything yourself — is essentially zero beyond software subscriptions. Adobe Animate is about twenty-five dollars a month. Toon Boom Harmony, maybe forty. If you hire a voice actor, add a hundred to three hundred dollars.
Now contrast that with The Koala Brothers tier. Stop-motion with digital compositing, eleven-minute episodes. You're now looking at six to eight weeks with a crew of eight to twelve people. The director oversees the creative vision. Two to three animators handle the actual puppet manipulation — and in stop-motion, you're shooting at twelve or fifteen frames per second, physically moving an armature a fraction of a millimeter between each frame. Two background artists build and maintain the miniature sets. A rigger maintains the puppets — stop-motion armatures break constantly. A compositor removes rigging equipment from shots and layers in digital elements. A sound designer handles foley and mixing. At least one professional voice actor.
The budget reflects all those salaries. Three hundred to five hundred thousand per episode. That's not extravagant — that's just what it costs to pay ten skilled people for two months of work.
What's interesting is that in the tier two world, AI is mostly showing up in pre-production and post, not in the core animation. You'll see studios using Midjourney or Stable Diffusion for concept art — rapid visualization of character designs and set concepts before committing to physical builds. ElevenLabs gets used for voice prototyping during the animatic phase, so you can hear the episode before you book your voice talent. But the actual animation is still human hands on puppets or human hands on Cintiq tablets.
That brings us to the tools that are actually changing things for indie creators right now. Runway's image-to-video is useful for background animations — you can generate a looping background, trees swaying or clouds moving, and composite your rigged characters over it. That used to require either drawing dozens of background frames or licensing stock footage.
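To make that concrete: once you've got a looping background clip and a character render with an alpha channel, the compositing step is a single ffmpeg call. A minimal sketch, with made-up filenames:

```python
# Composite a character render (with alpha) over a looping AI-generated
# background using ffmpeg. All filenames here are placeholders.
import subprocess

subprocess.run([
    "ffmpeg",
    "-stream_loop", "-1", "-i", "background_loop.mp4",  # loop the background forever
    "-i", "pig_waiting_room.mov",                       # character render with alpha channel
    "-filter_complex", "[0:v][1:v]overlay=shortest=1",  # stop when the character track ends
    "-c:v", "libx264", "-pix_fmt", "yuv420p",
    "scene_composited.mp4",
], check=True)
```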
ElevenLabs has become the standard for voice prototyping across indie animation. The Annecy International Animation Film Festival reported that over forty percent of indie pilots submitted in twenty twenty-five used AI-generated voices at some stage of development. Not for final audio — SAG-AFTRA rules and basic quality standards still push you toward real voice actors for the finished product. But for prototyping, for hearing your characters speak before you've committed to a script, it's transformative.
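And the prototyping workflow really is one API call per line of dialogue. Here's a sketch against the ElevenLabs text-to-speech endpoint; the voice ID, the key, and the line itself are placeholders:

```python
# Generate a scratch voice line for an animatic via the ElevenLabs
# text-to-speech endpoint. Voice ID and API key are placeholders.
import requests

VOICE_ID = "YOUR_VOICE_ID"  # pick one from your ElevenLabs voice library
resp = requests.post(
    f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
    headers={"xi-api-key": "YOUR_API_KEY"},
    json={
        "text": "The doctor will see you now, little piggy.",
        "model_id": "eleven_multilingual_v2",  # swap in a model that covers your language
    },
)
resp.raise_for_status()
with open("scratch_line.mp3", "wb") as f:
    f.write(resp.content)
```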
Then there's the character consistency problem. Which is the thing that stops AI from being the whole pipeline. No current video generation model — not Sora, not Kling, not Runway Gen-3 — can keep a character's face, clothing, and body proportions identical across a seven-minute runtime. Each clip is generated independently, and the model has no persistent memory of what your pig looks like. You get subtle drift. Or un-subtle drift.
The workaround that smart indie creators are using — and this goes back to what Corridor Digital pioneered with their Rock Paper Scissors series — is to build a simple rigged three-D model in Blender as your consistency anchor. Animate the character's motion traditionally, then use AI for style transfer — applying a consistent visual look to every frame. The three-D base guarantees the character stays on-model. The AI handles the aesthetic.
The AI isn't drawing your pig. It's painting your pig after you've already told it where to stand and how to move.
Which is a much less magical story than "type a prompt, get an episode," but it's a much more honest one. And for someone like Daniel, who has characters already — Herman, Pobellberry, Coron — this workflow is accessible. Build simple rigs, animate them, use AI for the visual polish.
Are there tools built for exactly that workflow? Of course there are.
So let's get specific about those tools, because the landscape is moving fast. Runway Gen-3 Alpha dropped in July twenty twenty-four, and its video-to-video style transfer is the piece that matters for this workflow. You can take live-action reference footage — someone acting out a scene with a puppet or even just gesturing — and Runway will rotoscope that into an animated character in a consistent style.
Corridor Digital's Rock Paper Scissors series from twenty twenty-three — the case study we just mentioned — is worth unpacking. They built simple three-D models in Blender as the base, animated the motion traditionally, and then ran the frames through AI style transfer to get that painterly anime look. The three-D rig guaranteed the characters stayed on-model — same face, same proportions, same clothing across every shot. The AI handled the aesthetic layer. It wasn't "type a prompt, get an episode." It was a hybrid pipeline.
Which is exactly what someone making a Piggy Visits the Doctor tier show would do. Build a simple pig rig in Blender — Rigify, the free auto-rigging add-on, will get you a basic skeleton in about twenty minutes. Animate your waiting room scene with the rigged pig. Then pipe the frames through Runway or Pika for style transfer — give it that flat vector look, or a watercolor children's book feel, whatever fits your show.
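Here's what that style pass looks like as a sketch, using an open Stable Diffusion img2img pipeline standing in for Runway's hosted one. The low strength value is what preserves your Blender pose; the prompt supplies the look:

```python
# Re-style Blender-rendered frames with Stable Diffusion img2img.
# A sketch of the "3-D base plus AI style pass" idea, not Runway's pipeline.
# Frame-to-frame flicker is the known weakness of doing it this naively.
from pathlib import Path
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

STYLE = "flat vector children's cartoon, pastel colors, clean outlines"
out_dir = Path("styled")
out_dir.mkdir(exist_ok=True)

for frame_path in sorted(Path("renders").glob("*.png")):
    frame = Image.open(frame_path).convert("RGB").resize((768, 512))
    # Low strength keeps the rendered pose and layout; the prompt restyles it.
    styled = pipe(prompt=STYLE, image=frame, strength=0.35).images[0]
    styled.save(out_dir / frame_path.name)
```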
Pika added something specific that matters for dialogue-heavy kids' content. Lip-sync from audio. You feed it a character face and a voice clip, and it generates the mouth shapes automatically. It's not Pixar-quality lip sync — you get occasional mushiness — but for the Piggy tier, where you're doing three or four mouth shapes anyway, it's a genuine time saver.
The lip-sync problem is one of those things where "good enough" actually is good enough for a certain tier of production. A toddler watching a pig talk to a doctor is not analyzing phoneme accuracy. They're watching the pig.
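For the DIY version of that "three or four mouth shapes" approach, you can get surprisingly far just bucketing audio loudness per animation frame. A toy sketch, assuming a mono sixteen-bit WAV:

```python
# Piggy-tier lip sync by hand: map audio loudness to a tiny set of
# mouth shapes, one decision per animation frame. Assumes mono 16-bit WAV.
import wave
import numpy as np

SHAPES = ["closed", "small", "mid", "open"]  # your four mouth drawings
FPS = 12  # playback frame rate

with wave.open("dialogue.wav", "rb") as w:
    rate = w.getframerate()
    audio = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)

samples_per_frame = rate // FPS
frames = len(audio) // samples_per_frame
peak = np.abs(audio).max() or 1  # avoid dividing by zero on silence

for i in range(frames):
    chunk = audio[i * samples_per_frame:(i + 1) * samples_per_frame]
    loudness = np.abs(chunk).mean() / peak      # roughly 0.0 to 1.0
    shape = SHAPES[min(int(loudness * 8), 3)]   # bucket into the four shapes
    print(i, shape)  # swap this print for your exposure sheet
```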
That's the tier where AI is most useful — not replacing the creative decisions, but removing the tedious stuff that doesn't add creative value. Nobody became an animator because they love drawing in-between frames of a character's arm moving from point A to point B. That's grunt work. AI can handle a lot of that in-betweening now, especially with ComfyUI workflows using ControlNet and IP-Adapter for consistent character generation across frames.
Explain the ComfyUI piece, because that's where the actual indie workflow lives.
ComfyUI is a node-based interface for Stable Diffusion. You build a visual pipeline by connecting boxes — it looks like a modular synthesizer. With IP-Adapter, you feed it reference images of your character from multiple angles, and it learns to generate new poses of that same character without drifting. ControlNet lets you guide the generation with a stick figure or depth map, so you can specify "the character is standing here, arm raised, facing left" and the AI fills in the details while respecting the pose. Combine those, and you can generate a sequence of frames where your pig character stays consistent because you're constantly feeding it reference images and pose guides.
You're not asking the AI to remember your pig. You're showing it your pig every single time.
That's the workaround for the memory problem. Current video models have no persistent state — every generation is amnesiac. But if you build a pipeline that re-injects reference images and pose data on every frame, you fake consistency. It's not elegant, but it works.
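In practice, that re-injection pipeline is just a workflow JSON you queue over and over. ComfyUI runs a local server and accepts the same graph the UI exports in API format; the node ID here is hypothetical:

```python
# Queue a ComfyUI workflow programmatically. ComfyUI's local server
# (default port 8188) accepts the JSON the UI exports via "Save (API format)".
# The ControlNet and IP-Adapter nodes live inside that JSON; here we just
# swap the pose image per frame and queue each generation.
import json
import requests

with open("pig_workflow_api.json") as f:
    workflow = json.load(f)

POSE_NODE_ID = "12"  # hypothetical: the LoadImage node feeding ControlNet

for frame in range(1, 97):  # 96 frames = 8 seconds at 12 fps
    workflow[POSE_NODE_ID]["inputs"]["image"] = f"pose_{frame:04d}.png"
    requests.post(
        "http://127.0.0.1:8188/prompt", json={"prompt": workflow}
    ).raise_for_status()
```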
Which brings us to the practical timeline for a solo creator using these tools. A seven-minute Piggy-tier episode, traditional pipeline, we said two to four weeks. With AI-assisted workflows — three-D base plus style transfer for backgrounds and effects, AI lip-sync, AI-assisted in-betweening — you're looking at one to two weeks. The AI saves forty to sixty percent of production time on the visual side.
— and this is the part that gets lost in the hype — the script, the storyboard, the voiceover, and the sound design still take the same amount of time. AI doesn't write a funny script. It doesn't know that Ezra is going to laugh when the pig falls off the chair, not when the pig explains medical billing. The human creative decisions are the bottleneck, and they should be.
There's a concept that's been floating around indie animation circles called generative pre-vis. Instead of spending three days sketching a storyboard and another two days building an animatic, you use AI to rapidly prototype scenes. Type a rough scene description, generate ten visual variations in different styles, see what clicks. Test whether your pig character looks better in flat vector or watercolor or cut-paper style before you commit to building the rig.
This is where AI is most valuable for indie creators — not in the final product, but in reducing the risk of investing weeks into a style that doesn't work. You can fail faster. You can try five visual approaches in an afternoon and pick the one that feels right. That used to take weeks of concept art iteration.
Generative pre-vis is basically the AI version of what Pixar does with their internal screenings — they watch rough versions of their films, realize the entire third act doesn't work, and redo it. But Pixar can afford to throw away months of work. An indie creator can't. AI pre-vis lets you throw away an afternoon instead.
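The pre-vis loop itself is almost trivially scriptable. A sketch with a local Stable Diffusion pipeline; any text-to-image service slots in the same way:

```python
# Generative pre-vis: render the same scene description in several
# candidate styles before committing to a rig. Scene and style strings
# are just examples.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

SCENE = "a cheerful cartoon pig in a doctor's waiting room, sitting on a chair"
STYLES = ["flat vector art", "soft watercolor storybook",
          "cut-paper collage", "claymation look"]

for style in STYLES:
    image = pipe(f"{SCENE}, {style}", num_inference_steps=25).images[0]
    image.save(f"previs_{style.replace(' ', '_')}.png")
```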
Now, there's a regulatory layer here that matters specifically for kids' content, and we should touch on it. COPPA — the Children's Online Privacy Protection Act — applies to any content directed at children under thirteen. If you're making a kids' show for YouTube Kids, you cannot collect data on your viewers. No behavioral targeting, no personalized ads, no tracking. That's not an AI issue specifically, but it shapes the entire ecosystem. YouTube Kids has separate content policies, and they've been cracking down on what they call "low-effort AI slop" — auto-generated content with no human creative input.
Which is a legitimate problem. There are channels pumping out hundreds of AI-generated videos with nonsensical scripts and melting characters. But the enforcement also creates a weird incentive. If you're using AI as an assistive tool in a real creative pipeline, your content is fine. If you're fully automating everything including the script, you get demonetized or removed. The line is human creative oversight.
Then there's the voice acting piece. SAG-AFTRA has been in ongoing negotiations around AI voice actors, and the rules are still evolving. ElevenLabs is incredible for prototyping — you can generate multiple character voices from a single actor's scratch recordings, hear your whole episode before you book talent. But for the final product, especially if you ever want to submit to festivals or pitch to distributors, you need real voice actors. The good news is that for a seven-minute Piggy-tier episode, hiring a voice actor is surprisingly affordable — a hundred to three hundred dollars on platforms like Voices dot com or Fiverr.
If you're doing a Hebrew-language show, there's actually a shortage of Hebrew voice actors on those platforms, which means less competition but also fewer options. You might end up doing the voices yourself, or recruiting a friend. Which, honestly, for a first pilot, is completely fine. The Koala Brothers didn't start with professional voice talent either.
The throughline here — and this connects back to what we said about the Piggy tier being the sweet spot — is that the tools are mature enough now that a motivated solo creator with a clear vision can actually make something. Not a Pixar film. Not even a Koala Brothers episode. But a three-to-five-minute pilot with consistent characters, a real story, and visual charm. That's achievable in four to six weeks. The AI accelerates the visual production. The human provides the story, the timing, the jokes, the heart.
If you're that person — the one with characters in your head who wants to make a kids' show — what do you actually do on Monday morning? Here's the concrete plan. Step one: download Blender. It's free, it runs on anything, and the community is enormous. Follow a Rigify tutorial — Rigify is the auto-rigging add-on that ships with Blender — and build one simple character. Your pig, your donkey, your sloth. Get it moving.
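If you'd rather script it than click through menus, the Rigify quick start is three operator calls inside Blender's Python console. A sketch of the same steps the tutorials walk through by hand:

```python
# The Rigify quick start, scripted. Run inside Blender's Scripting tab.
import bpy

# Enable the Rigify add-on that ships with Blender.
bpy.ops.preferences.addon_enable(module="rigify")

# Drop a human metarig into the scene. Scale and reshape it to fit
# your character's proportions; a pig needs heavy editing here.
bpy.ops.object.armature_human_metarig_add()

# With the edited metarig as the active object, generate the control rig.
bpy.ops.pose.rigify_generate()
```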
The reason you start in three-D even if you want a two-D final look is that the rig solves the consistency problem before it starts. Once your character has a skeleton, it stays on-model by definition. Then you pipe those frames through Runway or Pika for style transfer, and suddenly your Blender render looks like flat vector animation or watercolor or whatever aesthetic you chose. That's the hybrid pipeline Corridor Digital used, and it's the most reliable indie workflow right now.
Step two: focus your time where the AI can't help. Write a real script. Draw a storyboard — even stick figures on index cards. Record scratch voiceover. Use ElevenLabs for prototyping if you want to hear how different voice styles feel against your character, but plan to hire a real voice actor for the final. A hundred to three hundred dollars for a seven-minute episode on Voices dot com or Fiverr. That's less than a decent dinner out.
Step three: target the Piggy Visits the Doctor tier for your pilot. Three to five minutes, simple animation, a clear beginning middle and end. Do not try to make The Koala Brothers on your first attempt. A four-minute pilot with consistent characters and an actual story is achievable in four to six weeks as a solo creator. Ship that, learn from it, then decide if you want to scale up.
The barrier to entry has never been lower. Blender is free. Rigify tutorials are on YouTube. Runway has a free tier. ComfyUI is open source. The only thing standing between you and a finished pilot is four to six weeks of focused work and a story worth telling.
The story part is the thing that will actually determine whether Ezra, or any kid, keeps watching. AI can paint your pig. It can't make your pig funny.
Which brings me to the question I keep coming back to. Will AI ever solve character consistency to the point where a solo creator can actually produce Pixar-level animation from their laptop? And I think the honest answer is — not in the next three to five years. Possibly not in the next ten. The gap between "consistent enough for a toddler watching a pig" and "consistent enough for a ninety-minute theatrical release" is not a technical gap, it's a creative one. Pixar animation isn't just technically flawless. Every frame has a director's intent baked into it. The lighting communicates mood. The character's posture communicates subtext. AI can't do subtext.
What it can do, though, is make the Piggy Visits the Doctor tier accessible to basically anyone with a story and four weeks of focus. And as the video models improve — each new Sora and Kling release promising better temporal consistency — the bottleneck shifts further from production to storytelling. The creators who succeed won't be the ones with the best prompt engineering. They'll be the ones who know how to write a joke that lands, how to pace a scene, how to make a four-year-old care about a sloth and a donkey.
Which brings us to the obvious closing thought. The person who sent in this prompt has the characters. Herman, Pobellberry, Coron. They have names, they have voices, they have a dynamic. The hard part — the part AI can't do — is already done.
The tools are ready. Blender is free. The tutorials are on YouTube. The AI accelerators are maturing fast. The audience is waiting — and I don't just mean Ezra, though he's obviously the most important test viewer. There's a real hunger for kids' content that isn't algorithmically extruded, that has actual human warmth and weirdness and love behind it.
Start the pilot. Herman and Coron in a waiting room. I don't know what happens, but you do.
Now: Hilbert's daily fun fact.
Hilbert: In the early Renaissance, a Spanish explorer in Guyana documented a nudibranch whose chemical defenses were potent enough that a single sea slug, roughly the length of a human thumb, produced toxins comparable in effect to the venom of twenty mature rattlesnakes — a measurement the explorer recorded by observing its effect on a captive caiman.
...right.
This has been My Weird Prompts. Thanks to our producer, Hilbert Flumingtop. If you enjoyed this episode, leave us a review wherever you listen — it helps. We're at myweirdprompts dot com. I'm Corn.
I'm Herman Poppleberry. Go make something.