#4071: Four Tools for a Human Review Layer on AI Tags

Adding human review to AI-generated tags without building an admin panel from scratch.

Episode Details
Episode ID: MWP-4250
Duration: 37:40
Pipeline: V5
TTS Engine: chatterbox-regular
Script Writing Agent: deepseek-v4-pro

AI-Generated Content: This podcast is created using AI personas. Please verify any important information independently.

Daniel runs nearly four thousand AI-generated episodes in a Neon serverless Postgres database on Vercel. The automated tagging pipeline assigns categories and tags during episode creation, but for topic-specific feeds to be accurate, someone needs to review what the AI decided. Was "machine learning" the right tag, or was it really about "model evaluation"? The AI gets close but doesn't always nail the nuance.

The core constraint is avoiding a full admin panel build. Daniel is a single user who needs a table view of episodes, the ability to filter by tag status, and inline editing — essentially a spreadsheet connected to Postgres. The schema already exists: episodes, tags, categories, and junction tables. Whatever tool is chosen must connect to existing tables without migrating, duplicating, or corrupting the schema.

Four candidates emerge. Retool Embedded lets you build a drag-and-drop admin UI hosted on Retool's infrastructure and embed it as an iframe in Next.js — zero schema intrusion and native Vercel deployment. NocoDB connects to your Postgres URL and gives a spreadsheet view, but adds roughly twenty internal tables to your schema and requires a persistent Node.js runtime outside Vercel. Directus introspects your existing schema and generates an admin UI with namespaced metadata tables, also needing separate hosting. The custom route — a Next.js admin template with Prisma and Neon's serverless adapter — deploys natively on Vercel but requires a weekend of development time.

Vercel's serverless function timeout adds a runtime constraint: ten seconds on the Hobby plan, sixty on Pro. Bulk tag updates across hundreds of episodes can't loop through rows naively. Whatever tool is chosen must handle batching internally or work within those limits. Neon's connection pooling also matters — if the admin UI opens a new connection per query, the pool exhausts quickly. Prisma's Neon adapter handles this automatically, but third-party tools connecting directly via Postgres URL need to be connection-efficient.

#4071: Four Tools for a Human Review Layer on AI Tags

Corn
Daniel sent us this one — and it's one of those problems where the question sounds simple but the answer touches everything about how we actually run this show. We've got nearly four thousand AI-generated episodes sitting in a Neon serverless Postgres database. The tagging pipeline runs automatically during episode creation, the AI assigns categories and tags, and then... that's it. It's a black box. Daniel wants to add a human review step — a way to manually fine-tune those tag assignments so the topic-specific feeds are actually accurate. But he doesn't want to build an admin panel from scratch. The question is: what's the fastest way to graft a lightweight admin UI onto an existing database without rebuilding the whole stack?
Herman
This is exactly the kind of problem I love, because the constraints are real and they're specific. He's on Vercel, he's using Neon serverless Postgres, and he wants something deployable in that same environment. The schema already exists — episodes, tags, categories, the junction tables — so whatever tool we pick has to connect to an existing database, not generate a new one. That immediately rules out a whole category of solutions.
Corn
The AI tagging pipeline is working, but for topic-specific feeds to be accurate, someone needs to actually look at what the AI decided. Was "machine learning" the right tag, or was it really about "model evaluation"? The AI gets close, but it doesn't always nail the nuance. Daniel's really asking: how do I add that review layer without turning into a full-stack developer for a week?
Herman
The instinct to avoid building from scratch is correct. This is a single-user internal tool. It's not customer-facing. The admin panel is not the product — it's a utility. Optimizing for speed of implementation over architectural purity makes total sense. A spreadsheet-like UI with filters and inline editing handles probably ninety percent of what he actually needs.
Corn
Which is why I think the answer might be simpler than people expect. When you hear "admin panel for a production database," you immediately think CMS, authentication layers, role-based access control — but Daniel's the only user. He needs a table view of episodes, the ability to filter by tag status, and inline editing. That's basically a spreadsheet connected to Postgres.
Herman
And there are tools that give you exactly that — a spreadsheet-like interface on top of your existing database — without writing a single line of frontend code. The trick is finding one that works with Neon's serverless Postgres and deploys cleanly on Vercel. Those two constraints narrow the field considerably.
Corn
Here's what we're going to do today. Daniel's prompt lays out four practical questions: what backend UI fits the serverless model, what options can be adapted to an existing data shape, how to keep the hybrid workflow where AI does first-pass tagging and a human reviews, and — this is the one I think is most interesting — how to set this up so the review process actually improves the AI over time, rather than just being a chore you do forever.
Herman
That feedback loop is the thing that turns this from a tool comparison into an architecture discussion. If you're going to review four thousand episodes worth of tags, you want every correction to make the next batch better. Otherwise you're just doing data entry with extra steps.
Corn
The good news is, the schema for this already exists. Daniel's got an episodes table, a tags table, a categories table, and the junction tables connecting them. The AI pipeline writes to these tables during episode creation. What's missing is the review layer — a few extra columns and an interface to read and write them. The database itself doesn't need to be redesigned.
Herman
We've got four candidate approaches on the table. Retool Embedded, which lets you build a drag-and-drop admin UI and embed it as a React component or iframe in your Next.js app. NocoDB, which is open source and turns your existing Postgres database into a spreadsheet-like interface. Directus, which introspects your schema and generates an admin UI without touching your data tables. And the custom route: a Next.js admin template like Mantine Admin or React Admin, wired up with Prisma and Neon's serverless adapter.
Corn
Each of these has a different answer to the core question: how do you connect to an existing database without migrating, duplicating, or corrupting the schema? That's the technical thread we're going to pull on. But the bigger question — and this is what makes the episode worth listening to even if you don't have a podcast with four thousand episodes — is how to think about internal tools when you're a small team or a solo operator. When is "good enough" actually good enough? And when does the off-the-shelf option create more problems than it solves?
Herman
The Vercel deployment constraint is the real forcing function here. Serverless functions have a ten-second timeout on the Hobby plan, sixty seconds on Pro. If you're doing bulk tag updates across hundreds of episodes, you can't just write a naive endpoint that loops through rows — it'll time out. So whatever tool we recommend has to either handle batching internally or work within those limits.
Corn
Neon adds its own wrinkle. Serverless Postgres means connection pooling matters. If your admin UI opens a new database connection for every query, you'll exhaust the connection pool in about thirty seconds of enthusiastic clicking. Prisma's Neon adapter handles this automatically with driver-level connection pooling, but if you're using a third-party tool that connects directly via a Postgres URL, you need to make sure it's not connection-hungry.
Herman
Which brings us back to our four options. Each handles the connection pooling problem differently. Retool manages connections on their infrastructure. NocoDB and Directus run their own connection pools in the Node.js process. The custom template with Prisma's adapter delegates pooling to Neon's serverless driver. Different architectures, same fundamental requirement: don't melt the database under concurrent requests.
Corn
That's the landscape. Four tools, one database, a bunch of AI-generated tags that need human eyeballs, and a strong preference for not building things from scratch. Let's start with the one that's probably the fastest path from zero to working prototype.
Herman
Before we dive into Retool, let's be clear on the actual architecture, because the constraints flow directly from how the pipeline is set up. The AI generates an episode script. Then an automated tagging pass runs, probably as part of the same serverless function, and writes rows into the episodes_tags junction table. Categories get assigned the same way. Then the episode gets published. The whole thing is fire-and-forget.
Corn
The schema for all of this already lives in Neon. The AI writes to these tables, the website reads from them. What Daniel wants is a review interface that slots into that same schema — same tables, same database, no migration to a new data store, no duplicating everything into a separate CMS.
Herman
That "no duplication" constraint eliminates most traditional CMS options. Something like Strapi or Payload expects to own the schema — you define your content types inside the CMS, and it creates the tables. That's the opposite of what we need. We already have the tables. The tool has to read them as-is.
Corn
Which is why Directus makes the shortlist. It introspects your existing Postgres schema, figures out what tables and relationships you have, and generates an admin UI on top of them. It doesn't create new tables for your data — it reads what's already there. The metadata Directus needs lives in separate, namespaced tables, so your schema stays clean.
Herman
That's a meaningful difference from NocoDB. NocoDB also connects to an existing database — you point it at your Postgres URL and it immediately gives you a spreadsheet view of every table — but it adds roughly twenty internal tables to your schema for audit logs, UI state, permissions, and internal bookkeeping. They're namespaced, they don't touch your data, but if you're browsing your schema in the Neon dashboard, you're going to see a lot of extra tables.
Corn
Whether that matters depends on how much you care about schema aesthetics. For a production database serving a public-facing website, I can see the argument for keeping things tidy. But for Daniel's use case — a single-user admin tool — I'm not sure twenty extra tables is actually a problem.
Herman
It's not a functional problem. The data stays intact. But it does mean that if you ever decide to stop using NocoDB, you have cleanup work. Directus takes the opposite approach — it keeps its metadata entirely separate from your data tables. Retool goes even further: it doesn't add anything to your schema at all. It connects, reads the structure, and lets you build queries and forms without ever modifying the database.
Corn
The schema-intrusion spectrum runs from Retool at the zero-impact end, through Directus in the middle with namespaced metadata tables, to NocoDB at the far end with twenty-plus internal tables added directly to your schema. Good to know. But let's talk about the deployment constraint, because this is where things get interesting for Daniel specifically.
Herman
Everything he runs is on Vercel, deployed alongside his Next.js app. The ideal admin UI would deploy the same way: push to GitHub, Vercel builds it, it lives on the same domain at a sub-path like slash admin. No separate server, no additional infrastructure to manage.
Corn
That's the Vercel-shaped hole that several of these options fall into. Vercel runs serverless functions. They spin up, handle a request, and spin down. They don't maintain a persistent Node.js process. NocoDB needs a persistent runtime; it's a full Express server under the hood. Directus needs one too. Neither can run as a Vercel serverless function without serious contortions.
Herman
Which means if Daniel picks NocoDB or Directus, he's adding infrastructure. A five-dollar-a-month Railway app, or a small VPS. It's not a dealbreaker — five bucks a month is basically noise — but it's a second deployment target, a second thing to monitor, a second set of environment variables to manage. For someone explicitly trying to keep things simple, that friction matters.
Corn
Retool Embedded sidesteps this entirely because it's hosted on Retool's infrastructure. You build the admin UI in their drag-and-drop editor, they host it, and you embed it in your Next.js app via an iframe or their React SDK. No additional server to manage. The tradeoff is that your admin UI now depends on Retool's uptime and Retool's pricing.
Herman
The custom Next.js template route — using something like Mantine Admin or React Admin with Prisma and the Neon serverless adapter — deploys natively on Vercel. It's just more Next.js code in your existing project. The cost is development time: you're wiring up CRUD views, building the review workflow UI, handling inline editing state. Even with a template, that's a weekend of work, maybe two.
Corn
The deployment constraint sorts our four options into two buckets. Native Vercel deployment: Retool Embedded via iframe, or a custom Next.js admin template. Requires additional hosting: NocoDB and Directus, both needing a persistent Node.js runtime somewhere outside Vercel.
Herman
There's one more constraint that could actually cause runtime failures if we get it wrong: Vercel's serverless function timeout. Ten seconds on Hobby, sixty seconds on Pro. If Daniel's reviewing tags and decides to bulk-update two hundred episodes at once, that operation needs to complete within the timeout window. If it doesn't, the function gets killed mid-transaction.
Corn
That's not theoretical. Four thousand episodes, each with maybe five to ten tags — that's potentially tens of thousands of rows to update in a single bulk operation. The tool either needs to batch those updates internally, or Daniel needs to be disciplined about updating in chunks of fifty to a hundred episodes at a time.
Herman
Retool handles this on their backend — their query engine manages the connection and the batching. NocoDB and Directus run their own servers, so they're not constrained by Vercel's timeout at all. The custom template route is where you have to be most careful: you'd need to implement batching in your API route, splitting a bulk update across multiple requests if necessary.
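For the custom-template route, the batching described above might look something like the following. This is a minimal sketch, not the show's actual code: the `TagUpdate` shape and the `sendBatch` callback are hypothetical stand-ins for whatever the API route and its request payload end up being.

```typescript
// Hypothetical shape of one pending change to the episodes_tags junction table.
interface TagUpdate {
  episodeId: number;
  tagId: number;
  reviewStatus: "pending" | "reviewed" | "needs_review";
}

// Split a list into consecutive chunks of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Send each chunk as its own request so no single serverless invocation
// has to process the whole set. `sendBatch` is a placeholder for the real
// write (for example, a fetch call to an admin API route).
async function applyInBatches(
  updates: readonly TagUpdate[],
  batchSize: number,
  sendBatch: (batch: TagUpdate[]) => Promise<void>,
): Promise<number> {
  let sent = 0;
  for (const batch of chunk(updates, batchSize)) {
    await sendBatch(batch); // sequential: one short request at a time
    sent += batch.length;
  }
  return sent;
}
```

Running the batches sequentially keeps each request short and avoids opening many database connections at once, at the cost of a slower overall bulk update.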
Corn
To summarize the constraints before we dig into each tool: connect to an existing schema without requiring migration, deploy as close to Vercel-native as possible, keep it lightweight for a single user, and handle bulk operations within serverless timeout limits. Those four constraints are the lens we're going to use. Let's start with Retool.
Herman
Retool Embedded is the one that makes me say "this shouldn't work as well as it does." You give it a Postgres connection string — same one your Next.js app uses — and it introspects the schema immediately. Tables, columns, foreign keys, all of it shows up in their data explorer. From there you drag a table component onto a canvas, point it at the episodes table, and you've got a working admin view in about five minutes.
Corn
The writeback is the part that actually matters for Daniel's use case. He's not just browsing — he needs to edit tag assignments and have those changes persist to the database. Retool handles that with what they call "writeback" — you configure a table column to be editable, and when you change a value, it generates the UPDATE query and executes it against your database. No API layer to build, no endpoint to wire up.
Herman
The generated query uses the same connection string, same credentials, same connection pooling Retool manages on their side. You're not building a middleware layer — you're giving Retool direct database access and letting their query engine handle the rest. For a single-user internal tool, that's not a security concern, it's a feature. You skip an entire tier of backend code.
Corn
The embedding story is where it gets interesting for the Vercel constraint. Retool Embedded lets you take the app you built and surface it inside your Next.js app. They give you a React component: you install their SDK, drop the component into a page at slash admin, and pass it the app ID. It renders inside an iframe, but it's styled to match your app, and you can pass context like "current user" or "selected episode" through the SDK.
Herman
The iframe part is worth flagging. It's not a native React component rendering your data — it's a hosted Retool app running on their infrastructure, displayed inside your page. Every interaction goes through Retool's servers. If Retool has an outage, your admin panel is down even if your main site is fine.
Corn
The pricing as of mid-twenty-twenty-six is ten dollars per user per month for the basic plan, with a free tier for up to five users. For Daniel as a single admin, that's either free or ten bucks a month. The free tier gets you most of the core features — the main limitation is on the number of apps and some advanced components. For a tag review dashboard, the free tier is probably sufficient.
Herman
The lock-in is the real tradeoff, not the price. Your admin UI lives in Retool's editor. If you ever want to migrate away, you're rebuilding the entire interface from scratch — you can't export it as code. You're trading development speed now for portability later. For an internal tool that might be used for years, that's worth thinking about.
Corn
Which brings us to NocoDB, the open-source alternative that takes a fundamentally different approach. Instead of a drag-and-drop UI builder, it gives you what is essentially Airtable on top of your Postgres database. You point it at your database URL, and every table becomes a spreadsheet-like grid with filtering, sorting, and inline editing. No canvas, no components — just rows and columns.
Herman
The connection mechanism is straightforward. You provide a Postgres connection string, NocoDB reads the information schema to discover your tables and relationships, and builds its UI from that. But — and this is the detail that matters — it also creates its own metadata tables in your database. About twenty of them. Tables for storing view configurations, audit logs, user permissions, shared view links, and internal state.
Corn
They're namespaced — usually prefixed with "nc_" or similar — so they won't collide with your data tables. But if you connect NocoDB to a database that already has ten tables, you're going to end up with roughly thirty tables in your schema browser. For someone who likes a tidy database, that's a papercut. Functionally, it's harmless. Aesthetically, it's clutter.
Herman
The real deployment issue with NocoDB is that it's a persistent Node.js application running an Express server under the hood. You can't deploy it to Vercel's serverless functions; it needs a runtime that stays alive between requests. Your options are Docker on a VPS, Railway, Render, or any platform that can run a Node.js server. About five dollars a month on Railway gets it done.
Corn
That's the same story with Directus. It's a full Node.js application with file system access for uploads and configuration. It expects to be deployed as a long-running service, not a serverless function. If you want Directus on Vercel, you're looking at a workaround — running it on Railway or a small VPS as a sidecar, then pointing your Next.js app at it.
Herman
Directus does have one advantage that's relevant here: its schema introspection leaves your data tables alone. It connects to your Postgres database, reads the system catalog to understand your tables, columns, and relationships, and generates an admin UI from that. It does not alter the tables you already have. Its own metadata lives in system tables prefixed with "directus_", which sit alongside your tables in the same database. Your data tables stay exactly as they were.
Corn
The generated UI is more opinionated than NocoDB's spreadsheet. Directus gives you a proper CMS interface — list views, detail views, form layouts, relationship pickers. For reviewing individual episode tags, the detail view is actually nicer than a spreadsheet grid. You open an episode, see all its tags in a relationship field, and edit inline. It feels like a purpose-built admin panel, not a database viewer.
Herman
The permissions system is also more mature. Directus lets you define roles with field-level access control. For a single-user tool that's overkill, but if Daniel ever wanted to add a second reviewer — say, someone who can view tags but not edit them — Directus handles that natively. NocoDB has permissions too, but they're coarser.
Corn
The cost is the same deployment friction. You're adding a second service to manage. Environment variables, updates, monitoring — it's not a lot of work, but it's not zero. And for someone whose entire stack currently deploys with a single git push to Vercel, adding a Railway sidecar is a meaningful change to the development workflow.
Herman
The fourth option is the one that stays entirely within Vercel: a custom Next.js admin template. Something like Mantine Admin or React Admin, wired up with Prisma and the Neon serverless adapter. You install the template, define your Prisma schema to match your existing database — Prisma has an introspection command that reads your Postgres schema and generates the schema file for you — and then you build CRUD views for episodes, tags, and categories.
Corn
React Admin is particularly good at this pattern. You define a data provider that points at your Next.js API routes, and it auto-generates list views, edit forms, and filters based on your Prisma models. It's not drag-and-drop, but it's also not building from scratch. You're configuring components, not writing them. A weekend of work gets you a fully functional admin panel that deploys on Vercel like any other Next.js app.
Herman
Prisma's Neon adapter is what makes this viable in a serverless environment. Neon's serverless Postgres scales to zero when idle, which means cold starts. A traditional connection pooler keeps connections alive, but serverless functions don't. Neon's serverless driver, which the adapter wraps, talks to the database over HTTP or WebSockets rather than raw TCP sockets. That cuts connection setup overhead and helps keep individual queries inside Vercel's ten-second timeout window, though it doesn't remove the wake-up delay when Neon's compute has scaled to zero.
Corn
The tradeoff is that you're now responsible for the review workflow logic. Retool and NocoDB give you inline editing out of the box — with the custom template, you're building that yourself. And that's where the hybrid workflow stops being a nice-to-have and becomes the thing that actually justifies building this at all. Because if you're just manually fixing tags forever, you've built yourself a part-time job. The point of adding the review layer is to create a feedback loop — every correction you make should make the next batch of AI assignments better.
Herman
This is the part I find exciting. Imagine Daniel reviews a hundred episodes. He corrects twenty percent of the tag assignments — the AI called something "machine learning" when it should have been "deep learning," or it missed a secondary tag entirely. Those corrections aren't just fixes. They're labeled training data. You've got the original AI-assigned tag, the human-corrected tag, and the episode context. That's a supervised learning dataset that grows every time you sit down to review.
Corn
The pipeline doesn't need to be complicated. You log every correction as a row in a review history table: episode ID, original tag, corrected tag, timestamp. When you've accumulated enough corrections, you feed them back into the AI prompt as examples. "Here are twenty episodes where the initial tagging was wrong and here's what the correct tags should have been. Use these as guidance for future assignments."
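One way to turn a review session into rows for that history table is to diff the AI's tags against the reviewer's final tags. The sketch below assumes a `TagCorrection` row shape with the four fields mentioned (the field names are illustrative) and pairs removed and added tags by position, which is a simplification: it treats the first removed tag as having been replaced by the first added tag.

```typescript
// Hypothetical row shape for a review-history table; field names are assumptions.
interface TagCorrection {
  episodeId: number;
  originalTag: string | null;  // null when the reviewer added a tag the AI missed
  correctedTag: string | null; // null when the reviewer removed a tag outright
  reviewedAt: string;          // ISO 8601 timestamp
}

// Compare the AI's tags with the reviewer's final tags and emit one
// correction row per difference. Unchanged tags produce no rows.
function diffTags(
  episodeId: number,
  aiTags: readonly string[],
  humanTags: readonly string[],
  reviewedAt: string,
): TagCorrection[] {
  const removed = aiTags.filter((t) => humanTags.indexOf(t) === -1);
  const added = humanTags.filter((t) => aiTags.indexOf(t) === -1);
  const rows: TagCorrection[] = [];
  const n = Math.max(removed.length, added.length);
  for (let i = 0; i < n; i++) {
    rows.push({
      episodeId,
      originalTag: removed[i] ?? null, // positional pairing (simplification)
      correctedTag: added[i] ?? null,
      reviewedAt,
    });
  }
  return rows;
}
```

A review UI that records explicit "replace X with Y" actions would give more reliable pairs than this positional guess.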
Herman
Even simpler: you can use those corrections as few-shot examples in the prompt itself. The next time the AI tags an episode about model evaluation, it sees that historically, episodes discussing benchmark methodology got tagged "benchmarking" rather than "model evaluation," and it adjusts. No fine-tuning required — just better prompting informed by real correction data.
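The few-shot idea can be sketched as a small prompt builder. Everything here is illustrative: the `CorrectionExample` shape, the wording of the guidance header, and the default cap of twenty examples are assumptions, not the prompt the show's pipeline actually uses.

```typescript
// Hypothetical shape of one logged correction, joined with the episode title.
interface CorrectionExample {
  episodeTitle: string;
  originalTag: string;
  correctedTag: string;
}

// Append the most recent corrections to the base tagging prompt as examples.
// Assumes `corrections` is ordered oldest-first.
function buildTaggingPrompt(
  basePrompt: string,
  corrections: readonly CorrectionExample[],
  maxExamples: number = 20,
): string {
  // Guard explicitly: slice(-0) would return the whole array.
  const recent = maxExamples > 0 ? corrections.slice(-maxExamples) : [];
  if (recent.length === 0) {
    return basePrompt;
  }
  const lines = recent.map(
    (c) => `- "${c.episodeTitle}": tagged "${c.originalTag}", corrected to "${c.correctedTag}"`,
  );
  return [
    basePrompt,
    "",
    "Past corrections from human review (use as guidance):",
    ...lines,
  ].join("\n");
}
```

Capping the number of examples keeps the prompt from growing without bound as the correction log fills up.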
Corn
Which means the admin UI isn't just a review tool. It's a data collection instrument. Every click of "approve" or "edit" is building a dataset that makes the AI pipeline smarter. If you review a hundred episodes and correct twenty percent, that's twenty episodes worth of high-quality, human-verified tag assignments you didn't have before. Do that across four thousand episodes and you've got a training corpus that's worth real money.
Herman
The practical implication for the database schema is small but important. Right now Daniel's episodes_tags junction table probably has episode_id, tag_id, and maybe a created_at timestamp. To support the review workflow, you need two new columns: review_status and confidence_score. The review_status is an enum — pending, reviewed, needs_review. The confidence_score is a float between zero and one that the AI sets during initial tagging.
Corn
Adding these to Neon without downtime is trivial. ALTER TABLE ADD COLUMN with a constant DEFAULT is a metadata-only operation in Postgres 11 and later; it doesn't rewrite the table. You run ALTER TABLE episodes_tags ADD COLUMN review_status TEXT DEFAULT 'pending', and every existing row instantly reads as 'pending' without a table rewrite. Same for confidence_score with DEFAULT 0.0. The existing pipeline keeps working untouched, because its inserts never mention the new columns; it only needs a change once you want it to record a real confidence score.
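Written out, the two statements might be kept in a small migration helper like this one. It is a sketch only: the function name and the identifier check are illustrative, and the SQL would still need to be run through whatever client or migration tool the project uses.

```typescript
// Build the two ALTER TABLE statements for the review columns.
// The table name is validated because it is interpolated into the SQL text.
function reviewColumnMigration(table: string): string[] {
  if (!/^[a-z_][a-z0-9_]*$/i.test(table)) {
    throw new Error(`unsafe table name: ${table}`);
  }
  return [
    // Constant defaults, so Postgres 11+ can add these without rewriting the table.
    `ALTER TABLE ${table} ADD COLUMN review_status TEXT DEFAULT 'pending';`,
    `ALTER TABLE ${table} ADD COLUMN confidence_score REAL DEFAULT 0.0;`,
  ];
}

// Usage: print the statements for the junction table discussed in the episode.
for (const statement of reviewColumnMigration("episodes_tags")) {
  console.log(statement);
}
```

With a default of 0.0, existing rows look the same as tags the AI scored at zero, so all of them would fall under any confidence threshold until real scores are backfilled.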
Herman
The confidence_score column is the one that unlocks the smart review workflow. If the AI is ninety-five percent confident in a tag assignment, you probably don't need to review it. If it's forty percent confident, that episode should surface at the top of your review queue. The admin UI filters on confidence_score below some threshold — say, below zero point seven — and shows you only the episodes where the AI was uncertain. That shrinks a four-thousand-episode review backlog to maybe eight hundred episodes that actually need human attention.
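The review-queue logic is simple enough to sketch directly. The `EpisodeTagRow` shape below mirrors the two columns just described but is otherwise an assumption, and the default threshold of 0.7 is the example figure from the conversation rather than a tuned value.

```typescript
type ReviewStatus = "pending" | "reviewed" | "needs_review";

// Hypothetical in-memory view of one episodes_tags row.
interface EpisodeTagRow {
  episodeId: number;
  tagId: number;
  reviewStatus: ReviewStatus;
  confidenceScore: number; // 0 to 1, set by the AI during initial tagging
}

// Keep unreviewed rows below the threshold, least confident first.
function buildReviewQueue(
  rows: readonly EpisodeTagRow[],
  threshold: number = 0.7,
): EpisodeTagRow[] {
  return rows
    .filter((r) => r.reviewStatus !== "reviewed" && r.confidenceScore < threshold)
    .sort((a, b) => a.confidenceScore - b.confidenceScore); // filter() returned a copy, so the input is not mutated
}
```

In a hosted tool the same logic would live in the table's filter and sort settings; in a custom route it would more likely be a WHERE and ORDER BY clause than an in-memory filter.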
Corn
This is where the tool choice starts to matter for workflow, not just deployment. Retool makes it easy to build a filtered view with conditional formatting — episodes below the confidence threshold highlighted in yellow, inline editing for tags, a bulk approve button for everything above the threshold. NocoDB can do the filtering and inline editing, but the conditional logic is clunkier — you're working with spreadsheet formulas, not a proper expression editor.
Herman
The case study that convinced me Retool is the right starting point: a podcast network with ten thousand episodes built a tag review dashboard in two days using Retool Embedded. The UI grouped episodes by category, showed inline tag editing with color-coded confidence scores, and had bulk approve and reject buttons. The entire backend was read and write to their existing Postgres — no new API endpoints, no middleware, no schema changes beyond the two columns we just described. Two days from zero to fully functional review workflow.
Corn
That's the benchmark. Compare that to the custom Next.js template route, where you're spending a weekend just building the CRUD views before you even get to the review workflow logic. The template gives you more control and zero lock-in, but it costs you a week of evenings that you could have spent actually reviewing tags.
Herman
NocoDB sits in the middle. You can have it connected to your database in thirty minutes, the spreadsheet UI is immediately usable for inline editing, and the filtering works. But the UX for reviewing individual episodes — opening an episode, seeing all its tags in context, making nuanced decisions about whether "reinforcement learning" is a better fit than "deep learning" — it's clunkier than Retool's form-based interface. NocoDB shines at bulk operations. It's less good at the detailed, one-episode-at-a-time review where you're reading the title and description and making a judgment call.
Corn
Which is why the recommendation shapes up as a phased approach. Start with Retool Embedded for the fastest path to a working prototype: an afternoon to connect the database, build the table view, add the confidence score filter, and embed it in the Next.js app. Use it for a week. Review a hundred episodes. See if the lock-in bothers you. If it does, NocoDB on Railway for five dollars a month gives you the spreadsheet UI without Retool's per-user pricing and without depending on their uptime. The migration is just pointing NocoDB at the same database URL.
Herman
Neon's branching feature makes this whole evaluation process safer. You create a development branch from your production database — it's instant, it's isolated, and it doesn't affect the live site. You point Retool or NocoDB at the branch, test the review workflow, verify that the new columns don't break anything, and only merge back to production when you're confident. If NocoDB's metadata tables turn out to be more annoying than expected, you just drop the branch and start over.
Corn
The starter skeleton philosophy here is worth naming explicitly. Daniel said he wants to avoid building from scratch, and that instinct is correct for this use case. The admin panel is not a differentiator. Nobody listens to this podcast because of the elegant tag management interface. It's a utility — like the electrical panel in your basement. You don't need it to be beautiful, you need it to work reliably and not require you to become an electrician.
Herman
That's the meta-lesson I keep coming back to. For internal tools that serve one person, "good enough" is the target. A spreadsheet-like UI with filters and inline editing is perfectly adequate for reviewing four thousand episodes. The temptation is to over-engineer — to build a beautiful admin dashboard with animations and dark mode and a custom design system. Every hour you spend polishing the admin UI is an hour you're not spending reviewing tags, and the tag reviews are what actually improve the listener experience.
Corn
The Vercel timeout constraint reinforces this. On the Hobby plan, you've got ten seconds per function invocation. That means you can't bulk-update all four thousand episodes in one click — you need to batch. Retool handles batching on their backend. NocoDB and Directus don't have the timeout constraint at all because they run on persistent servers. The custom template route requires you to implement batching yourself. For someone who wants to avoid building things, that's another point in favor of the hosted options.
Herman
Here's the three-step plan. Step one: connect Retool Embedded to your Neon database. Grab the connection string from the Neon dashboard — the same one your Next.js app uses — plug it into Retool's data source config. That takes about five minutes. Then drag a table component onto the canvas, point it at the episodes table, add a join to episodes_tags, and you've got a working view with inline tag editing. Test it with fifty episodes. The whole thing should take an afternoon.
Corn
Step two: add those two columns to your junction table before you start reviewing in earnest. Run ALTER TABLE episodes_tags ADD COLUMN review_status TEXT DEFAULT 'pending' and ALTER TABLE episodes_tags ADD COLUMN confidence_score REAL DEFAULT 0.0. That's it. The AI pipeline doesn't change: it keeps writing tags exactly as before. But now your admin UI has columns to read and update. Filter on confidence_score below zero point seven, and your review queue shrinks from four thousand episodes to the few hundred that actually need attention.
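Written out, step two might look like the sketch below. The two ALTER TABLE statements are the ones from the episode; the review-queue query and the constant names `migrationStatements` and `reviewQueueQuery` are assumptions added for illustration. It uses `SELECT *` on purpose, because the junction table's other column names are not given.

```typescript
// One-time migration, using the table and column names mentioned in the episode.
const migrationStatements: string[] = [
  "ALTER TABLE episodes_tags ADD COLUMN review_status TEXT DEFAULT 'pending'",
  "ALTER TABLE episodes_tags ADD COLUMN confidence_score REAL DEFAULT 0.0",
];

// Hypothetical review-queue query. $1 is the confidence threshold
// (for example 0.7), passed as a bound parameter by the SQL client.
const reviewQueueQuery: string =
  "SELECT * FROM episodes_tags " +
  "WHERE review_status = 'pending' AND confidence_score < $1 " +
  "ORDER BY confidence_score ASC";
```

One thing to keep in mind: rows that already exist take the default of 0.0, so every existing tag assignment falls below a 0.7 threshold until something writes a real score into that column.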
Herman
Step three: review fifty episodes, collect the corrections, and feed them back into the AI prompt as few-shot examples. You don't need a machine learning pipeline. You just need a growing list of "here's what the AI got wrong, here's what it should have been" that gets appended to the tagging prompt. Every batch of reviews makes the next batch of AI assignments more accurate. That's the feedback loop that turns this from a chore into an investment.
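The "append corrections to the prompt" idea can be sketched in a few lines. This is an assumption-heavy illustration: the `Correction` shape, the `buildTaggingPrompt` name, the example wording, and the cap of twenty examples are all hypothetical, not part of the existing pipeline.

```typescript
// One logged review correction (hypothetical shape).
interface Correction {
  episodeTitle: string;
  aiTag: string; // what the pipeline assigned
  correctedTag: string; // what the reviewer changed it to
}

// Append the most recent corrections to the base tagging prompt as
// few-shot examples. `maxExamples` caps prompt growth (assumed default).
function buildTaggingPrompt(
  basePrompt: string,
  corrections: readonly Correction[],
  maxExamples: number = 20,
): string {
  const recent = maxExamples > 0 ? corrections.slice(-maxExamples) : [];
  if (recent.length === 0) {
    return basePrompt;
  }
  const examples = recent.map(
    (c) =>
      `- Episode: "${c.episodeTitle}" | AI tag: "${c.aiTag}" | Correct tag: "${c.correctedTag}"`,
  );
  return [basePrompt, "", "Past corrections to learn from:", ...examples].join("\n");
}
```

Each review session adds rows to the corrections list, and the next pipeline run picks them up without any other change. Capping the list at the most recent entries keeps the prompt from growing without bound.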
Corn
If the Retool lock-in starts to itch — maybe you don't like depending on their uptime, or the pricing changes, or you just want everything under your own roof — the migration path is straightforward. Spin up NocoDB on Railway for five dollars a month, point it at the same database URL, and you've got a spreadsheet UI with no per-user pricing and no dependency on a third-party service. The two extra columns you added in step two are just regular Postgres columns — NocoDB picks them up automatically.
Herman
The meta-lesson I want to land is this: for internal tools, the bar is lower than you think. A spreadsheet-like grid with filters and inline editing handles ninety percent of what Daniel needs. You don't need a polished CMS. You don't need role-based access control for a single user. You don't need a custom design system. The ten-second Vercel timeout means you batch updates in groups of fifty to a hundred episodes, not four thousand at once. And honestly, that's a better workflow anyway. Reviewing in chunks is less overwhelming than staring at a four-thousand-row spreadsheet.
Corn
The admin panel is not the product. The tags are the product — or more precisely, the topic-specific feeds that accurate tags enable. Every hour spent building a beautiful admin interface is an hour not spent making those feeds better. Start with the thing that gets you reviewing tags fastest, and only upgrade when the tool actively prevents you from doing the work.
Herman
The question that sticks with me, though, is how you know the feedback loop is actually working. You've reviewed two hundred episodes, you've logged every correction, you're feeding them back into the prompt — but is the AI actually getting better? Or are you just fixing the same kinds of mistakes over and over?
Corn
That's the measurement problem. The simplest approach is to track the correction rate over time. Week one, you review a hundred episodes and correct twenty-two percent of the tag assignments. Week four, after feeding those corrections back into the prompt, you review another hundred and correct fourteen percent. That delta tells you the loop is working. If the correction rate stays flat, your few-shot examples aren't actually changing the AI's behavior — you need better examples, or more of them, or a different prompt structure.
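The arithmetic behind that comparison is simple enough to sketch. The `ReviewBatch` shape and function names are hypothetical; the numbers in the usage note are the ones from the example above.

```typescript
// One review session: how many tag assignments were looked at and
// how many had to be corrected (hypothetical shape).
interface ReviewBatch {
  label: string;
  reviewed: number;
  corrected: number;
}

// Fraction of reviewed assignments that needed a correction.
function correctionRate(batch: ReviewBatch): number {
  if (batch.reviewed <= 0) {
    throw new Error(`batch "${batch.label}" has no reviewed items`);
  }
  return batch.corrected / batch.reviewed;
}

// Negative result: the later batch needed fewer corrections.
// Near zero: the few-shot examples are not changing the output much.
function rateDelta(earlier: ReviewBatch, later: ReviewBatch): number {
  return correctionRate(later) - correctionRate(earlier);
}
```

For week one (100 reviewed, 22 corrected) against week four (100 reviewed, 14 corrected), `rateDelta` comes out at roughly minus eight percentage points. With batches of only a hundred, small swings are likely noise, so it is worth watching the trend over several batches rather than a single pair.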
Herman
This is where logging every correction with the before and after values pays off. You're not just collecting a count of corrections — you're building a dataset of specific failure patterns. The AI consistently confuses "natural language processing" with "large language models." It over-tags episodes about AI safety as "ethics" when "alignment" is more precise. Those patterns become the examples you prioritize in the prompt. You're not just throwing corrections at the model — you're teaching it where its blind spots are.
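Turning a correction log into a ranked list of failure patterns could look like this sketch. It assumes each logged correction stores the tag before and after review; the `TagCorrection` shape and the `topFailurePatterns` name are hypothetical.

```typescript
// A logged correction: the tag the AI assigned and the tag the
// reviewer replaced it with (hypothetical shape).
interface TagCorrection {
  before: string;
  after: string;
}

interface FailurePattern {
  before: string;
  after: string;
  count: number;
}

// Count identical before/after pairs and return the most frequent ones.
function topFailurePatterns(
  corrections: readonly TagCorrection[],
  limit: number = 5,
): FailurePattern[] {
  const counts = new Map<string, FailurePattern>();
  for (const c of corrections) {
    const key = JSON.stringify([c.before, c.after]);
    const existing = counts.get(key);
    if (existing) {
      existing.count += 1;
    } else {
      counts.set(key, { before: c.before, after: c.after, count: 1 });
    }
  }
  return Array.from(counts.values())
    .sort((a, b) => b.count - a.count)
    .slice(0, limit);
}
```

The pairs at the top of that list, such as "ethics" corrected to "alignment", are the ones worth turning into prompt examples first.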
Corn
The longer-term evolution of this is interesting too. Right now Daniel's looking at reviewing everything — four thousand episodes, all tags, full manual pass. But as the AI improves, the review step shifts from "review everything" to "review the edge cases." The confidence_score column is what enables that shift. Start with a threshold of zero point seven — review everything below it, spot-check above it. As the AI gets better, raise the threshold to zero point eight, then zero point nine. Eventually you're only reviewing the five percent of episodes where the AI is uncertain.
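The threshold rule described here amounts to a two-way split. The sketch below assumes each tag assignment carries the confidence score from the confidence_score column; the `TagAssignment` shape and the `splitByConfidence` name are hypothetical.

```typescript
// One AI tag assignment with its confidence score (hypothetical shape).
interface TagAssignment {
  episodeId: number;
  tag: string;
  confidenceScore: number;
}

interface ReviewSplit {
  review: TagAssignment[]; // below the threshold: review every one
  spotCheck: TagAssignment[]; // at or above the threshold: sample occasionally
}

// Split assignments around a confidence threshold such as 0.7.
function splitByConfidence(
  assignments: readonly TagAssignment[],
  threshold: number,
): ReviewSplit {
  const split: ReviewSplit = { review: [], spotCheck: [] };
  for (const a of assignments) {
    if (a.confidenceScore < threshold) {
      split.review.push(a);
    } else {
      split.spotCheck.push(a);
    }
  }
  return split;
}
```

Because the threshold is a plain parameter, moving it later is a one-value change and needs no schema work.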
Herman
Neon's branching feature opens up an even more interesting possibility. You could create separate database branches with different AI tagging models — one branch using the current prompt, another using a refined prompt with your latest corrections — and compare the tag assignments side by side. Same episodes, different AI outputs, same review interface. That's A/B testing for your tagging pipeline, and it doesn't require any additional infrastructure beyond what Neon already provides.
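Once the tags from two branches have been loaded, the side-by-side comparison is a set difference per episode. The sketch below covers only that comparison step and assumes the data has already been queried into maps keyed by episode ID. It does not call any Neon API, and the type and function names are hypothetical.

```typescript
// Tags per episode as loaded from one database branch (hypothetical shape).
type TagsByEpisode = Map<number, string[]>;

interface TagDiff {
  episodeId: number;
  onlyInA: string[]; // tags assigned on branch A but not on branch B
  onlyInB: string[]; // tags assigned on branch B but not on branch A
}

// List the episodes where the two branches disagree, ordered by episode ID.
function diffTagAssignments(a: TagsByEpisode, b: TagsByEpisode): TagDiff[] {
  const allIds = Array.from(
    new Set<number>(Array.from(a.keys()).concat(Array.from(b.keys()))),
  ).sort((x, y) => x - y);

  const diffs: TagDiff[] = [];
  for (const episodeId of allIds) {
    const tagsA = a.get(episodeId) ?? [];
    const tagsB = b.get(episodeId) ?? [];
    const setA = new Set(tagsA);
    const setB = new Set(tagsB);
    const onlyInA = tagsA.filter((t) => !setB.has(t));
    const onlyInB = tagsB.filter((t) => !setA.has(t));
    if (onlyInA.length > 0 || onlyInB.length > 0) {
      diffs.push({ episodeId, onlyInA, onlyInB });
    }
  }
  return diffs;
}
```

Episodes where both prompts agree drop out of the result, so the review effort goes only to the disagreements.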
Corn
That's where this stops being a tool comparison and becomes a content quality investment. Every correction you make isn't just fixing one episode's tags — it's making every future episode more discoverable. The topic-specific feeds get more accurate. Listeners find the episodes they actually want. The four-thousand-episode back catalog becomes more valuable because it's better organized. That's the return on investment for a weekend of setup and an hour of review a week.
Herman
If you've got a similar problem — an existing database, no admin panel, and a strong preference for not building one from scratch — we'd love to hear what you ended up using. The Retool-to-NocoDB path worked for Daniel's constraints, but there are a dozen other tools in this space and someone out there has probably found a better one.
Corn
Now: Hilbert's daily fun fact.

Hilbert: In the nineteen sixties, researchers studying bats in Tuvalu discovered that a single horseshoe bat emits echolocation calls at roughly one hundred and ten kilohertz — which, if you convert it to a unit people actually use, is about sixty-six million wingbeats per fortnight. The bat, for its part, did not seem impressed by the math.
Herman
I have so many questions about that unit conversion and I'm going to choose not to ask any of them.
Corn
A wise choice. This has been My Weird Prompts. Thanks to our producer Hilbert Flumingtop for the fact that will haunt my nap later. If you enjoyed this episode, do us a favor and leave a review wherever you listen — it helps. Find us at my weird prompts dot com. We'll be back soon.

This episode was generated with AI assistance. Hosts Herman and Corn are AI personalities.