Speech & Audio

Speech-to-Text

Whisper, ASR, transcription, voice typing

14 episodes

Breaking the Voice Wall: The Future of Native Speech AI

Explore why native speech-to-speech AI is 20x more expensive than text pipelines and how "semantic VAD" is solving the awkward silence problem.

large-language-models, local-ai, speech-to-speech

Teaching AI to Hear: Solving the Custom Dictionary Dilemma

Tired of AI mishearing brand names? Learn how to build efficient custom dictionaries for Gemini 1.5 without breaking the bank.

automatic speech recognition, custom dictionaries, gemini 1.5, context bloat, dynamic hint system

Unsung Hero: The Gooseneck Mic's AI Power

The gooseneck mic: a humble hero with surprising AI power. Discover its secret to crystal-clear speech-to-text accuracy!

gooseneck mic, speech-to-text, microphone, AI voice capture, audio technology

From Lawyers in Limousines to Developers in Their PJs: The Voice Tech Revolution

From limo-riding lawyers to pajama-clad coders, voice tech is booming. Discover how AI is making it a force for good.

voice-technology, accessibility, Productivity

The Multimodal Audio Revolution: A Screen-Free Future?

Is multimodal audio the future? We explore if AI can truly displace traditional speech-to-text for a screen-free world.

multimodal audio, speech-to-text, screen-free, audio AI, accessibility

Personalizing Whisper: The Voice Typing Revolution

Voice typing is changing everything. Join us as we explore the revolution of personalizing Whisper!

speech-recognition, fine-tuning, transformers

Mic Check: Mastering AI Dictation Hardware

Uncover the secrets to perfect AI dictation! Corn and Herman explore the ultimate speech-to-text hardware.

large-language-models, speech-recognition, audio-hardware

Building Custom ASR Tools

Ever wondered how to build your own ASR tools from scratch? Discover the why and how in this episode!

ASR, speech recognition, custom asr, machine learning, speech to text

How To Fine Tune Whisper

Build your own AI transcription tool! We'll walk you through fine-tuning Whisper, from data to notebook.

fine-tuning, speech-recognition, gpu-acceleration

Benchmarking Custom ASR Tools - Beyond The WER

Benchmarking custom ASR fine-tunes: We're diving deep beyond the WER to truly measure performance.

ASR, benchmarking, wer, speech recognition, fine-tuning

Fine-Tuning ASR For Maximal Usability

Fine-tuned ASR is just the start. Discover the next steps for deployment and maximizing usability.

ASR, speech recognition, fine-tuning, deployment, usability

How ASR Went From Frustration To ... Whisper Magic

Speech to text: from frustrating to fantastic. Uncover the magic behind its rapid rise and connection to the AI boom!

automatic-speech-recognition, speech-to-text, asr-technology

Safetensors or something else: STT inference formats explained

Unpacking ASR weight formats: Safetensors and beyond. Tune in to understand the distinctions.

safetensors, ASR, speech recognition, inference, weight formats

Building Your Own Whisper

Ever wondered if you could build your own speech recognition tool? We dive deep into crafting custom ASR.

ASR, speech recognition, whisper, machine learning, audio processing