#267: Decoding the Transformer: From Attention to Inference
Herman and Corn dive into the mechanics of transformer inference, exploring how models turn massive matrices into meaningful conversation.
transformer-architecture, kv-caching, decoder-only-models