Transformer Explainer

the cat sat on
Generating...
Temperature: 1.0 (Balanced)
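
The Temperature setting above is commonly applied by dividing the model's output scores (logits) by the temperature before the softmax. The sketch below illustrates that general technique, not this page's actual code. The function name `softmax_with_temperature` and the example logits are hypothetical.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, scaled by a temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate next tokens.
logits = [2.0, 1.0, 0.1]
balanced = softmax_with_temperature(logits, 1.0)  # unchanged distribution
sharp = softmax_with_temperature(logits, 0.5)     # favors the top token more
flat = softmax_with_temperature(logits, 2.0)      # closer to uniform
```

A temperature of 1.0 leaves the distribution as the model produced it. Lower values concentrate probability on the highest-scoring token, and higher values spread it out.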

Processing Input

Step 2 of 5: Attention

Input Embeddings: Complete
Attention
Generate Q/K/V matrices
• Project Query vectors
• Project Key vectors
• Project Value vectors
Compute attention scores
Apply attention masks
Combine attention heads
Add & Norm: Pending
Feed Forward: Pending
Output: Pending
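
The Attention sub-steps listed above (project Q/K/V, compute scores, apply masks, combine heads) can be sketched in plain Python as follows. This is a minimal illustration of standard scaled dot-product attention with a causal mask, not the page's own implementation. All function names, the identity projection weights, and the toy input are assumptions chosen for readability.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def causal_attention_head(x, w_q, w_k, w_v):
    """One attention head: project Q/K/V, score, mask, and mix values."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = []
    for i, row in enumerate(scores):
        # Causal mask: position i may only attend to positions j <= i.
        masked = [s / math.sqrt(d_k) if j <= i else float("-inf")
                  for j, s in enumerate(row)]
        weights.append(softmax(masked))
    return matmul(weights, v), weights

def multi_head_attention(x, heads, w_o):
    """Run each head, concatenate the outputs per token, then project."""
    outputs = [causal_attention_head(x, *h)[0] for h in heads]
    concat = [sum((out[i] for out in outputs), []) for i in range(len(x))]
    return matmul(concat, w_o)

# Toy input: 4 tokens with 2-dimensional embeddings (hypothetical values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
identity = [[1.0, 0.0], [0.0, 1.0]]
heads = [(identity, identity, identity), (identity, identity, identity)]
w_o = [[0.5, 0.0], [0.0, 0.5], [0.5, 0.0], [0.0, 0.5]]  # averages the two heads

out = multi_head_attention(x, heads, w_o)
_, weights = causal_attention_head(x, identity, identity, identity)
```

Because of the mask, the first token can only attend to itself, so its attention weights are `[1, 0, 0, 0]` and its output equals its own value vector.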

Generating Prediction

Input Embeddings + Position
Masked Multi-Head Attention (Processing...)
Add & Norm
Feed Forward Network
Linear + Softmax
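
The Add & Norm and Feed Forward Network stages in the pipeline above can be sketched as below. The page does not say which normalization or activation it uses, so this assumes layer normalization without learned scale or shift and a ReLU activation. The weights and inputs are hypothetical toy values.

```python
import math

def layer_norm(vec, eps=1e-5):
    """Normalize one token vector to zero mean and roughly unit variance."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / math.sqrt(var + eps) for v in vec]

def add_and_norm(x, sublayer_out):
    """Residual connection followed by layer normalization, per token."""
    return [layer_norm([a + b for a, b in zip(x_row, s_row)])
            for x_row, s_row in zip(x, sublayer_out)]

def linear(vec, weight, bias):
    """Compute vec @ weight + bias, with weight given as a list of rows."""
    return [sum(v * w for v, w in zip(vec, col)) + b
            for col, b in zip(zip(*weight), bias)]

def feed_forward(vec, w1, b1, w2, b2):
    """Position-wise feed-forward: linear, ReLU (assumed), linear."""
    hidden = [max(0.0, h) for h in linear(vec, w1, b1)]
    return linear(hidden, w2, b2)

# Toy example: one token with a 3-dimensional embedding.
x = [[1.0, 2.0, 3.0]]
attention_out = [[0.5, 0.5, 0.5]]
normed = add_and_norm(x, attention_out)

w1 = [[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0], [1.0, 1.0, 0.0, 0.0]]
b1 = [0.0, 0.0, 0.0, 0.0]
w2 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
b2 = [0.0, 0.0, 0.0]
ff_out = feed_forward(normed[0], w1, b1, w2, b2)
```

In a full decoder block a second Add & Norm would typically follow the feed-forward output before the final Linear + Softmax produces next-token probabilities.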
Built by NexGen Technologies
© 2025 NexGen Technologies