LayerNorm vs. RMSNorm: Why RMSNorm is Enough
LayerNorm centers and scales, RMSNorm only scales. A geometric explanation shows why centering is dispensable without losing model quality.
02/15/2026
LayerNorm centers and scales, RMSNorm only scales. A geometric explanation shows why centering is dispensable without losing model quality.
transpose() makes tensors non-contiguous (logical ≠ physical order). view() only works with contiguous tensors → RuntimeError. Solution: use reshape() instead of view().
How I fine-tuned a LLAMA 3.1 model on Swabian — data preparation, QLoRA training, and evaluation from the real world.
A beginner-friendly guide to neural networks: from individual neurons to a complete Python model that predicts house prices using NumPy.
An introduction to linear regression: model definition, cost function, and gradient descent, explained with a simple example of students and their study hours.