Vector Normalization May 05, 2026 Generalization Statistics Dataset normalization & its uses, including LayerNorm, Dynamic Tanh, BatchNorm, RMSNorm, L1, & L2 Read more →
L1 Lasso vs L2 Ridge Regression May 01, 2026 Generalization Statistics Effect of the L1 Lasso vs L2 Ridge Constraints on Weight Minimization Read more →
L1 Lasso Regularization & Sparsity April 28, 2026 Generalization Statistics $L(w) = ||y-Xw||^2 + \lambda ||w||_1$ L1 Lasso Regularization & Weight Sparsity Read more →
Bias Variance of L2 Ridge Regression April 24, 2026 Generalization Statistics How does $\lambda$ affect the Bias & Variance of Ridge Regression? Read more →
Decomposition of L2 Ridge Regression April 23, 2026 Generalization Statistics What happens when we increase $\lambda$ in $L(W) = ||y - Wx||^2 + \lambda ||W||^2$? Read more →
L2 Ridge Regression from Lagrange Multipliers April 21, 2026 Generalization Connecting the Ubiquitous Ridge Regression $L(W) = ||Wx-b||^2 + \lambda ||W||^2$ to its Lagrange Multiplier Formulation Read more →
Properties of L2 Loss April 16, 2026 Machine Learning Foundations Generalization Informative Properties of the Ubiquitous L2 Squared Loss Read more →
Bias Variance Tradeoff March 30, 2026 Generalization Bias Variance Decomposition of Squared Loss Read more →
VC Dimension via Pascal's Triangle March 27, 2026 Generalization What's the Maximum Number of Points a Model can Classify? Understanding VC Dimension via Pascal’s Triangle Read more →
Why is Learning Possible? March 21, 2026 Generalization Feasibility of Learning via Hoeffding's Inequality Read more →