Recent Interests: LLMs, Deep Learning, PyTorch
Exploring the geometric intuition behind gradient descent.
A deep dive into sparse Mixture of Experts models and why training a large model and then pruning it actually makes sense.
An introduction to this blog and what you can expect to find here.