Upcoming
1- Likelihood Ratio Trick a.k.a. REINFORCE a.k.a. score function estimator
2- Gumbel-Softmax Estimator
3- Sampling is not differentiable, at least not always.
- Why can't we backprop through sampling? (math)
- Reparametrization trick in VAEs
- Need a simple example
4- Entropy, varentropy, cross-entropy
5- Entropix: Entropy-based sampling in LLMs
6- Q-learning vs Value Function Iteration and PG vs Hotz-Miller CCP
7- Efficient serving of LLMs and implementations
8- Attention types and implementations
9- Traditional ML and implementations
10- Causal Inference and ML