Upcoming

1- Likelihood Ratio Trick a.k.a. REINFORCE a.k.a. score function estimator

2- Gumbel-Softmax Estimator

3- Sampling is not differentiable, at least not always.

4- Entropy, varentropy, cross-entropy

5- Entropix: Entropy-based sampling in LLMs

6- Q-learning vs. Value Function Iteration and PG vs. Hotz-Miller CCP

7- Efficient serving of LLMs and implementations

8- Attention types and implementations

9- Traditional ML and implementations

10- Causal Inference and ML

