Log-Derivative Trick | abstractions

Ok, this one is quick. I am mainly going to use this page as a reference when going over the Likelihood Ratio Trick and why sampling is non-differentiable.

Suppose we want to take the derivative of $\log p_{\theta}(x)$ with respect to $\theta$. Let's go back to high school for a second and apply the chain rule:

$$\nabla_{\theta} \log p_{\theta}(x) = \nabla_{\theta} p_{\theta}(x) \cdot \frac{1}{p_{\theta}(x)}$$

Just rearrange the terms and we are done:

$$\nabla_{\theta} p_{\theta}(x) = p_{\theta}(x) \, \nabla_{\theta} \log p_{\theta}(x)$$

Just remember the last equation. We will use it a lot.
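If you want to convince yourself numerically, here is a small sanity check of the identity. It is only a sketch under assumptions of my own choosing: I take $p_{\theta}(x)$ to be a unit-variance Gaussian with mean $\theta$ (so $\nabla_{\theta} \log p_{\theta}(x) = x - \theta$), and the function names and the test point are made up for illustration. The gradient computed via the trick is compared against a plain central finite difference of $p_{\theta}(x)$.

```python
import math


def p(theta, x):
    """Density of a unit-variance Gaussian with mean theta, evaluated at x.
    (Assumed example distribution, not something fixed by the trick itself.)"""
    return math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2.0 * math.pi)


def grad_log_p(theta, x):
    """Analytic d/dtheta of log p_theta(x) for the Gaussian above."""
    return x - theta


def grad_p_via_trick(theta, x):
    """Right-hand side of the identity: p_theta(x) * grad log p_theta(x)."""
    return p(theta, x) * grad_log_p(theta, x)


def grad_p_finite_diff(theta, x, eps=1e-5):
    """Left-hand side, approximated by a central finite difference in theta."""
    return (p(theta + eps, x) - p(theta - eps, x)) / (2.0 * eps)


# Arbitrary test point, chosen only for illustration.
theta, x = 0.3, 1.2
trick = grad_p_via_trick(theta, x)
numeric = grad_p_finite_diff(theta, x)
print(trick, numeric)  # the two values should agree to many decimal places
```

The same comparison should hold for any density that is differentiable in $\theta$ and non-zero at $x$; the Gaussian is just convenient because its log-gradient has a simple closed form.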