Einsum, Deriving the Gradient for the Backward Pass
Obtaining the gradient of the matrix inverse
Obtaining the gradient of the cross-entropy loss (softmax and negative log-likelihood loss function)
Deriving the gradients for the backward pass for matrix multiplication using tensor calculus
How does tensor parallelism work?
Deriving the gradient for the backward pass for the linear layer using tensor calculus
Obtaining the gradient of the layer normalization layer
Deriving the gradient for the backward pass using tensor calculus and index notation
A quick intro on backpropagation and multivariable calculus for deep learning