Date: August 31, 2025

Topic: Computation Graph

Recall

Computation graphs allow us to make use of hidden vars. These vars represent the intermediate edges in the graph.

When calculating gradients in the graph for the chain rule, we use local gradients which exist at each graph node.

These are the derivatives of the output edge wrt. the input edge.

Notes

Computation Graph

Models functions into its intermediate steps

image.png

Making Use of Intermediate Variables

image.png


In forward-mode, the gradients are passed forward through the network.

In reverse-mode, the gradients start at the output instead and flow back to the input.

Automatic Differentiation (Autodiff)

Forward-mode Autodiff

image.png

Reverse-mode Autodiff

image.png


In Forward Mode, we continuously apply the chain rule on the Next Forward Gradient

This is useful for small inputs β†’ large outputs, as the Jacobians we compute from the small inputs is easier to deal with. The Jacobian can be simultaneously solved.




<aside> πŸ“Œ SUMMARY: Auto-differentiation uses DAGs, where we perform pairwise multiplication on primitives of differentials to calculate gradients.

</aside>