Autograd is PyTorch’s automatic differentiation engine, the machinery that powers neural network training. It computes the gradients of tensors involved in a computation for you, which makes implementing backpropagation for deep learning models straightforward.
To track operations on tensors for gradient computation, set requires_grad=True when creating them.
import torch

x = torch.tensor([2.0, 3.0], requires_grad=True)
y = x * 2
z = y.sum()
print("z:", z)  # z: tensor(10., grad_fn=<SumBackward0>)
Use backward() on a scalar output to compute gradients of all tensors with requires_grad=True.
z.backward()
print("Gradient of x:", x.grad)
- x is a tensor with gradients enabled.
- z = sum(x * 2) creates a computation graph.
- z.backward() computes dz/dx.
- x.grad now holds the gradients: [2.0, 2.0].
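A calling convention worth knowing: backward() with no arguments only works on a scalar output like z above. For a non-scalar output you must pass a gradient tensor of the same shape, which weights the vector-Jacobian product. A minimal sketch (the tensor name a is just for illustration):

a = torch.tensor([2.0, 3.0], requires_grad=True)
y = a * 2                        # non-scalar output
y.backward(torch.ones_like(y))   # equivalent to y.sum().backward()
print(a.grad)                    # tensor([2., 2.])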
Use torch.no_grad() or detach() to temporarily turn off gradient tracking (useful during inference):
with torch.no_grad():
    y = x * 3

# OR
y = x.detach()
Gradients are accumulated by default. Use zero_() to reset gradients between training steps.
x = torch.tensor([2.0, 3.0], requires_grad=True)
y = x * 2
z = y.sum()
z.backward()
print(x.grad)  # First backward: tensor([2., 2.])
# Clear gradients before the next backward pass
x.grad.zero_()
z = (x * 3).sum()
z.backward()
print(x.grad)  # Second backward: tensor([3., 3.])
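To see the accumulation itself, skip the reset: calling backward() twice without zeroing adds the new gradients onto the existing ones. A minimal sketch (the tensor name w is just for illustration):

w = torch.tensor([1.0, 1.0], requires_grad=True)
(w * 2).sum().backward()
print(w.grad)  # tensor([2., 2.])
(w * 2).sum().backward()
print(w.grad)  # tensor([4., 4.]) -- gradients were added, not replaced

This is why a typical training loop calls optimizer.zero_grad() (or .grad.zero_()) before each backward pass.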
Every tensor with requires_grad=True is part of a computation graph that tracks operations so gradients can be computed. For a tensor produced by such an operation, you can inspect the function that created it through its .grad_fn attribute.
print(z.grad_fn)  # e.g. <SumBackward0 object at 0x...>
import torch
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x * x
z = y.mean()
z.backward()
print(x.grad)  # tensor([0.6667, 1.3333, 2.0000])
x = torch.tensor([5.0], requires_grad=True)
y = x * 3
z = y + 4
print(z.grad_fn)  # <AddBackward0 object at 0x...>
print(y.grad_fn)  # <MulBackward0 object at 0x...>
x = torch.tensor(1.0, requires_grad=True)
y = x * 2
z = y * y
z.backward()
print(x.grad)  # tensor(8.)
import torch.nn as nn
x = torch.tensor([0.5, 1.5], requires_grad=True)
target = torch.tensor([1.0, 2.0])
loss_fn = nn.MSELoss()
loss = loss_fn(x, target)
loss.backward()
print(x.grad)  # tensor([-0.5000, -0.5000])
x = torch.tensor([2.0], requires_grad=True)
y = x * 5
with torch.no_grad():
    z = y * 3
print(z.requires_grad)  # False
# Or using detach
z = (y * 3).detach()
print(z.requires_grad) # False
x = torch.tensor([1.0, 2.0], requires_grad=True)
y1 = x * 2
z1 = y1.sum()
z1.backward()
print("Grad after first backward:", x.grad)
x.grad.zero_()
y2 = x * 3
z2 = y2.sum()
z2.backward()
print("Grad after second backward:", x.grad)
from torch.autograd import Function

class MultiplyByTwo(Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        return input * 2

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        return grad_output * 2

x = torch.tensor(3.0, requires_grad=True)
y = MultiplyByTwo.apply(x)
y.backward()
print(x.grad)  # tensor(2.)
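If you want to verify that a custom backward() is correct, torch.autograd.gradcheck compares it against numerically computed gradients; note that it expects double-precision inputs. A minimal sketch:

from torch.autograd import gradcheck

inp = torch.randn(3, dtype=torch.double, requires_grad=True)
print(gradcheck(MultiplyByTwo.apply, (inp,)))  # True when the analytic gradient matches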
PyTorch’s autograd system simplifies gradient computation, enabling efficient model training and backpropagation. By understanding how to control and inspect gradients, you gain deeper insight into how models learn from data.