# Stochastic Gradient Descent With Momentum

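Stochastic gradient descent (SGD) with momentum keeps a running velocity, an exponentially weighted average of past gradients, and uses that velocity rather than the raw mini-batch gradient to update the model's parameters at each iteration. Because the average smooths out noise and oscillation between successive gradients, the optimizer holds a more stable direction and typically converges faster than plain SGD.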
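
To make the update concrete, here is a minimal NumPy sketch of one common way to write it. It is an illustration under stated assumptions, not a reference implementation: the function names `sgd_momentum_step` and `minimize_quadratic`, the hyperparameters `lr` and `beta`, and the noisy quadratic objective are all made up for this example. The sketch uses the exponentially weighted average form, `velocity = beta * velocity + (1 - beta) * grad`; some formulations drop the `(1 - beta)` factor, which rescales the effective learning rate.

```python
import numpy as np


def sgd_momentum_step(params, grad, velocity, lr=0.1, beta=0.9):
    """One momentum update (hypothetical helper for illustration).

    velocity is an exponentially weighted average of past gradients;
    beta controls how much of the previous velocity is retained.
    """
    velocity = beta * velocity + (1.0 - beta) * grad
    params = params - lr * velocity
    return params, velocity


def minimize_quadratic(steps=500, seed=0):
    """Minimize ||params - target||^2 using noisy gradients.

    The added Gaussian noise is an assumed stand-in for the
    randomness of mini-batch sampling.
    """
    rng = np.random.default_rng(seed)
    target = np.array([3.0, -2.0])
    params = np.zeros(2)
    velocity = np.zeros(2)
    for _ in range(steps):
        noise = rng.normal(scale=0.1, size=2)
        grad = 2.0 * (params - target) + noise  # noisy gradient of the quadratic
        params, velocity = sgd_momentum_step(params, grad, velocity)
    return params


if __name__ == "__main__":
    print(minimize_quadratic())
```

A larger `beta` averages over more past gradients, which gives a smoother direction but a slower reaction when the true gradient changes; values around 0.9 are a common starting point.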