In steepest descent methods for minimizing a function $f(x), x \in \mathbb{R}^d$,
it is common to approximate the gradient by finite differences, one coordinate at a time: $\qquad\qquad \nabla f(x) \approx \operatorname{gradest}( x; h ), \qquad \operatorname{gradest}( x; h )_i \equiv { {f( x + h\, e_i ) \ - \ f(x)} \over h } $
then minimize along the (negative) estimated gradient direction:
$\qquad\qquad x_{k+1} = \arg\min_{\lambda} \ f\bigl( x_k - \lambda \ \operatorname{gradest}( x_k; h ) \bigr) $
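In code, the scheme above might look like the following sketch. The helper names (`gradest`, `steepest_descent_step`) and the coarse grid of trial step sizes are my own assumptions, not a standard API; the line search here is just "evaluate $f$ on a log-spaced grid of $\lambda$ and keep the best point".

```python
import numpy as np

def gradest(f, x, h=1e-6):
    """Forward-difference gradient estimate: one extra f-evaluation per coordinate."""
    fx = f(x)
    g = np.empty_like(x, dtype=float)
    for i in range(x.size):
        e = np.zeros_like(x, dtype=float)
        e[i] = h
        g[i] = (f(x + e) - fx) / h
    return g

def steepest_descent_step(f, x, h=1e-6, lambdas=np.logspace(-4, 0, 20)):
    """One steepest-descent step: crude line search along -gradest(x) over a
    fixed grid of step sizes (a stand-in for a proper 1-d minimizer)."""
    g = gradest(f, x, h)
    candidates = [x - lam * g for lam in lambdas]
    return min(candidates, key=f)
```

For a quadratic such as $f(x) = \|x\|^2$, `gradest` returns roughly $2x$ (up to an $O(h)$ bias from the one-sided difference), and each `steepest_descent_step` strictly decreases $f$.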
Can a signal-processing point of view suggest ways of smoothing $\operatorname{gradest}()$ in this context?
For example, one could view the "zig-zags" in the picture above as high frequencies in $\operatorname{gradest}()$, to be smoothed out over the last few $x_k$. But that may be naive: can one model, then smooth, zig-zags through non-uniformly spaced points in 5d or 10d?
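One simple instance of "smoothing over the last few $x_k$" is a one-pole IIR low-pass (an exponential moving average) applied to the sequence of gradient estimates, which is essentially the heavy-ball / momentum trick. This sketch deliberately ignores the harder question raised above, that the $x_k$ are non-uniformly spaced; `alpha` is a hypothetical smoothing weight (smaller `alpha` means heavier smoothing).

```python
import numpy as np

def ema_smooth(grads, alpha=0.3):
    """One-pole IIR low-pass over a sequence of gradient estimates.
    Damps iteration-to-iteration zig-zag; treats iterates as uniformly spaced."""
    g_bar = np.zeros_like(grads[0], dtype=float)
    out = []
    for g in grads:
        g_bar = (1.0 - alpha) * g_bar + alpha * np.asarray(g, dtype=float)
        out.append(g_bar.copy())
    return out
```

On a zig-zag sequence of gradients that alternate in sign, the smoothed output has smaller magnitude than the raw estimates, while a constant gradient passes through unchanged in the limit.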