
  • Relevant to gradient descent, an optimization algorithm

  • For higher-dimensional functions, most of the points with zero gradient (stationary points) are saddle points.

    • e.g., 99.8% in 10 dimensions (see the rough estimate after this list)

    • Viewed from the perspective of maximizing a utility function, the story goes like this:

    • Select the product with the largest utility among those that are not too far removed from the current product.

      • = From the gradient at the point representing the current product, update in the direction of steepest ascent (see the sketch after this list).
    • Repeat this and you will eventually reach a saddle point.

    • By the time you reach the saddle point, utility decreases if you keep going in the direction you have been going.

    • We need to go in a completely different direction.

    • NecessaryExperiments
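
A possible back-of-the-envelope reading of the 99.8% figure (not spelled out in the original note, so treat it as my assumption): if each eigenvalue of the Hessian at a stationary point is independently positive or negative with probability 1/2, then only the rare all-same-sign cases are local maxima or minima, and everything else is a saddle point.

```python
# Heuristic (assumption, not from the original note): each Hessian eigenvalue
# at a stationary point is independently positive or negative with probability
# 1/2. The point is a local max or min only if all n eigenvalues share a sign,
# so the expected fraction of saddle points is 1 - 2 * (1/2)**n.
n = 10
saddle_fraction = 1 - 2 * (1 / 2) ** n
print(f"{saddle_fraction:.1%}")  # 99.8% for n = 10, matching the figure above
```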
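A minimal sketch of the story above, using a toy two-dimensional utility u(x, y) = -x² + y² that I made up for illustration (it is not in the original note): steepest-ascent updates converge to the saddle point at the origin, where continuing in the direction travelled so far lowers utility, and only a completely different direction keeps improving it.

```python
import numpy as np

# Hypothetical utility with a saddle point at the origin:
# along the x axis the origin looks like a maximum, along the y axis a minimum.
def utility(p):
    x, y = p
    return -x**2 + y**2

def gradient(p):
    x, y = p
    return np.array([-2 * x, 2 * y])

# "Select the nearby product with the largest utility"
# = steepest-ascent update: p <- p + eta * grad u(p).
p = np.array([2.0, 0.0])   # start exactly on the x axis
eta = 0.1
for _ in range(50):
    p = p + eta * gradient(p)

print(p, utility(p))       # converges to (0, 0): the saddle point
print(gradient(p))         # gradient is ~0, so the update stalls here

# Continuing in the direction travelled so far (-x) now *decreases* utility;
# a completely different direction (+y) is what keeps utility increasing.
print(utility(p + np.array([-0.5, 0.0])))  # lower utility
print(utility(p + np.array([0.0, 0.5])))   # higher utility
```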


This page is auto-translated from /nishio/鞍点 using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.