- Humans have difficulty imagining beyond 2-4 dimensions.
- Many unexpected phenomena appear as the number of dimensions grows.
- In higher-dimensional space, almost all points are far from the center
  - In one dimension, the set of points within distance 1 of the origin is half the size of the set within distance 2
  - 1/4 in 2 dimensions
  - 1/8 in 3 dimensions
  - …and the “percentage of close points” decreases exponentially as the dimension increases (a quick check follows this list)
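A minimal sketch of that ratio, assuming points drawn uniformly from a ball of radius 2 around the origin (the radii and the helper name are illustrative choices, not from the original note): the fraction falling within radius 1 is exactly (1/2)^d.

```python
import numpy as np

rng = np.random.default_rng(0)

def frac_near_origin(d, n=1_000_000, R=2.0, r=1.0):
    """Fraction of points uniform in the radius-R d-ball that lie
    within radius r of the origin; the exact value is (r/R)**d."""
    # For a uniform point in a d-ball, the radius is distributed as
    # R * U**(1/d) with U uniform on [0, 1]; directions don't matter here.
    rho = R * rng.random(n) ** (1.0 / d)
    return (rho <= r).mean()

for d in (1, 2, 3, 10):
    print(f"d={d:2d}  simulated={frac_near_origin(d):.5f}  exact={0.5 ** d:.5f}")
```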
- The number of samples required to cover the space increases exponentially with the dimension
  - In machine learning, adding feature dimensions can therefore degrade accuracy
  - Because the harm from the now-insufficient sample size overwhelms the accuracy gain from the additional dimension (the grid-coverage sketch below illustrates this)
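One way to see the exponential sample requirement (a toy illustration; the 10 bins per axis and the sample budget are arbitrary choices): with a fixed number of samples, the fraction of grid cells that receive even a single sample collapses as the dimension grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def occupied_cell_fraction(d, n=100_000, bins=10):
    """With n uniform samples in [0,1)**d and `bins` bins per axis,
    return the fraction of the bins**d grid cells containing any sample."""
    cells = (rng.random((n, d)) * bins).astype(np.int64)
    keys = np.zeros(n, dtype=np.int64)
    for j in range(d):                  # encode the d bin indices as one key
        keys = keys * bins + cells[:, j]
    return len(np.unique(keys)) / bins ** d

for d in (1, 2, 3, 6):
    print(f"d={d}  fraction of cells hit: {occupied_cell_fraction(d):.4f}")
```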
- Chi-square distribution - Wikipedia
- For 3 or more dimensions, the mode of the vector’s squared length is non-zero (the squared length of a standard normal vector follows a chi-square distribution with d degrees of freedom, whose mode is max(d − 2, 0))
  - even though each coordinate follows a standard normal distribution, whose mode is 0
  - This is related to “most points are far from the center” (checked numerically in the sketch below)
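A quick numerical check via a histogram-based mode estimate (the bin and sample counts are arbitrary): although every coordinate peaks at 0, the squared length peaks near d − 2.

```python
import numpy as np

rng = np.random.default_rng(0)

def mode_of_squared_length(d, n=200_000, bins=200):
    """Histogram estimate of the mode of ||x||**2 when the coordinates
    of x are i.i.d. standard normal (chi-square theory: max(d - 2, 0))."""
    sq = (rng.standard_normal((n, d)) ** 2).sum(axis=1)
    counts, edges = np.histogram(sq, bins=bins)
    i = counts.argmax()
    return 0.5 * (edges[i] + edges[i + 1])

for d in (1, 2, 3, 10, 100):
    print(f"d={d:3d}  empirical mode ~ {mode_of_squared_length(d):6.2f}  "
          f"theory = {max(d - 2, 0)}")
```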
- Sampling 1,000,000 random vectors and measuring the percentage with cosine similarity greater than 1/2 (a sketch of this experiment follows the list):
  - 0.06 (about 1/17) in 10 dimensions
  - 0.01 (about 1/100) in 20 dimensions
  - 0.0021 (about 1/480) in 30 dimensions
  - 0.00042 (about 1/2400) in 40 dimensions
  - In 100 dimensions, none of the 10,000,000 sampled points had a cosine similarity greater than 1/2
  - Of course, in two dimensions it is 33%
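A minimal sketch of this experiment, assuming cosine similarity is measured against a fixed direction (by spherical symmetry this has the same distribution as the cosine between two independent Gaussian vectors):

```python
import numpy as np

rng = np.random.default_rng(0)

def frac_cos_above(d, thresh=0.5, n=1_000_000):
    """Fraction of random Gaussian vectors whose cosine similarity with
    a fixed axis exceeds `thresh`; by spherical symmetry this matches
    the cosine distribution for pairs of independent random vectors."""
    x = rng.standard_normal((n, d))
    cos = x[:, 0] / np.linalg.norm(x, axis=1)  # cosine against the e1 axis
    return (cos > thresh).mean()

for d in (2, 10, 20, 30, 40):
    print(f"d={d:2d}  P(cos > 1/2) ~ {frac_cos_above(d):.5f}")
```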
- A cosine similarity of 0.2 is extremely rare in higher dimensions
- Related
- If you take two random vectors in a high-dimensional space, the probability that they point in nearly the same direction is very small compared to the probability that they are nearly orthogonal (the sketch below checks this concentration around orthogonality)
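A brief check of that concentration (the sample sizes are arbitrary; the exact variance 1/d follows from symmetry, since E[x₁²/‖x‖²] = 1/d):

```python
import numpy as np

rng = np.random.default_rng(0)

# The cosine similarity of random Gaussian vectors concentrates around 0
# (orthogonality); its standard deviation is exactly 1/sqrt(d).
for d in (10, 100, 1000):
    x = rng.standard_normal((10_000, d))
    cos = x[:, 0] / np.linalg.norm(x, axis=1)
    print(f"d={d:4d}  std of cos ~ {cos.std():.4f}  1/sqrt(d) = {d ** -0.5:.4f}")
```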
- As the number of dimensions (number of evaluation axes) increases, the probability that one person’s skills completely dominate another’s decreases (simulated in the sketch after this list)
  - 100% in 1D, 50% in 2D, 25% in 3D
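A sketch of the pattern above, assuming the two skill vectors have independent continuous coordinates, so each axis favors either side with probability 1/2 and total dominance either way has probability 2 · (1/2)^d:

```python
import numpy as np

rng = np.random.default_rng(0)

def p_total_dominance(d, n=1_000_000):
    """Probability that one of two random skill vectors beats the other
    on every axis (i.i.d. continuous coordinates; theory: 2 * 0.5**d)."""
    a = rng.random((n, d))
    b = rng.random((n, d))
    wins = (a > b).sum(axis=1)              # axes on which a beats b
    return ((wins == d) | (wins == 0)).mean()

for d in (1, 2, 3, 10):
    print(f"d={d:2d}  simulated={p_total_dominance(d):.4f}  theory={2 * 0.5 ** d:.4f}")
```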
- Almost every stationary point is a saddle point
  - 99.8% in 10 dimensions
- Cases where one side is larger than the other on every single axis are rare (see the sign model below)
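The 99.8% figure matches a simple sign model (a simplifying assumption, not a statement about real loss surfaces): if each of the d Hessian eigenvalues at a stationary point is independently positive or negative with probability 1/2, the point fails to be a saddle only when all d signs agree.

```python
# Sign-coin model for stationary points: each of the d Hessian eigenvalues
# is independently positive or negative with probability 1/2 (a simplifying
# assumption). The point is a saddle unless all signs agree.
for d in (2, 3, 10, 100):
    p_saddle = 1 - 2 * 0.5 ** d
    print(f"d={d:3d}  saddle fraction = {p_saddle:.6f}")
```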
This page is auto-translated from /nishio/次元の呪い using DeepL. If you see something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I’m very happy to spread my thoughts to non-Japanese readers.