  • Embed each page as a vector.
  • Cluster the vectors into 2 to 5 groups with the k-means method.
    • Select the K for which the silhouette coefficient is best.
    • Reference: Polis: Scaling Deliberation by Mapping High Dimensional Opinion Spaces

Not bad, is it?
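
The embed, cluster, and pick-K procedure described above can be sketched roughly as follows. This is only a minimal illustration, not the code actually used for this page: the toy data stands in for real page embeddings, the farthest-point initialisation is a simplification chosen to keep the sketch deterministic, and the names `kmeans`, `silhouette`, and `choose_k` are hypothetical. In practice a library such as scikit-learn (`KMeans`, `silhouette_score`) would probably do the same job.

```python
import numpy as np


def kmeans(X, k, n_iter=100):
    """Plain Lloyd's k-means; returns (labels, centers)."""
    # Assumption for this sketch: deterministic farthest-point initialisation.
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its points (keep it if the cluster is empty).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


def silhouette(X, labels):
    """Mean silhouette coefficient; assumes at least two non-empty clusters."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    ids = np.unique(labels)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # convention: a point in a singleton cluster scores 0
        a = dist[i, own].sum() / (n_own - 1)  # the self-distance is 0
        b = min(dist[i, labels == c].mean() for c in ids if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores.mean()


def choose_k(X, k_range=range(2, 6)):
    """Try K = 2..5 and keep the clustering with the highest silhouette."""
    best = None
    for k in k_range:
        labels, _ = kmeans(X, k)
        score = silhouette(X, labels)
        if best is None or score > best[1]:
            best = (k, score, labels)
    return best


# Toy data (made up): three well-separated blobs in 8 dimensions.
rng = np.random.default_rng(0)
blob_centers = 10.0 * np.eye(8)[:3]
X = np.vstack([c + 0.5 * rng.normal(size=(20, 8)) for c in blob_centers])

best_k, best_score, labels = choose_k(X)
print(best_k, round(float(best_score), 2))
```

On data like this, where the groups are clearly separated, the silhouette coefficient should peak at the true number of groups. Real link lists will be much less clean, so the chosen K is better treated as a suggestion.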

I would also like to output a hint about what each group should be called.

  • In addition to the page itself, embed the page title as a vector too.
    • Maybe the link strings used inside the pages could be embedded as well.
  • The string closest to the representative point of each cluster then becomes the “string representing the cluster”.
    • Maybe it could show the top 10 or so.
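
The naming idea above could look something like the following sketch. It assumes the titles (or link strings) have already been embedded in the same vector space as the pages, and it uses Euclidean distance to the cluster centroid as "closest", which is only one possible choice. The function name `label_clusters` and the 2-D toy vectors are made up for illustration.

```python
import numpy as np


def label_clusters(centroids, candidate_vecs, candidate_texts, top_n=10):
    """For each centroid, return the top_n candidate strings nearest to it."""
    # Distance from every centroid to every candidate vector.
    dists = np.linalg.norm(
        centroids[:, None, :] - candidate_vecs[None, :, :], axis=2
    )
    # Indices of the nearest candidates, closest first.
    order = np.argsort(dists, axis=1)[:, :top_n]
    return [[candidate_texts[j] for j in row] for row in order]


# Toy 2-D "embeddings" standing in for real title embeddings (made-up values).
titles = ["k-means", "silhouette", "Scrapbox links", "page titles"]
title_vecs = np.array([[1.0, 0.05], [0.8, 0.0], [0.0, 0.95], [0.2, 0.9]])
centroids = np.array([[1.0, 0.0], [0.0, 1.0]])

names = label_clusters(centroids, title_vecs, titles, top_n=2)
print(names)
# → [['k-means', 'silhouette'], ['Scrapbox links', 'page titles']]
```

Returning a short ranked list rather than a single string leaves room for a human to pick the best name for each cluster.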

This page is auto-translated from /nishio/ベクトル埋め込みを使った大きすぎるリンクの分割 using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I’m very happy to spread my thoughts to non-Japanese readers.