image This is the result of embedding the correspondence between hiragana and katakana in 2 dimensions using t-SNE. Each character is described by 6 dimensions: hiragana/katakana, row (gyō), column (dan), voiced (dakuten), semi-voiced (handakuten), and small kana.
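As a rough illustration of what such a 6-dimensional description could look like, here is a minimal sketch. The function name `kana_features`, the use of plain integer indices for row and column, and the handling of small kana through a lookup table are all assumptions for illustration; the note does not say how the original vectors were built. ん and the rare ヵ/ヶ are not covered by this sketch.

```python
import unicodedata

# Gojuon table: one string per row (gyo); the position inside the string
# is the column (dan: a, i, u, e, o). \u3000 marks gaps in the ya and wa rows.
ROWS = ["あいうえお", "かきくけこ", "さしすせそ", "たちつてと", "なにぬねの",
        "はひふへほ", "まみむめも", "や\u3000ゆ\u3000よ", "らりるれろ",
        "わゐ\u3000ゑを"]

# Small hiragana and their full-size counterparts.
SMALL = dict(zip("ぁぃぅぇぉっゃゅょゎ", "あいうえおつやゆよわ"))


def kana_features(ch):
    """Return [is_katakana, row, dan, voiced, semi_voiced, small] (hypothetical layout)."""
    is_katakana = int("ァ" <= ch <= "ヶ")
    if is_katakana:
        # The katakana block sits 0x60 code points above the hiragana block.
        ch = chr(ord(ch) - 0x60)
    small = int(ch in SMALL)
    ch = SMALL.get(ch, ch)
    # NFD splits a voiced kana into base kana + combining (han)dakuten.
    decomposed = unicodedata.normalize("NFD", ch)
    base = decomposed[0]
    voiced = int("\u3099" in decomposed)
    semi_voiced = int("\u309a" in decomposed)
    for row, kana in enumerate(ROWS):
        dan = kana.find(base)
        if dan >= 0:
            return [is_katakana, row, dan, voiced, semi_voiced, small]
    raise ValueError(f"unsupported character: {ch!r}")


print(kana_features("か"))  # [0, 1, 0, 0, 0, 0]
print(kana_features("ガ"))  # [1, 1, 0, 1, 0, 0]
print(kana_features("ぱ"))  # [0, 5, 0, 0, 1, 0]
print(kana_features("ッ"))  # [1, 3, 2, 0, 0, 1]
```

With this layout, a hiragana and its katakana counterpart differ only in the first component, which is the correspondence the embedding is meant to show.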

The strange bending is simply an artifact of t-SNE; this becomes obvious when the same data is reduced linearly with PCA. image
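To see why PCA cannot produce that kind of bending, here is a minimal sketch of PCA as a linear projection, written with NumPy's SVD. The toy data (a regular grid of hypothetical feature vectors with the voiced, semi-voiced, and small flags set to zero) and the helper name `pca_2d` are assumptions for illustration, not the data behind the images.

```python
import numpy as np


def pca_2d(X):
    """Project rows of X onto their top two principal components."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)            # center the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T               # linear map to 2 dimensions


# Toy grid: [is_katakana, row, dan, voiced, semi_voiced, small]
X = [[k, r, d, 0, 0, 0] for k in (0, 1) for r in range(10) for d in range(5)]
Y = pca_2d(X)

# Reshape to (kind, row, dan, xy) and look at the step from one dan to the next.
G = Y.reshape(2, 10, 5, 2)
steps = np.diff(G, axis=2)
print(np.allclose(steps, steps[0, 0, 0]))  # True: every step is the same vector
```

Because the projection is linear, equal steps in the input stay equal steps in the output, so a grid remains a straight grid. t-SNE only tries to preserve local neighborhoods, so it is free to bend the same structure.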

The next day, 2018-12-01, the representation grew to 10 dimensions, because rows and columns were given a rotational encoding and the long vowel mark (ー) was added.
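One common reading of "rotational encoding" is to place an index on a circle and use its cosine and sine as two coordinates, so that the last position ends up next to the first. The sketch below assumes that reading; the function name `rotational_encode` and the choice of period are hypothetical, and the note does not specify the exact formula that was used.

```python
import math


def rotational_encode(index, period):
    """Map an index in 0..period-1 to a point on the unit circle (assumed scheme)."""
    angle = 2 * math.pi * index / period
    return (math.cos(angle), math.sin(angle))


# The five columns (a, i, u, e, o) placed on a circle.
points = [rotational_encode(i, 5) for i in range(5)]

# The last column is as close to the first as the second one is.
print(math.isclose(math.dist(points[0], points[4]),
                   math.dist(points[0], points[1])))  # True
```

With a plain integer index, "o" (4) would be four steps away from "a" (0); on the circle the two are neighbors, at the cost of one extra dimension per encoded index.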


This page is auto-translated from /nishio/ひらがなとカタカナの埋め込み using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I’m very happy to spread my thoughts to non-Japanese readers.