Creating a character-based language model is useful in many ways. I wonder if anyone has published the models they made, because it's a pain to build them yourself…
- How to create a character-based language model - Ahogrammer
- Uses the Wikipedia corpus (text8)
- A 50-dimensional character embedding followed by a 75-dimensional LSTM.
- Used in ELMo and others: ELMo to acquire context-sensitive word representations - Technical Hedgehog
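The linked article describes a neural model (a 50-dimensional character embedding feeding a 75-dimensional LSTM). As a dependency-free illustration of the character-based idea only, here is a much simpler stand-in: a count-based character bigram language model with Laplace smoothing. This is not the article's LSTM architecture, just a minimal sketch of what "language model over characters" means.

```python
from collections import Counter, defaultdict
import math

class CharBigramLM:
    """Count-based character bigram language model.

    A simple stand-in for the embedding + LSTM model in the article;
    the character-level idea (predict the next character given context)
    is the same, only the context here is a single previous character.
    """
    def __init__(self, smoothing=1.0):
        self.smoothing = smoothing
        self.counts = defaultdict(Counter)  # counts[prev][cur]
        self.vocab = set()

    def fit(self, text):
        text = "^" + text  # '^' marks start of sequence
        self.vocab.update(text)
        for prev, cur in zip(text, text[1:]):
            self.counts[prev][cur] += 1
        return self

    def prob(self, prev, cur):
        # Laplace-smoothed P(cur | prev)
        c = self.counts[prev]
        total = sum(c.values()) + self.smoothing * len(self.vocab)
        return (c[cur] + self.smoothing) / total

    def logprob(self, text):
        # Log-probability of a whole string under the model
        text = "^" + text
        return sum(math.log(self.prob(p, c)) for p, c in zip(text, text[1:]))

lm = CharBigramLM().fit("the cat sat on the mat")
print(lm.prob("t", "h") > lm.prob("t", "z"))  # 'th' was seen in training, 'tz' was not
```

A real character-based model would replace the count table with learned embeddings and an LSTM trained on a corpus like text8, but the interface (probability of the next character given context) is the same.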
This page is auto-translated from /nishio/文字ベース言語モデル using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.