- Short extracts from books
- Model of what might be appropriate as a boundary
- This could simply have been "after the punctuation mark," with no need for machine learning.
- Model of whether a candidate is appropriate as an extraction target
- Less than a certain length
- No mismatch in the number of paired symbols (e.g., parentheses)
- That’s what’s missing.
- Sometimes a demonstrative or conjunction is added at the beginning of a sentence to refer to the previous sentence.
- I can use pattern matching to remove the ones attached to the beginning.
- Demonstratives in the middle of a sentence are a little harder to handle.
- It is also difficult to detect cases where no meaning is left once the word is removed.
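The rules above can be sketched without any machine learning. The following is only a minimal illustration under my own assumptions: the length limit, the set of paired symbols, the list of leading words, and all function names are hypothetical and do not come from the note.

```python
import re

# Assumed values for illustration; the note does not specify them.
MAX_LENGTH = 80
PAIRS = {"(": ")", "[": "]", "{": "}", "「": "」", "（": "）"}
# Hypothetical list of sentence-initial connectives to strip.
LEADING = re.compile(r"^(?:However|But|And|So|Therefore|Also)\b[,\s]*", re.IGNORECASE)


def split_candidates(text):
    """Treat every position after a sentence-ending punctuation mark as a boundary."""
    parts = re.split(r"(?<=[.!?。！？])\s*", text)  # needs Python 3.7+
    return [p.strip() for p in parts if p.strip()]


def is_balanced(sentence):
    """Return True if every opening symbol has a matching closing symbol."""
    stack = []
    closers = set(PAIRS.values())
    for ch in sentence:
        if ch in PAIRS:
            stack.append(PAIRS[ch])
        elif ch in closers:
            if not stack or stack.pop() != ch:
                return False
    return not stack


def strip_leading_connective(sentence):
    """Remove a connective attached to the beginning, then re-capitalize."""
    stripped = LEADING.sub("", sentence, count=1)
    if stripped != sentence:
        stripped = stripped[:1].upper() + stripped[1:]
    return stripped


def extract_short_sentences(text, max_length=MAX_LENGTH):
    """Keep candidates that are short enough and have balanced paired symbols."""
    results = []
    for candidate in split_candidates(text):
        candidate = strip_leading_connective(candidate)
        if candidate and len(candidate) < max_length and is_balanced(candidate):
            results.append(candidate)
    return results


sample = "However, the model is small. It fails (sometimes. This one is fine!"
print(extract_short_sentences(sample))
```

This sketch only handles words attached to the beginning of a sentence. It does nothing about demonstratives inside a sentence, and it cannot tell whether a sentence still means anything after the leading word is removed.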
This page is auto-translated from /nishio/短文切り出しモデル using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.