Short extracts from books
  • Model of what is appropriate as a boundary
    • A simple rule, “after the punctuation mark”, would have been enough here; machine learning was not needed.

  • Model of whether a sentence is appropriate as an extraction target
    • Shorter than a certain length
    • No mismatch in the count of paired symbols (e.g., parentheses)
  • What is still missing
    • Sometimes a demonstrative or conjunction is added at the beginning of a sentence to refer to the previous sentence.
    • Pattern matching can remove the ones attached to the beginning.
    • Demonstratives in the middle of a sentence are harder to handle.
    • It is also difficult to detect cases where removing the word leaves the sentence with no meaning.
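
The rules listed above can be sketched roughly as follows. This is only a minimal illustration, not the actual implementation: the threshold `MAX_LENGTH`, the bracket table `PAIRS`, the connective list inside `LEADING`, and all function names are assumptions made up for this example, and the examples use English text for readability.

```python
import re

# Assumed threshold; the note only says "shorter than a certain length".
MAX_LENGTH = 80

# Assumed table of paired symbols: closing symbol -> opening symbol.
PAIRS = {")": "(", "]": "[", "）": "（", "」": "「"}

# Hypothetical list of sentence-initial connectives to strip.
LEADING = re.compile(r"^(?:However|Therefore|But|And|So)\b[,\s]*", re.IGNORECASE)


def split_after_punctuation(text):
    """Cut the text right after each run of sentence-ending punctuation."""
    parts = re.findall(r"[^.!?。！？]+[.!?。！？]*", text)
    return [p.strip() for p in parts if p.strip()]


def is_balanced(sentence):
    """Return True if every paired symbol is opened and closed in order."""
    stack = []
    for ch in sentence:
        if ch in PAIRS.values():
            stack.append(ch)
        elif ch in PAIRS:
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    return not stack


def is_suitable(sentence):
    """Apply the two checks from the note: length and paired symbols."""
    return len(sentence) < MAX_LENGTH and is_balanced(sentence)


def strip_leading_connective(sentence):
    """Remove one connective attached to the beginning, then re-capitalize."""
    stripped = LEADING.sub("", sentence, count=1)
    return stripped[:1].upper() + stripped[1:]


def extract(text):
    """Split, filter, and clean candidate short sentences."""
    return [
        strip_leading_connective(s)
        for s in split_after_punctuation(text)
        if is_suitable(s)
    ]


if __name__ == "__main__":
    sample = "Notes help (a lot. However, they fade). But short ones last."
    print(extract(sample))
```

In this sketch the first two fragments of the sample are rejected because the parenthesis is split across the boundary, and only the last one survives with its leading "But" removed. Demonstratives in the middle of a sentence, and sentences that lose their meaning once the leading word is removed, are not handled here, which matches the open problems noted in the list.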

This page is auto-translated from /nishio/短文切り出しモデル using DeepL. If something looks interesting but the auto-translated English is not good enough to understand, feel free to let me know at @nishio_en. I’m very happy to spread my thoughts to non-Japanese readers.