• The model is that the chatbot does not generate conversation itself, but simply quotes a passage from a book or other source.

2018-09-30

  • I was trying to do this with machine learning.
  • Changed the rule-based splitting to separate on brackets, curly braces, punctuation, etc.
  • After all, the most frequent bad pattern is that "broken sentences" are produced, so I just split on punctuation marks.
  • If anything, it would make more sense to give seq2seq this "one sentence" as input and have it output the sentence with the indicator words etc. shaved off.
    • As the initial behavior, pass the input through unchanged.
    • Word-level input; output an empty string for a word that should be deleted.
    • Or rather, wouldn't a per-word binary classification of "keep as-is or delete" be better? (A sketch of this framing follows this list.)
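
Below is a minimal sketch of that per-word keep/delete framing, assuming PyTorch; the class name, vocabulary size, and hyperparameters are all hypothetical, not from the original note.

```python
# A per-word binary tagger: label each word keep (1) or delete (0).
import torch
import torch.nn as nn

class KeepDeleteTagger(nn.Module):
    """BiLSTM that emits a keep/delete decision for every input word."""
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden_dim, 2)  # logits for {delete, keep}

    def forward(self, token_ids):           # token_ids: (batch, seq_len)
        h, _ = self.lstm(self.embed(token_ids))
        return self.out(h)                  # (batch, seq_len, 2)

# Words tagged 0 are replaced with the empty string, which reproduces
# the "output an empty string for deleted words" behavior without generation.
model = KeepDeleteTagger(vocab_size=10000)
tokens = torch.randint(0, 10000, (1, 12))   # a dummy 12-word sentence
keep_mask = model(tokens).argmax(dim=-1)    # (1, 12) tensor of 0/1
```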

  • Model of the quotation itself
  • Model of the quotation start point
  • Model of the quotation end point
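
One hedged reading of this three-way decomposition (my interpretation; the note does not spell it out): score a candidate quotation text[i:j] as the product of the three models' probability estimates,

  P(quote) ≈ P_start(i) · P_end(j) · P_quality(text[i:j])

which matches the "multiply them together" idea at the end of this page.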

Feature values

  • Contains a key phrase: 1/0

  • Falls on a word boundary: 1/0

  • Is of moderate length (but what is the definition of "moderate"?)

  • Punctuation immediately precedes it: 1/0
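
As a sketch only, the four features above could be computed at a candidate boundary position as follows. The key-phrase list, the punctuation set, and the "moderate" length bounds are placeholder assumptions (the note itself leaves "moderate" undefined), and the word-boundary check is a crude proxy; Japanese text would need a proper morphological segmenter.

```python
# Placeholder binary features for a candidate quotation boundary.
KEY_PHRASES = ("for example", "in other words")   # hypothetical list
PUNCT = set(".,!?;:。、！？「」()")

def boundary_features(text: str, i: int, span: str) -> dict:
    """Binary features at boundary offset i of text, for candidate span."""
    return {
        "contains_key_phrase": int(any(p in span for p in KEY_PHRASES)),
        # Crude word-boundary proxy; use a segmenter for Japanese.
        "at_word_boundary": int(i == 0 or not text[i - 1].isalnum()),
        # "Moderate" is undefined in the note; 10-100 chars is a placeholder.
        "moderate_length": int(10 <= len(span) <= 100),
        "punct_immediately_before": int(i > 0 and text[i - 1] in PUNCT),
    }
```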

  • The model for the start and end points should be something that can be computed using only local features.

    • Context of a few surrounding words
  • Whether the quotation has an appropriate length can be handled by the model of the quotation itself

  • The model of the goodness of the quotation itself can be built with an LSTM.

  • We can treat these as probability models and multiply them together (see the sketch below).
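
A minimal sketch of that combination, assuming p_start and p_end are classifiers over the local features listed above and p_quality is the LSTM goodness model; all three are stand-in callables here, not implementations from the note.

```python
import math

def score_candidate(p_start, p_end, p_quality, text, i, j):
    """Log-probability score for quoting text[i:j]; higher is better."""
    return (math.log(p_start(text, i))         # P(good start at i)
            + math.log(p_end(text, j))         # P(good end at j)
            + math.log(p_quality(text[i:j])))  # P(span is a good quotation)

# Usage: enumerate candidate (i, j) pairs and keep the argmax, e.g.
# best = max(((i, j) for i in range(len(text))
#             for j in range(i + 1, len(text) + 1)),
#            key=lambda ij: score_candidate(p_start, p_end, p_quality,
#                                           text, *ij))
```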


This page is auto-translated from /nishio/直接引用モデル using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.