- [sequence labeling]
- Named entity extraction labels the “beginning of a named entity” and the “range of a named entity”.
- The RAKE stop list generation algorithm for keyphrase extraction counts how often a word occurs “inside” a keyphrase and how often it occurs “next to” one.
- Mapped onto sequence labeling, these would become the labels “keyphrase range” and “keyphrase adjacency”.
- Personally, I think it would be better to split adjacency into “right neighbor” and “left neighbor”.
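To make the two labelings concrete, here is a minimal sketch in Python. The function names, the label strings (`B`, `I`, `L`, `IN`, `R`, `O`), and the example sentence are all hypothetical and chosen only for illustration. Spans are assumed to be half-open token index ranges.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open token range [start, end); an assumption of this sketch


def label_begin_inside(n_tokens: int, spans: List[Span]) -> List[str]:
    """Former scheme: mark the beginning (B) and the rest of the range (I)."""
    labels = ["O"] * n_tokens
    for start, end in spans:
        labels[start] = "B"
        for i in range(start + 1, end):
            labels[i] = "I"
    return labels


def label_with_neighbors(n_tokens: int, spans: List[Span]) -> List[str]:
    """Latter scheme: mark the range (IN) plus its left (L) and right (R) neighbors."""
    labels = ["O"] * n_tokens
    for start, end in spans:
        if start - 1 >= 0:
            labels[start - 1] = "L"
        if end < n_tokens:
            labels[end] = "R"
    # Write the ranges last so a keyphrase token is never overwritten by a neighbor label.
    for start, end in spans:
        for i in range(start, end):
            labels[i] = "IN"
    return labels


tokens = ["we", "use", "sequence", "labeling", "for", "extraction"]
spans = [(2, 4)]  # the hypothetical keyphrase "sequence labeling"
print(list(zip(tokens, label_begin_inside(len(tokens), spans))))
print(list(zip(tokens, label_with_neighbors(len(tokens), spans))))
```

In the second scheme the words just outside the keyphrase carry their own labels, which is the information the RAKE-style “next to a keyphrase” count captures.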
Which labeling is better?
- For keyphrase extraction, I think the latter approach, labeling the surroundings of keywords, is more straightforward.
- What do humans do when they want to state explicitly that “this is a keyword” in text that is otherwise indistinguishable from the surrounding body text?
- For example, they enclose it in brackets.
- Conversely, all else being equal, a span enclosed in brackets is naturally more likely to be a keyphrase.
- With the latter labeling, the opening bracket token naturally corresponds to the label “left of a keyword”.
- Why does named entity extraction often use the former labeling?
- Named entities may be directly adjacent to each other.
- In that case, the approach of labeling the surroundings does not work.
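The problem with adjacent spans can be checked with a small sketch. The decoders and label strings below are hypothetical, written only to illustrate the argument. Two spans that touch each other have no token between them to carry a neighbor label, so a neighbor-style labeling merges them, while a beginning label keeps them apart.

```python
from typing import List, Optional, Tuple

Span = Tuple[int, int]  # half-open token range [start, end); an assumption of this sketch


def decode_begin_inside(labels: List[str]) -> List[Span]:
    """Recover spans from B/I/O labels; every B opens a new span."""
    spans: List[Span] = []
    start: Optional[int] = None
    for i, label in enumerate(labels):
        if label == "B":
            if start is not None:
                spans.append((start, i))  # close the span that was still open
            start = i
        elif label == "O" and start is not None:
            spans.append((start, i))
            start = None
        # label == "I": the open span continues (a stray I with no open span is ignored)
    if start is not None:
        spans.append((start, len(labels)))
    return spans


def decode_neighbors(labels: List[str]) -> List[Span]:
    """Recover spans from L/IN/R/O labels as maximal runs of IN."""
    spans: List[Span] = []
    start: Optional[int] = None
    for i, label in enumerate(labels):
        if label == "IN":
            if start is None:
                start = i
        elif start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(labels)))
    return spans


# Two adjacent spans, tokens 1-2 and tokens 3-4, with nothing in between.
begin_inside_labels = ["O", "B", "I", "B", "I", "O"]
neighbor_labels = ["L", "IN", "IN", "IN", "IN", "R"]

print(decode_begin_inside(begin_inside_labels))  # two spans
print(decode_neighbors(neighbor_labels))         # one merged span
```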
PS
I was comparing 2-1 and 3, but there are more detailed steps.
4 can identify consecutive keywords.
5 can distinguish “words that do not appear at the end of keywords but often appear within keywords”, such as “of”.
This page was auto-translated from /nishio/固有表現抽出とキーフレーズ抽出 using DeepL. If something looks interesting but the auto-translated English is not good enough to understand, feel free to let me know at @nishio_en. I’m very happy to spread my thoughts to non-Japanese readers.