Smoothing the boundaries of a knowledge set
Published on 2024-03-21 18:27, by writing this down.
Pick a person and, instead of taking in all of that person's content, target only the pages that are linked to. Follow Scrapbox-to-Scrapbox links and fetch the link destinations via the API.
- Improvement of /plurality-japanese/vector-search.
On 2024-04-06, I thought, “It would be better to be able to point to when this was properly published, rather than just writing it down and letting it dissipate.”
- A precedent may well turn up if one looks, but for now, making this public makes it easier to use.
- From this fragmentary memo, GPT-4 can generate a description that, once published, lets a person skilled in the art readily arrive at the equivalent.
Title of the Invention: System and Method for Smoothing Boundaries of Knowledge Sets
Summary
- This invention relates to information retrieval and data mining, and in particular to making information collection more efficient by restricting it to linked pages. Conventional search systems generally collect all content related to a particular subject indiscriminately. This approach causes problems: the sheer volume of information makes analysis harder, and less relevant information is acquired along with the relevant. To solve these problems, this invention provides a system and method that uses the links between documents and collects information only from the linked pages.
Background Technology
- In recent years the amount of information has grown exponentially, making it increasingly difficult to aggregate knowledge on a particular subject and extract what is useful. At the same time, information on the Web is interlinked, and these links provide valuable clues to the relevance and importance of information. This invention focuses on this point and proposes a method that collects and analyzes only linked content, gathering more relevant information efficiently while avoiding excessive accumulation of information.
Summary of Invention
- The present invention provides a system and method for efficiently collecting only relevant information based on links formed between specific web pages or documents.
Embodiment
- In one embodiment of the invention, links formed on the Scrapbox platform (hereinafter referred to as “Scrapbox-to-Scrapbox links”) are analyzed and the linked content is automatically retrieved through the API. Through this process, only highly relevant information is selectively collected, solving the problem of information overload.
- The user selects an initial set of pages based on specific keywords or subject matter. The system follows the Scrapbox-to-Scrapbox links from these initial pages and automatically retrieves the linked pages via the API. The retrieved data is then used for further analysis and processing. This system is particularly applicable to improving vector search, where it is expected to improve the relevance and accuracy of search results.
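The collection step in this embodiment can be sketched as a breadth-first walk over links. This is a minimal illustration under stated assumptions, not the author's implementation: the names `scrapbox_links` and `collect_linked_pages` and the `max_depth` parameter are hypothetical, and the `https://scrapbox.io/api/pages/<project>/<title>` endpoint with a `links` field in its JSON response is an assumption that should be checked against the current Scrapbox API.

```python
import json
from collections import deque
from typing import Callable, Dict, Iterable, List
from urllib.parse import quote
from urllib.request import urlopen


def scrapbox_links(project: str, title: str) -> List[str]:
    """Return the titles linked from one page.

    Assumption: the page endpoint returns JSON with a `links` list of titles.
    """
    url = (
        "https://scrapbox.io/api/pages/"
        f"{quote(project, safe='')}/{quote(title, safe='')}"
    )
    with urlopen(url) as resp:
        page = json.load(resp)
    return list(page.get("links", []))


def collect_linked_pages(
    seeds: Iterable[str],
    fetch_links: Callable[[str], Iterable[str]],
    max_depth: int = 1,
) -> Dict[str, int]:
    """Walk links breadth-first from the seed pages.

    Returns {title: depth}, where depth 0 marks a seed page and depth n
    marks a page first reached by following n links.
    """
    seen: Dict[str, int] = {title: 0 for title in seeds}
    queue = deque(seen.items())
    while queue:
        title, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand pages at the depth limit
        for linked in fetch_links(title):
            if linked not in seen:
                seen[linked] = depth + 1
                queue.append((linked, depth + 1))
    return seen
```

Passing the link fetcher in as a function keeps the walk independent of the network, so it can be tried on an in-memory graph first. Against a real project it could be called as `collect_linked_pages(seed_titles, lambda t: scrapbox_links(project_name, t))`, and the knowledge set in the sense of this memo is then the entries with depth 1 or more. The depth limit is a design choice that keeps the collected set bounded.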
Effects of the Invention
- This invention not only streamlines information collection but also improves the quality of what is collected. By targeting only linked pages, it avoids acquiring less relevant information and increases the precision of the data set to be analyzed. This also avoids excessive accumulation of information and reduces the burden of analysis. Furthermore, because it relies on Scrapbox-to-Scrapbox links, it leverages existing link structure and provides a new framework for information collection.
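One way the collected set could feed into vector search is as an allow-list that restricts which pages are candidates at all. The sketch below is only an illustration of that idea. It assumes embeddings have already been computed by some other means, and the function `search_within` and the `{title: vector}` index layout are hypothetical.

```python
import math
from typing import Dict, Iterable, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_within(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    allowed: Iterable[str],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank only the pages in `allowed` by similarity to the query vector."""
    allowed_set = set(allowed)
    scored = [
        (title, cosine(query, vec))
        for title, vec in index.items()
        if title in allowed_set  # pages outside the knowledge set are skipped
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

Filtering before ranking means a page outside the linked set can never appear in the results, however similar its embedding is to the query.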
Continued: The interesting point about this application is that the initial page set includes the CC0 book and the community’s interaction with it, so the knowledge set to be searched is determined emergently by the actions of an unspecified number of people in the community.
This page is auto-translated from /nishio/知識集合の境界のなめらか化 using DeepL. If something looks interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I’m very happy to spread my thoughts to non-Japanese readers.