We ran an experiment to increase the amount of data searched by Extending the Red Link with AI, which has been running publicly on this Scrapbox.

image

  • When a red link is created like this, the AI automatically generates the page it points to

implementation

  • Load from local pickle
    • It takes about 5 seconds to read 100,000 records
  • A local vector search takes about 7 seconds
  • That’s slow if you think of it as a web app response, but with my recent style of “throw in a keyword or page that comes to mind, work on something else, and look at it after a while”, it’s not a problem.
    • About 10-25 minutes pass between submitting a query and coming back to check the results (I haven’t measured it).
    • The update has only run every 10 minutes from the start anyway, and I haven’t seen a problem.
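The load-then-search flow above can be sketched roughly as follows. This is a minimal illustration, not the actual code behind this page: the pickle layout (a list of `(title, vector)` pairs), the function names `load_index`, `cosine`, and `search`, and the choice of cosine similarity are all assumptions. It uses only the standard library to stay self-contained; for 100,000 records a NumPy matrix product would normally be much faster.

```python
import heapq
import math
import pickle
from pathlib import Path


def load_index(path):
    """Load a list of (title, vector) pairs from a local pickle file.

    The file layout is an assumption made for this sketch.
    """
    with Path(path).open("rb") as f:
        return pickle.load(f)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(index, query_vec, top_k=10):
    """Brute-force search: score every record, return the top_k (score, title) pairs."""
    scored = ((cosine(vec, query_vec), title) for title, vec in index)
    return heapq.nlargest(top_k, scored)
```

Because every record is scored on every query, the cost grows linearly with the number of records, which fits the "throw a query and check back later" usage described above.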

impressions

  • Should the search cover “all data” or “everything except this project, /nishio”?
    • In the first experiment I generated with the keyword “Combining knowledge is the source of new ideas, which in turn generates further knowledge and new combinations”, which I had already used in this project. It hit almost nothing but pages from my own project, which was not interesting, so I excluded my project.
      • titles: ["🤖🔁 successful intelligence", "🤖🔁 successful intelligence", "🤖🔁 twist", "Hatena2015-01-07", "🌀Collaboration with AI", "Increase personal productivity first", "🌀successful intelligence", "tkgshn/ Karabiner not working on macOS Monterey"]
        • Maybe the last one is random.
    • Not excluding my own writing might surface unexpected connections, but since I am the one writing the query, it may well be that the results skew toward what I have written.
    • I wonder if people find value in “Letting my data-derived AI live in my wiki” because it gives them a “different perspective”.
    • I’m not sure why I myself see the value of a different perspective when searching my own Scrapbox.
      • Maybe it is the “Is it really A?” raised in response to “It is A”; or maybe the same individual has sufficiently varied viewpoints thanks to more than 10 years of accumulated writing, including Hatena Diary; or maybe I am highly sensitive to differences because I habitually think “[Similarity → What is the difference?]”.
  • Excluding my own project, the results look like this (the list is long, so I trimmed some unnecessary details)
    • titles: ["motoso/discussion", "tkgshn/knowledge and wisdom", "nishio-books/MOT knowledge creation management and innovation Ikujiro Nonaka", "tkgshn/knowledge tightly coupled with business", "mtane0412/suggestion through ignorance", "nishio-books/idea generation method and cooperative work support Jun Munemori", "tkgshn/knowledge", "nishio-llm2023/Study of methods and systems to support knowledge creation process", "blu3mo_filtered/dualism of knowledge and ability", "motoso/wise company", "blu3mo_filtered/accounting systems", "tkgshn/vaccine against false information", "motoso/Think different"]
    • It’s interesting that the sources are pretty disparate.
    • But with this much variety, what the current prompt says becomes less interesting.
      • Notebooks and … fragments are related in that the combination and exchange of knowledge generates new knowledge and ideas.

      • It mentions only three of them, not all, so it would be more interesting if it discussed how they relate to each other individually.
      • I’ve already tried an improved prompt in nues implementation of this, so I’ll port it back in the future.
  • Later I tried other styles, such as having the AI comment on multiple lines of text instead of a single-line title, but the search results themselves are more interesting than the generated text.
    • It’s not that the generated text is uninteresting; the vector search results are simply interesting to look at.
    • If the interestingness of the generated results using only my project’s data is 100, the generated results in this experiment are 90, and the vector search results are about 300.
      • I’m writing a summary of today’s experiment first, holding back the urge to write about each result individually.
      • I think the individual stories would be overly detailed.
        • Well, if you imagine the book part as “a system where, when you talk in front of a bookshelf, the relevant books pop out and open to the related pages,” you can understand how interesting it is.
        • In the case of Scrapbox, the other person is still alive (and the authors of books can be alive too, of course), so a connection the AI finds here could lead to more conversation with that person afterwards.
    • So, the vector search results that were only output to the console are now written to Scrapbox as well.
      • As a natural result, excerpts from books and the like are now written to Scrapbox, so the pages can no longer be published as they are.
      • Well, I had assumed from the start that doing something like this might not be possible in public, so this was within expectations. Eventually, you won’t be able to see it.
    • Whether to include these written-out search results in the AI’s search targets is a difficult question.
      • Effect of increasing number of fragments related to interests
      • Effect of fragments from search results for the same keyword happening to sit next to each other and interfering to create something new
      • Adverse effect of miscellaneous machine-generated data feeding back into the input
      • For now, include them, and build a mechanism to remove them all mechanically if it turns out to be a bad idea.
        • 2023-10-05: I think it’s not good; the mismatch between the title and the content is confusing.
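Both switches discussed above, excluding a project such as /nishio and removing machine-written search results in bulk, could be handled by tagging each record and filtering before the search. The following is a minimal sketch under stated assumptions: each record is a dict with hypothetical `project`, `title`, and `generated` keys, and the function name `filter_records` is made up for illustration. None of this comes from the actual implementation.

```python
def filter_records(records, exclude_projects=(), include_generated=True):
    """Return the records that should remain searchable.

    records: iterable of dicts with a "project" key and an optional
        "generated" flag (hypothetical layout, assumed for this sketch).
    exclude_projects: project names to leave out, e.g. ("nishio",).
    include_generated: set to False to drop every machine-written record at once.
    """
    excluded = set(exclude_projects)
    return [
        r for r in records
        if r["project"] not in excluded
        and (include_generated or not r.get("generated", False))
    ]


# Hypothetical sample data, for illustration only.
records = [
    {"project": "nishio", "title": "page A"},
    {"project": "nishio", "title": "search results for A", "generated": True},
    {"project": "other", "title": "page B"},
]

others_only = filter_records(records, exclude_projects=("nishio",))
human_only = filter_records(records, include_generated=False)
```

Keeping the `generated` flag on the record, rather than deleting machine-written pages outright, leaves the decision reversible: the same data can be searched with or without them.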

summary

  • Just by casually posting to a private project from a smartphone or the like, I can now see the results of a vector search across multiple people’s Scrapbox projects, books, and papers.
  • We will improve it as we use it in the future.
  • It would be interesting to generate pages not only from explicitly triggered queries but also automatically, about once a day, from the recent updates to the project.

This page is auto-translated from /nishio/横断ベクトル検索実験メモ2023-09-20 using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I’m very happy to spread my thoughts to non-Japanese readers.