I want to run vector search over the Scrapbox projects of people who don’t publish a vector index.

  • Not just Scrapbox, but blogs and the like, right?
  • And for that matter, not only digital books but paper books too, right?
  • We need a box that allows us to plug in anything we are interested in and make it vector searchable.
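
A minimal sketch of what such a "box" could look like, assuming a toy bag-of-words vector in place of a real embedding model. The names Box, embed, and cosine, and the source labels, are hypothetical and only for illustration; this is not the script mentioned later on this page.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class Box:
    """Hypothetical 'box': plug in text from any source, then search it."""

    def __init__(self):
        self.items = []  # list of (source, text, vector)

    def add(self, source, text):
        self.items.append((source, text, embed(text)))

    def search(self, query, k=3):
        q = embed(query)
        scored = [(cosine(q, vec), source, text) for source, text, vec in self.items]
        scored.sort(key=lambda t: t[0], reverse=True)
        return scored[:k]


# Made-up sources: a Scrapbox project, a blog, and a paper book.
box = Box()
box.add("scrapbox:someone", "Notes on vector search and embeddings")
box.add("blog:example", "A recipe for miso soup")
box.add("book:paper", "Embeddings map text to vectors for search")
top = box.search("vector search embeddings", k=2)
```

Each item keeps a source label, so results from Scrapbox, blogs, and books can sit in one index and still be told apart. Swapping embed for a real embedding model would leave the rest of the structure unchanged.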

from /villagepump/2023/03/23 from Mapping Scrapbox rows to Kozane in Kozaneba

  • I’d love to vector search Scrapbox for people who haven’t created a vector index.
    • shokai or rashitamemo.
    • I think it could be built on the same principle as making backups on GitHub…
    • Come to think of it, I wish my own project were refreshed once a day.
    • And while we’re at it, it could produce a summary of the updated parts…
    • → Vector search across multiple projects
    • Porting to Python with GPT-4
    • It’s done.
    • I guess we can now take data from anyone’s project…
    • So what did I want to do with it, again? (staring into the distance)
    • I’m tired today, so let’s call it a day.
  • This script can both update your own project automatically every day and fetch someone else’s project, but those are two different tasks to begin with.
    • I don’t need daily updates for anything other than my own projects.
  • It’s convenient to be able to run vector searches and ask questions, but it’s inconvenient to have to sit in front of a computer to do it.
    • The problem is that the index file is too big to deploy somewhere it can be used from a phone.
    • Hmmm, I guess that would call for Pinecone…
  • LLMs can submit search queries.
    • There have actually been experiments that take such queries and search Wikipedia with them.
    • Then, when the LLM reads a Scrapbox page and sees link notation, it can issue the query “Read that link”.
    • A system that utilizes the link structure of Scrapbox
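
The “Read that link” idea above can be sketched as follows, assuming a deliberately simplified reading of Scrapbox’s bracket notation. The regex skips external URLs, decorations such as [* bold], and double brackets, and does not try to cover every form (icons, formulas, and so on). The names extract_links and read_link, and the in-memory pages dict, are hypothetical; a real system would fetch page text from the project instead.

```python
import re

# Simplified pattern for single-bracket notation; [[double brackets]] are skipped.
BRACKET = re.compile(r"(?<!\[)\[([^\[\]]+)\](?!\])")


def extract_links(text):
    """Return bracketed targets that look like page links."""
    links = []
    for body in BRACKET.findall(text):
        if re.match(r"https?://", body) or re.search(r"\shttps?://\S+$", body):
            continue  # external link, with the URL first or last
        if re.match(r"[*/\-_]+\s", body):
            continue  # decoration such as [* bold] or [- strike]
        links.append(body)
    return links


def read_link(pages, title):
    """Hypothetical handler for an LLM's 'Read that link' query."""
    text = pages.get(title)
    if text is None:
        return None  # no such page in this project
    return {"title": title, "text": text, "links": extract_links(text)}


# Made-up pages standing in for a Scrapbox project.
pages = {
    "Vector search": "See [Embedding] and [https://example.com docs].",
    "Embedding": "Maps text to vectors. [* important]",
}
result = read_link(pages, "Vector search")
follow = read_link(pages, result["links"][0])
```

Returning the outgoing links alongside the page text lets the model decide which link to follow next, so it walks the link structure step by step rather than searching blindly.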

This page is auto-translated from /nishio/他人のScrapboxもベクトル検索したい using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I’m very happy to spread my thoughts to non-Japanese readers.