- Cybozu Labs Study Session 2023-10-06
- On 7/29, in the Omoikane Project, an AI started writing a report in Scrapbox once a day.
- I then spent two months porting this to /nishio and improving it in various ways.
- My previous presentation at this study session, "Omoikane Study Group" (Aug. 4), came right after that system started running.
- This talk summarizes what I experienced and felt during those two months.
- Talks so far:
- 2023-03-24 Study session talk about connecting my Scrapbox to ChatGPT: a question-and-answer UI that runs vector search over Scrapbox.
- 2023-08-04 Omoikane Study Group
- Contents
- 1: Current Topics: Azure Cognitive Search
- 2: Vector Search
- 3: Omoikane Embed
- 4: Intelligent collaboration between AI and humans
- 5: Extending the Red Link with AI
- 6: Raw ChatGPT and omni use cases are different.
- 7: Thoughts on using a private OMNI
- 8: Vector search is an opportunity to cut out
- 9: Vector search serves as a tool for cognitive resolution.
1: Current Topics: Azure Cognitive Search
- 9/18 Azure Cognitive Search: Outperforming vector search with hybrid retrieval and ranking capabilities
- src
- The post says that combining both and then reranking (reordering after looking at the search results) is far better than the common keyword-only search or vector-only search.
- It has been known for a while that "it's better to combine and rerank."
- from Combine Searches
- Won 3rd place at AI King, Japan's No. 1 quiz AI competition | PKSHA Delta 2023-01-04
-
- Their system combines DPR and BM25.
- DPR is, roughly speaking, a vector search.
-
Dense Passage Retriever (DPR): dense-vector-based retrieval using pre-trained models
-
- BM25 is, roughly speaking, a "modern TF-IDF" (in the article: "BM25: lexical-match-based").
- The search results of DPR and BM25 overlap very little
- → Combining them gets the best of both.
- Characteristics of vector search:
- Strong at semantic similarity
- Misses low-frequency words / performance degrades significantly outside the training distribution
- Weak on low-frequency words (= weak on proper nouns, technical terms, and product names), so it needs to be combined with ordinary keyword search
- FiD=Fusion In Decoder
- A text generation system that reads 100 search results
- Vector search suits vague queries in natural-language conversation because it returns loosely related hits, but it puts a high cognitive load on users who search for something like "the name of the company I visited on a sales call" or "a product model number", because "different things that look very similar" also hit.
- I had been thinking that groupware needs to solve this problem if it is going to use vector search.
- Azure Cognitive Search
- 7/18 Public Preview begins
- I'm thinking of trying it soon.
- Looking at the configuration, BM25 + HNSW + rerank is the orthodox setup.
- HNSW (Hierarchical Navigable Small World) is an approximate nearest-neighbor search method for vectors, also used in Qdrant and Pinecone.
- What is "search"?
- What is the value you are creating for your customers?
- The "search UI" we provide to users now is not the best way to deliver that customer value.
- Reranker → "a little helper who looks at 200 search results and sorts them into a good order"
- Fusion in Decoder → "a little helper who reads 100 search results and writes text"
- We are in an era in which these are becoming available
- Especially in the reranker, taking into account things other than the "text explicitly entered by the user", such as the operations the user performed in the groupware immediately before, may lead to differentiation.
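The "combine both" step above can be illustrated with a small sketch. It uses Reciprocal Rank Fusion (RRF), one common way to merge a keyword ranking and a vector ranking; whether it matches the exact setup discussed in the talk is an assumption, and it covers only the fusion step, not the reranker that runs afterwards. The document ids and the constant `k=60` are illustrative.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical result lists with little overlap, as in the BM25 vs DPR observation.
bm25_hits = ["doc_model_number", "doc_company_name", "doc_shared"]
vector_hits = ["doc_paraphrase", "doc_shared", "doc_related_topic"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

A document found by both retrievers rises to the top, while documents found by only one of them are still kept, which is the "best of both" effect described above.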
2: Vector Search
- Vector Search in Nishio
- Vector search for this Scrapbox project, published 2023-06-05
- For me, the usefulness of vector search is empirically obvious, but I should probably present examples for those who have not yet experienced it.
-
Case 1
- A story about "social security spending."
- I remembered that in a past lecture I had talked about how large social security spending is.
- The point was its ratio to investment in science and technology.
- I tried to find it, but searching for "social security payments" turned up nothing.
- Searching Vector Search in Nishio for "Social Security Funding Scientific Research" hit it
- The actual wording was "38 trillion for medical care, 24 trillion for welfare and others, and 120 trillion in total."
- "Science and Technology" also appears a little further down.
- I made an excerpt page with "Social Security Expenses" in the title: Comparison of Social Security Expenditures and Science and Technology Expenditures
-
Case 2
- I wanted to recall a story about a hole in a wall. I searched for "wall" and "hole" but could not find it.
- Vector search for "hole in the wall, you have to get close to see it."
- Found as the second candidate.
-
- [The fact that there is a wall ahead is not a reason not to proceed. https://scrapbox.io/nishio/%E9%80%B2%E3%82%80%E5%85%88%E3%81%AB%E5%A3%81%E3%81%8C%E3%81%82%E3%82%8B%E3%81%93%E3%81%A8%E3%81%AF%E9%80%B2%E3%81%BE%E3%81%AA%E3%81%84%E7%90%86%E7%94%B1%E3%81%AB%E3%81%AF%E3%81%AA%E3%82%89%E3%81%AA%E3%81%84]
- The query "you can't see it unless you get closer" matched the text "you could see it if you got closer."
- The nuance is that you cannot see the hole in the wall unless you get close to the wall.
- The hole in the wall was drawn in the picture, but not written in the text.
- I'll add a note.
- Q: The search is not looking at the picture, is it? (Confirmation)
- A: No, it does not look at pictures.
-
Case 3
- Vector search for "unfeasible ideas seem original".
- The one I was looking for was the fifth hit.
-
A layman's idea may appear original in an oral presentation, but it is not feasible.
-
This paper compares amateurs who have never used Mindstorms with robot contest winners and runners-up (hereafter referred to as "experts") in the task "Build a robot that makes creative progress with Lego Mindstorms"... There was no difference in originality between the experts and the amateurs... However, most of the amateurs failed to realize their ideas.
- A keyword search did not hit because of wording variation: "original" vs. "originality", and "unfeasible" vs. its verb-form phrasing.
- When we specify a search target with keywords, we tend to use noun-form keywords such as "unfeasible", but in the actual text the idea is sometimes unpacked into a verb phrase.
-
- The candidates ranked above this one also contain a lot of highly relevant content: result
- The page about this experiment is episodic, which is probably why it was memorable and easy to recall.
- I created a new page titled [Unfeasible ideas seem original].
-
Case 4
- Vector search for "kintone is a conversation around data"
- No. 1:
-
I think the ideal situation would be to have the same collaborative editing capability for database schema and customization JavaScript.
-
- No. 4:
-
Reforming Municipal Operations with Digital Solutions for Public Utilities
-
Both municipalities used Cybozu's "kintone" as a digital solution. kintone allows users to build business applications without programming knowledge, as long as they have a working knowledge of Excel. It allows you to integrate email, Excel, and disparate information and create a unique business environment with your own ingenuity.
-
I see. I had been wondering how to explain kintone, and this explains it as "integrating email and Excel".
- I like the phrase "integrating email and Excel."
Case 5
- "Bias to measure only what is easy to measure."
-
While cost is easy to measure numerically, quality is harder to measure and therefore easier to underestimate.
-
Q: With vector search, will everyone become good at searching?
- A: If you know the keywords, keyword search is better. Vector search has its strengths, but it cannot read your mind, so the searcher still needs the linguistic ability to put things into words.
-
This kind of story feels magical when you hear only (or mostly) the examples that went well, but you have to be careful (a reminder to myself), because there are often just as many (or overwhelmingly more) cases where it did not go well.
- Of course. The service is open to the public, so try it yourself: Vector Search in Nishio.
-
Looking Back
- It is now possible to search by "this is what I meant" instead of "I must have written this exact string".
- Nishio himself is adapting to the vector search system.
- The early cases search by fragments, as in "social security funding scientific research" and "hole in the wall, you can't see it unless you get up close."
- As I got used to it, I started searching with "short sentences that express meaning", like "unfeasible ideas seem original", "kintone is a conversation around data", and "bias to measure only what is easy to measure."
- The other day, when I had a user try this vector search while sharing their screen on Zoom, the user typed "Tell me about your experience of cooking for yourself", and I thought, "Oh, they don't understand how to use this at all."
- They did not know the difference between a vector search and an instruction-tuned LLM.
- Maybe most people in the world don't know the difference.
- A base language model follows instructions only after additional training (instruction tuning) to "regard user input as instructions and follow those instructions".
- People type the same kind of input here simply because ChatGPT accepts it.
- "I see... so it is different from ChatGPT..."
- The user then typed "description about cooking" and found my terrible cooking story, so in the end it worked.
- But this does not show much of what is good about vector search.
- The "meaning" in the query is in effect the single word "cooking", so it is no different from a keyword search.
- Now when I search for "I made a terrible dish", lots of my terrible past dishes come up (lol)
- Q: Does vector search eliminate the need to add search keywords to a page to absorb wording variation and make it easier to hit?
- A: That becomes unnecessary, but vector search alone is weak at exact keyword matching, so you end up needing a combined approach with keyword search, as in Azure Cognitive Search.
- Helpfeel, on the other hand, takes the approach of using a machine to make the process of "adding more keywords to absorb wording variation" more efficient.
-
Our proprietary algorithm "Intent Predictive Search"... technology is further extended using the text-embedding-ada-002 model provided by OpenAI, Inc. of the U.S. (From: Helpfeel releases "Contact Sense AI" that improves search accuracy by 30% while avoiding AI hallucination, facilitating self-resolution of inquiries | Press Release from Helpfeel, Inc.)
- Q: When combined with a keyword search, are the search queries different from those for a vector search?
- A: No, BM25 can search by sentence, so it's not that different. Maybe you have the old image of searching by a word and getting back documents that contain that word, but BM25 is closer to "searching by all words in the input" than to that image.
- Q: I'd like it to handle polysemy as well (is that too hard with current vector search?).
- A: I think the idea of "polysemy" itself is stuck in the "old image of searching by word". Nowadays an embedding vector can be created from an entire sentence, so you can write the query as a sentence whose meaning is unambiguous.
- Q: Is there both an effect of finding what you want to find but can't find with keywords alone, and a serendipitous, unexpected encounter with information?
- A: Yes, there is. Even if you have a goal and find it, it is interesting to read the surrounding fragments for unexpected thought provocation. Example: search results for "unfeasible ideas seem original"
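To make the remark that BM25 is closer to "searching by all words in the input" concrete, here is a minimal sketch of Okapi BM25 scoring. It is an illustration, not the implementation of any product mentioned in the talk: the whitespace tokenizer, the parameter values `k1=1.5` and `b=0.75`, and the sample documents are all assumptions (Japanese text would need a morphological tokenizer instead of `split()`).

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against every word of the query (Okapi BM25)."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue  # a missing word contributes nothing, it does not exclude the document
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "unfeasible ideas seem original",
    "kintone is a conversation around data",
    "ideas about data",
]
scores = bm25_scores("ideas that seem original", docs)
print(scores)
```

A sentence-like query still works: every query word that appears in a document adds to its score, and rarer words add more. What BM25 cannot do is match "original" against "originality", which is where the vector side of a hybrid search helps.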
3: Omoikane Embed
- Omoikane Project 6/4~
- from [/omoikane/Omoikane Embed](https://scrapbox.io/omoikane/Omoikane Embed)
- Mechanism for creating the vector index for [Omoikane Vector Search]
- A GitHub Action runs at 6:00 a.m. Japan time
- Automatically export JSON from Scrapbox
- Split the text into 500-token chunks and embed each chunk as a vector
- Upload to [Qdrant]
- Began operation on 6/9, now running daily.
- The AI started writing reports in Scrapbox on 7/29.
- The previous Omoikane Study Group (8/4) took place just after that got up and running.
- 8/9 Organized the code to make it easier to use in other projects.
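The chunking step of the pipeline above can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual Omoikane Embed code: the function name `chunk_page`, the page shape (a title plus a list of text lines), and the use of whitespace-separated words as a stand-in for model tokens are all hypothetical. A real pipeline would count tokens with the embedding model's tokenizer, then embed each chunk and upload it to the vector database.

```python
def chunk_page(title, lines, max_tokens=500):
    """Split one page into chunks of at most max_tokens tokens.

    Tokens are approximated by whitespace-separated words. A single line
    longer than max_tokens is kept whole as its own oversized chunk.
    """
    chunks, current, count = [], [], 0
    for line in lines:
        n = len(line.split())
        # Close the current chunk when the next line would overflow it.
        if current and count + n > max_tokens:
            chunks.append({"title": title, "text": "\n".join(current)})
            current, count = [], 0
        current.append(line)
        count += n
    if current:
        chunks.append({"title": title, "text": "\n".join(current)})
    return chunks

# Hypothetical page; a small max_tokens makes the split visible.
page_lines = ["a b c d", "e f g h", "i j k l"]
for chunk in chunk_page("Example page", page_lines, max_tokens=8):
    print(chunk)
```

Keeping the page title with every chunk means a search hit in the middle of a long page can still be traced back to its page.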
4: Intelligent collaboration between AI and humans
- 8/16 Intelligent collaboration between AI and humans
-
- AI writes a note in Scrapbox, AI reads the note the next day, and writes another note.
- It repeats intellectual production activities the same way humans do, in the collaborative editing space called Scrapbox
- Implemented on 8/12: AI writes research notes daily / Co-operation with AI
- Wrote "Intelligent collaboration between AI and humans" on 8/16 and spoke about it on 8/20 (Unexplored Junior Interim Camp) and 8/21 (Lab Youth Camp).
-
- 8/21 No human has to pull the trigger.
- The AI's once-a-day "note with comments" communication style has the advantage that no human has to pull the trigger.
- Even when humans are too busy to write comments, the AI goes ahead on its own.
- Even if you write some comments, there is no "send" action.
- I may come up with something afterwards and add more.
- No human has to decide when something is "the end" or "complete".
- I'm not sure whether it is better to have no "send" action at all.
- Personally, I think it is better to have one. In front of a PC, that can be done by running the script in the development environment.
- But when I'm on the train to the camp site, for example, I want to trigger it from my phone.
- → That function was added later (Pioneer Mode).
- 8/21 I want to fork a page.
- → 8/21 multi-head / page memory
- from page memory
- Created a style in which the AI adds to the same page instead of creating a new page.
- After that, the style of creating new pages was discontinued (the main branch was stopped on 8/31).
- This allowed "multiple topics" to develop in parallel
- A human writes a note on a topic they would like to develop, as a seed, and it is treated as an AI note (in the implementation at the time, by putting a 🤖 marker in the title).
- → Multi-head thinking
- To be covered if time allows (it does not)
- 8/21 page memory → 8/23 Ignore used pages from all history → 8/26 Pinning effect on the topic.
- 8/25 Update page limit → 8/31 Update Interval of AI Notes.
- Changed how additions are made: 9/11 Difference between Recurrent Notes and Iterative Commenter.
5: Extending the Red Link with AI
- 8/30 Extending the Red Link with AI
- from Extending the Red Link with AI: a useful use case of the "AI writes research notes daily" system that was not originally intended.
- History
- If you create a red link (a link to a page that does not exist yet) and then specify it, a page is now created from the results of a vector search on the link title.
- This is useful
- What kind of experience does it bring?
- I make a link thinking "I think I've written something like this before."
- Sometimes Scrapbox's link suggestions help here
- For those not familiar with Scrapbox: Scrapbox runs a fuzzy search while you are creating a link and suggests link destinations.
- When they did not help, I had to find the page by searching or other means.
- Now I can leave that to the AI.
- Further Development
- When there is a good "long phrase" in the AI's output or in the past writing the AI has dug up, I make it a link, like highlighting with a marker.
- Naturally, it becomes a red link.
- It is long enough that it is not expected to connect to an existing page naturally.
- So I apply "red-link extension" to it.
- Until now, this kind of "Expansion of pertinent judgment" was done with the "Long title engraving page" operation.
- Specific example 1
- In "Exchange Form D", omni "reaffirmed the importance of exchange in the problem-solving process."
- I didn't know what "the importance of exchange in the problem-solving process" meant, so I made it a red link and extended it.
- When omni said, "Problem solving bridges the gap between the ideal and the current situation, and the exchange of information in the process is important," I understood: "I see, you mean that information exchange is exchange."
- I wrote about this realization in [Information exchange is exchange]
- (The meaning of this realization probably does not come across yet, but it is a case study for later, so I won't explain it here.)
- After a while, the question "[Is the exchange style of knowledge exchange A?]" arose
- I extended "Public as the object of the gift", which was created there as a red link, and it gave me a lot of examples.
- It can be called "search with description".
- A search that comes with an explanation of why each result was picked up.
- Instead of a human directly reading the search results, the AI reads them first and writes a description.
- 9/2 Difference in feel between vector search and RAG
-
kazunori_279: Instead of fiddling with fine-tuning and RAG, why not simply do a vector search with embeddings and display the results? That is what I thought.
-
nishio: I have always been of the "just do a vector search" school of thought, but I feel that RAG is better when used as an assistant for intellectual production rather than just a question-and-answer tool. The generation part functions as "summarizing the search results according to the purpose". This assumes that the purpose is given separately from the query.
- For a long time I used "Nishio's Vector Search" and thought "Isn't this good enough?", but now that "Extending the Red Link with AI" is available, I think "this one is better".
-
- I find it different from, and more useful than, a simple vector search.
- It could be called "decoupling search and action."
- Today's "search" does not update anything, so if I don't update my Scrapbox with the search results, it is as if the search never happened!
- It implicitly required that "humans read the results and act on them before they forget they searched."
- Search and action were coupled.
- With "AI extends the red link", the human writes the "search intent" and the AI "searches, reads the results, and writes a short explanation", so the human action after the search becomes optional.
- 9/4 Pioneer Mode
-
Pioneer Mode is a development of "Extending the Red Link with AI"
-
If you put a link on the "✍️🤖" page, the AI will check it periodically and generate the page automatically.
- I can now use it from my phone.
-
- Q: Is it possible to distinguish between pages where the AI extended a red link and pages you created yourself? If not, are you comfortable with them being mixed?
- A: I don't distinguish them. I did at first, but I stopped.
- From "Which parts are AI and which parts are human?":
- > When two people get along well, have a lively discussion, and their comments influence each other, it is difficult to discern which part of the resulting material originates from which person.
- > The same thing happens with humans and AI. The idea of trying to identify this, or to make it identifiable, can be detrimental to designing a UI.
- I had written that myself.
- Especially on Scrapbox, the action "read a page, and if it inspires a thought, write on that page" is afforded.
- Then, when a human sees an "AI-generated page" and writes on it, it seems strange that the human's writing is not picked up just because the page was originally AI-generated, so I abandoned the distinction between AI-generated and non-AI-generated pages.
- For example, it is possible to know who updated each line, so if you create a dedicated account for the AI, it would be possible to "ignore AI-generated content and use only human-written content".
- As for why I haven't done it: I don't feel the need...
- I feel like I might discover something new if I tried it, but I have a lot of other things I want to try, so it's not a priority and hasn't been started.
- On the other hand, I explicitly think it's a bad idea to write the vector search results directly into Scrapbox as AI output.
- It's OK up to the point where a human reads it, but once it shows up in search results again, I feel the value of the "title to content mapping" is spoiled.
- from Why I stopped putting 🤖 in the title of AI generated pages.
- We discussed removing the 🤖 mark from the title of AI-generated pages. Initially, the mark was there to prevent the AI from reading AI-generated material, but during operation it came to feel natural for humans to respond on AI-written pages. That raised the question of why such content should be excluded from what the AI reads. We also discussed that the original design had not assumed a collaborative editing space.
- That was a good summary.
- The continuation points at something that is indeed a problem.
- This note relates to a fragment of Nishio's research notes, "It's buried at the bottom of the AI page." The idea that it is natural for humans to respond to AI-generated pages and the problem of human thoughts being buried at the bottom of AI-generated pages are closely related in terms of information transfer between AI and humans.
- This is how we're dealing with that issue right now.
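The red-link extension described in this section can be sketched as two steps: find links that point at no existing page, then build a page for one of them from search results on its title. This is a minimal illustration under assumptions, not the author's implementation. The page shape (a dict with `title` and `lines`), the simplified `[Page Title]` link pattern (decorations such as bold brackets are not handled), and the `search` and `generate` callables standing in for the vector index and the LLM are all hypothetical.

```python
import re

# Assumed link syntax: [Page Title]. External links contain "://".
LINK = re.compile(r"\[([^\[\]]+)\]")

def find_red_links(pages):
    """Return link titles that point at no existing page, in first-seen order."""
    existing = {page["title"].lower() for page in pages}
    red = []
    for page in pages:
        for line in page["lines"]:
            for target in LINK.findall(line):
                # Skip external links and cross-project links.
                if "://" in target or target.startswith("/"):
                    continue
                if target.lower() not in existing and target not in red:
                    red.append(target)
    return red

def extend_red_link(title, search, generate, top_k=5):
    """Build a new page for a red link from search hits on its title.

    search(title) returns ranked fragments and generate(title, hits) returns
    lines of text; both are hypothetical stand-ins supplied by the caller.
    """
    hits = search(title)[:top_k]
    return {"title": title, "lines": [title] + generate(title, hits)}

pages = [
    {"title": "Information exchange is exchange",
     "lines": ["See [Exchange Form D] and [Public as the object of the gift]",
               "[source https://example.com]"]},
    {"title": "Exchange Form D",
     "lines": ["omni wrote about [Public as the object of the gift]"]},
]
red = find_red_links(pages)
new_page = extend_red_link(
    red[0],
    search=lambda q: ["fragment about " + q],          # stub vector search
    generate=lambda t, hits: ["Related notes:"] + hits,  # stub LLM
)
print(red, new_page)
```

Because the human only writes the link, the search and the first read of the results happen without any further human action, which is the decoupling of search and action discussed above.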
6: Raw ChatGPT and omni use cases are different.
- 9/6 I realized I was unconsciously using ChatGPT and omni differently.
- So there is a difference in the utility that the two provide, what is the difference?
- from Raw ChatGPT and omni use cases are different.
- Most texts in the world are written in "expressions that many people can read and understand", whereas my research notes are written in "expressions that I can understand", so an AI that does RAG over the latter accelerates my personal thinking much more efficiently than ChatGPT
- In my research notes I don't write explanations for words I know, so the AI that reads them doesn't write explanations for what I know either. Concepts are tools for the economy of thought, so it is more efficient to use them without explanation in one's own thinking.
- ChatGPT is useful when explaining something to others
- iceberg model
- omni's output is closer to my personal water surface (the boundary of what is not yet verbalized), so it is more effective in supporting verbalization.
- Maybe it has to do with the different characteristics of blogs and Scrapbox: in Scrapbox, instead of explaining the same concept over and over again, you create a page for that concept and link to it.
- The description of "iceberg model" immediately above is itself minimal, which is an example of this.
- Case 1
- omni: "To improve knowledge productivity, we need to understand the importance of socialization and the importance of sharing tacit knowledge prior to planning."
- I can tell that this "socialization" (共同化) refers to Ikujiro Nonaka's SECI model, so I can read the sentence while mentally compensating for its slightly awkward Japanese.
- I asked ChatGPT: Please put in terms that the average person can understand, "In order to improve knowledge productivity, we need to understand the importance of socialization and the importance of sharing tacit knowledge prior to planning."
- ChatGPT: "To use knowledge effectively to get things done, it's important for everyone to work together, to share information, and for everyone to share their own experiences and tips before the actual work begins."
- Most people will probably find this easier to understand.
- From my subjective point of view, it is clearly degraded. GPT-4 probably does not understand "socialization" correctly.
- PS: I'd like to explain this a little more.
- In Ikujiro Nonaka's SECI model, "socialization" refers to sharing "tacit knowledge" that is difficult to verbalize by "sharing the same experience", for example through joint work, without the use of language.
- So when it is described as "sharing information, experience, and tips", I get the feeling that it "doesn't get it at all".
- In light of this, the statement can be interpreted as "before making a linguistic plan, it is necessary to first share a nonverbal experience through socialization."
- And in the context in which this statement came up, we were talking about increasing intellectual productivity through AI.
- Before any linguistic planning can be done on how to improve intellectual productivity with AI, humans and AI must first go through socialization.
- One form of this is to let the AI live in the place where I usually do my intellectual production, by building a system in which the AI works on its own instead of a human giving it instructions every time and the AI responding to them.
- Case 2 (the "Information exchange is exchange" example mentioned earlier)
- I found "information exchange is exchange" to be an important realization, but I think most people probably don't see what is so interesting about it.
- When I see the word "exchange", I connect it to the context of Kojin Karatani's exchange forms.
- Very rough description:
- Exchange Form A: Giving a Gift in Return
- Exchange Form B: Obedience and Protection
- Exchange Form C: Pay the price and get the goods
- "Information exchange is exchange" is the realization that "since information is also a good, information exchange can be considered within the framework of Kojin Karatani's exchange theory."
- This led to the question "[Is the exchange style of knowledge exchange A?]" and then to the realization: "No, information sharing that uses a shared place instead of DMs is a new exchange style, because there is [Public as the object of the gift], isn't it?"
- Supplemental (if time permits)
- Technically, I think that by filling much of the prompt with RAG content, the "chat-ness" disappears and the task becomes closer to a "summarization task", which lowers the pressure from RLHF to make the expression palatable to the general public.
- I feel that writing style is a major factor in the speed of comprehension, and I have the impression that GPT generates sentences that are easier for me to read when I give it my own writing than when I just have it generate from scratch.
- Indeed, it may be that omni is easier for me to read because it speaks in my style. In the end, AI assistants may be more productive for each individual if they are personalized to that individual.
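As a rough illustration of "filling much of the prompt with RAG content" so that the task becomes closer to summarization, here is a minimal prompt-assembly sketch. The function name `build_rag_prompt`, the prompt wording, and the character budget (a crude stand-in for a token budget) are assumptions for illustration and not omni's actual prompt.

```python
def build_rag_prompt(purpose, fragments, budget_chars=2000):
    """Assemble a summary-style prompt from retrieved note fragments.

    fragments: list of (title, text) pairs, best match first.
    Fragments are added in order until the next one would exceed the budget.
    """
    parts, used = [], 0
    for title, text in fragments:
        block = f"### {title}\n{text}\n"
        if used + len(block) > budget_chars:
            break
        parts.append(block)
        used += len(block)
    return (
        "Fragments from my research notes:\n\n"
        + "\n".join(parts)
        + f"\nPurpose: {purpose}\n"
        + "Summarize what the fragments say with respect to the purpose, "
        + "keeping the vocabulary used in the fragments."
    )

prompt = build_rag_prompt(
    "How can AI improve intellectual productivity?",
    [("iceberg model", "The water surface is the boundary of what is not yet verbalized."),
     ("SECI model", "Tacit knowledge is shared through shared experience.")],
)
print(prompt)
```

Because most of the prompt is the author's own wording and the instruction asks to keep that vocabulary, the output tends to stay in the notes' terms instead of being rephrased for a general audience, which is the effect described above.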
7: Thoughts on using a private OMNI
- 9/20 I started building a private omni because I found the omni I was running publicly on /nishio useful enough.
- The aim is to include items that cannot be made public (e.g., books) in the search results.
- 9/27 Thoughts on using a private OMNI
- Vector search alone is pretty interesting.
- If you imagine âa system in which books pop up and related pages open when you talk in front of a bookshelf,â you will understand how interesting it would be.
- If someone elseâs Scrapbox is hit, I read it interestingly, âMr. X, I didnât know you wrote about this,â which may lead to subsequent communication.
- Difference in sense of public and private omni
- Iâm still using it differently.
- So you must feel different utility, what is it?
- Feelings differ depending on whether the data is primarily self-inflicted or not
- When itâs self-derived, itâs like the AI and the human are driving the thinking as a unified entity.
- I feel like Iâm accelerating.â
- Feel like youâre âseeing things from a different perspective.â
- When derived from others, the feeling of âOh, so this is what Mr. X said about this subjectâŠâ
- I feel like I âfound someone elseâs statement.â
- case
- Should scientists who do basic research operate on their own dime, or should the government invest in them as public infrastructure that is underinvested if left to the market?
- The âanti-fragility sic sicâ fragment suggests the idea that government investment should be directed toward nonobjective activities, rather than research in general.
- Oh, I donât know if thatâs what youâre talking about, Iâll read it.
-
What the government should be spending money on is non-objective tinkering, not research --- Anti-Fragility top, p. 375
- Fragments derived from oneself are reconstructed once chewed up inside oneself, so there is a sense of smooth connection. Fragments derived from others still have a hard surface.
- However, it may be possible in the future for LLMs to do the âchewing and reconstructingâ, âmelting togetherâ, and âsmoothly connectingâ themselves by fine-tuning with their own origin data.
- 10/3 Teaching LLM to Knowledge Brewing in Scrapbox
- The process of chewing is not a knowledge problem but a question of what chunks to divide the input into and what format to use, so I feel fine tuning would be useful.
- However, it may be possible in the future for LLMs to do the âchewing and reconstructingâ, âmelting togetherâ, and âsmoothly connectingâ themselves by fine-tuning with their own origin data.
- Fragments of different viewpoints originating from oneself drive dialectic development more strongly than fragments of different viewpoints originating from others
- Is it because Iâm naive enough to pass off things of other peopleâs origin as âthatâs one way of thinkingâ even if it differs from my current opinion?
- The âitâs normal for others to have different opinionsâ runaround?
- If your self-derived opinion differs from your current opinion, since you are both you, âWhy do we disagree?â Since they are both me, does this strongly trigger the question, âWhy do we have different opinions?
- Q: By different viewpoints derived from yourself, do you mean that you think you are likely to think?
- A: âWhat I thought in the pastâ is closer to
- Q: I still wonder if this is a good idea because I keep âbooks Iâve read onceâ. Or is it useful if a phrase hits a vector search for a book Iâve never read?
- A: Once you read a book, you donât remember every detail.
- I started using vector search and it unearths things like blog posts I wrote 8 years ago, but I wrote them myself and I donât remember them.
- Even a book you read hits you unexpectedly and you say, âWhoa, did I read that in this book? Oh, it sure does! I didnât pay much attention to it when I read it before!â Itâs like that.
- It is said that a good book should be read many times, but in essence, it is an attribute on the part of the reader to be able to accept what is written.
- In that sense, it would be a great reading experience to come across relevant pages of books related to âwhat I am interested in nowâ and read them.
- Q: Is the identification of whether the source is from you or someone else based on whether it is a public or private Omni? Or is it based on the content?
- A: In that sense, both are the same now that the private Omni does not contain self-derived data.
- I like the idea of experimenting with mixing the two.
8: Vector search is an opportunity to cut out
- 9/2 Hitting part of a series that has not been carved out provides an opportunity for carving out
- Beneficial events that have occurred countless times
- Diaries, chat logs, lecture materials, transcripts of conversations, and other "descriptions laid out along a timeline"
- The AI searches on a topic, hits a passage "in the middle of the sequence," and mentions it.
- The person who sees it cuts out that part of the page and creates a new page.
- This was done with the vector search story.
-
I made an excerpt page with "Social Security Expenses" in the title.
-
- Topic-oriented cutouts from time-aligned descriptions
-
The balance between time-based and topic-oriented is an important issue in organizing and interpreting information. Organizing information along a time axis is suited to re-experiencing the flow of information, but when exploring a particular topic or theme, it may be necessary to break the chronological structure in order to build a new topic-oriented structure. Topic-oriented organization, on the other hand, makes it possible to organize information around a specific theme or context. Finding the right balance between the two promotes efficient organization and understanding of information.
- Difficult to do in advance - the appropriate way to cut out is determined only after the need is identified.
- An explanation of the concept of "cutting out," since many people may be unfamiliar with it.
- In the realm of personal knowledge management, there is a belief that a short page on a single topic is preferable.
- Example: Evergreen notes should be atomic.
-
This makes it easier to form connections across topics and contexts!
-
- Similar to the single responsibility principle
- So the act of extracting a "single topic" from a "page with multiple topics mixed in" is performed.
- In Scrapbox, this can be done by selecting multiple lines and choosing New Page from the balloon menu.
- This is commonly known as "cutting out": as the name suggests, it means cutting the lines out, creating a new page, and linking the pages to each other.
- Whether "cut" or "copy" is better is debatable. I don't think cut is always better either; there is value in keeping lecture materials readable from start to finish.
- With minutes placed in groupware or the like, users would be reluctant to edit them.
- Scrapbox's philosophy of "Not a warehouse for dead text." would say, "Don't use it that way."
- This behavior arises because the vector search runs over fragments that have been chopped into chunks
- I'm chopping at 500 tokens right now, which is about the size of a page in a book.
- With chat logs, the unit is not an entire thread or one specific person's statement, but "several exchanges".
- The search returns the part of a long conversation where "mentions of the topic are densest".
- A human discards the unwanted parts of that chunk, picks up the relevant parts around it, or writes a new opinion inspired by it, and creates a new page, which is beneficial.
- Scrapbox is designed to afford cutouts, which is beneficial when combined with LLMs
- Cutting it out could be a problem because you lose the context of the timeline.
- Yes
- I see what you mean.
- That is the author's philosophy, yes, but I personally think "let's put everything in" is better.
- Might it actually be beneficial to keep a stack of books in it?
- It's better to put it in this system because it doesn't speak to you when it's piled up, right? Well, it would eliminate the opportunity to "kind of pick it up" or something like that, because it would lose its physical presence.
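As a rough sketch of the chunking described in this section (vector search over fragments chopped at about 500 tokens), the following splits a long text into fixed-size pieces. It is an illustration under stated assumptions, not the actual Omoikane implementation: `chunk_by_tokens` is a hypothetical name, and whitespace-separated words stand in for real tokenizer tokens.

```python
def chunk_by_tokens(text: str, max_tokens: int = 500) -> list[str]:
    """Split text into consecutive chunks of at most max_tokens tokens.

    Assumption: a "token" here is a whitespace-separated word. A real
    pipeline would count tokens with the embedding model's tokenizer.
    """
    tokens = text.split()
    return [
        " ".join(tokens[i:i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]


if __name__ == "__main__":
    # A made-up "long chat log" of 1200 words.
    long_log = " ".join(f"word{i}" for i in range(1200))
    chunks = chunk_by_tokens(long_log, max_tokens=500)
    # Print the number of words in each chunk.
    print([len(c.split()) for c in chunks])
```

Each chunk would then be embedded and indexed on its own, which is why a hit lands on "several exchanges" in the middle of a long log rather than on the whole page.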
9: Vector search serves as a tool for cognitive resolution.
- from 10/3 Teaching LLM to Knowledge Brewing in Scrapbox
-
nishio: From my own research notes: vector search serves as a tool for cognitive resolution.
-
That is because the "similar things" it presents let you observe what you are thinking about now from a slightly different direction... The process of "verbalization" includes "increasing the resolution with which you describe the world" and "re-describing the understanding gained from higher-resolution observation in a way that others can understand"; these are two different things and need to be considered separately.
- "Observing what you are thinking about now from a slightly different direction" increases cognitive resolution through the thought "Similarity → What is the difference?"
- Similarity → What is the difference?
-
A pattern of thought that is beneficial for increasing cognitive resolution
-
When you feel that two concepts are "similar," the fact that you did not feel "this is exactly the same thing" indicates that you sense a "difference" there.
-
That "difference" has not yet been verbalized at this point.
-
Asking "What is the difference?" then lets you observe things in more detail.
- I've also compared ChatGPT and omni in this presentation, as well as public and private omni.
-
- Vector search works as a "mechanism to scrape together fragments of similar topics."
- Similar to the KJ method of "gather fragments that may be related in one place, and then think about what kind of relationship they may have."
- Small convergent moves and divergence from them 9/3
-
I thought you looked familiar.
-
- from Interference Effects of Ideas
- - way of thinking p.107
-
- There is a difference between "similar topics" and "similar opinions."
- Different opinions on similar topics are rather close in terms of vector search - conflict is a close relationship
- So you're substituting a dialogue with your past self for a dialogue with yourself under a different seed...
- That's one thing, but looking at past articles that came up in a vector search, I also think that what we think is influenced by "the situation we were in at the time."
- I imagine that even at this moment, writing in this Scrapbox, in another Scrapbox, on a social networking service, or in the company's groupware each produces a slightly different intellectual output.
- By bundling them together again
- Mr. Nishio is experiencing self-expansion! For Mr. Nishio, how many times larger does it feel, say 1.3x or 2.4x?
- Maybe 1.2x - not over 2.0 (lol)
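The point in this section that different opinions on similar topics sit close together in a vector search can be illustrated with a toy example. This is only a sketch under loud assumptions: `embed` is a hypothetical stand-in that uses word counts instead of a real embedding model, and the fragments are made-up sample sentences.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts (assumption)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(query: str, fragments: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k fragments most similar to the query."""
    q = embed(query)
    ranked = sorted(fragments, key=lambda f: cosine(q, embed(f)), reverse=True)
    return ranked[:top_k]


# Made-up fragments: two opposing opinions on one topic, plus an unrelated one.
fragments = [
    "cutting out pages loses the timeline context",
    "cutting out pages keeps each topic easy to link",
    "azure cognitive search combines keyword and vector ranking",
]

if __name__ == "__main__":
    for hit in search("cutting out pages", fragments):
        print(hit)
```

Both opinions about cutting out are gathered by the same query, while the unrelated fragment is left out; deciding how the two differ is then left to the human, as in the KJ method.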
Below are my notes from before writing. The following is a summary of the information entered:
- 7/29: AI writes to Scrapbox, introducing the concept of Scrapbox Agents.
- Early to mid-August: Omoikane study group, discussion on working with AI, writing research notes with AI, evolution of vector search, and topics on handling AI-generated pages.
- Mid to end of August: topics related to updating and search, taking into account multi-head, page memory, and user cognitive load.
- Beginning of September: Explore AI-user interaction, note management, and relevance between different content.
- Mid-September: various topics related to multi-head thinking, the intellectual production techniques of engineers, and working with AI.
- Late September: LLM vs. other models, discussion on human concepts, optimal use of Scrapbox, and feedback on non-public tools.
Overall, this period seems to focus on Scrapbox and AI integration, particularly the concepts of vector search and multi-heading, and AI thinking and interaction with the user.
7/29 AI writes to Scrapbox - Agents living in Scrapbox.
8/4 Omoikane Study Group
8/11 [/omoikane/ consulted GPT4 on how to proceed after this 2023-08-11](https://scrapbox.io/omoikane/ consulted GPT4 on how to proceed after this 2023-08-11).
8/12 - Co-operation with AI
- Findings: [/unnamed-project/AI's pace is so fast that humans can't keep up](https://scrapbox.io/unnamed-project/AI's pace is so fast that humans can't keep up)
8/12 AI writes research notes daily
8/16 16
8/16 Stacking Vector Search Results
8/16 overwrite mode.
8/18 Ignore AI-generated pages from vector search
8/18 A case study of a new combination with the assistance of AI
8/21 multi-head - page memory
8/23 Ignore used pages from all history
8/25 Update page limit
8/26 Pinning effect on the topic
8/29 Lift Ignore from Vector Search on AI-generated pages
8/30 Extending the Red Link with AI
8/31 Main branch stopped
8/31 Questions encourage verbalization, but there are different kinds of questions.
8/31 Introduction to ENCHI
8/31 Clarification of AI's role is important.
8/31 Update Interval of AI Notes
8/31 A case study of SF prototyping for a junior high school student's work experience
9/1 Page as a fluid process
9/1 Trade-off between speculation and development # distress
9/1 What is the role of AI in this project?
9/1 Is there an AI with multiple personalities?
9/1 Why not specify the purpose of each page of the AI note?
9/1 It's buried at the bottom of the AI page. # Cause of distress
9/1 Discovering connections between different content
9/1 Summon other people's AI to your diary
9/2 Difference in skin feel between vector search and RAG
9/2 Hitting part of a series that has not been carved out provides an opportunity for carving out +1
9/2 AI can't rest because it will develop endless thoughts. # Cause of distress
9/2 The realization that you can read the URLs of other projects
9/2 Multi-head thinking epoch
9/2 Pages that sometimes emerge
9/3
- "Multi-Head Thinking" and "The Engineer's Art of Intellectual Production."
- Small convergent moves and divergence from them
- Write the summary above.
9/4 - [[BELOW_IS_LESS_INTERESTING to BELOW_IS_AI_GENERATED.]]
Iterative Commenter Pioneer mode
~
- Examples of AI providing different perspectives
- Experiments in having AI interpret lyrics
- Difference between Recurrent Notes and Iterative Commenter
9/12
9/13
9/15
9/16
9/18
9/20 Integration in private project
2023/9/22
- Have the LLM verbalize the differences between similar things
- (Tentative) Operation not yet named
- Restructuring Thinking and Communication with Scrapbox
2023-09-24
2023-09-29
This page is auto-translated from /nishio/LLMによる知的生産性向上勉強会 using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.