I tried it with voice input because some people said they wanted to use it by voice. It did not work well. The fatal problem is that important keywords are not recognized and come out garbled. I then switched to text input and realized that "I want to use it with voice" is not the same as "I want to use it with voice input". https://keicho.netlify.app/#talk=yaICpGfZchliFVmnnNM3
What would you like to see happen in this conversation? I would like to test voice input because quite a few people say, "Oh, it would be nice if we could use this system with voice."
What type of audio is that? Right now I'm taking a walk and using my AirPods for voice input. What type of voice input is that? It's a bit annoying to start voice input because now you have to tap the voice input button on the keyboard every time you want to start voice input. Is there anything else about that voice input? Also, this chat storm... I'm trying to send it to you, but I can't. When I press the enter key, it sends the message, but it doesn't seem to recognize the characters entered when the enter key is pressed in the voice input, so I have to quit the voice input every now and then. It's a hassle to have to long the man-male button. How do you know its voice input? Aside from voice input, the response from the chatbot isn't read out now, so I have to look at the screen, which is a hassle. Where is that voice input located? I tap the voice input button on the keyboard every single time. Where is that voice input? It is a standard feature of the iPhone.
-
Aside from voice input, the response from the chatbot isn't read out now, so I have to look at the screen, which is a hassle.
- What does this voice input look like? I think people who want to use voice input for e-mail probably envision using it as if they were having a conversation with a human, but in the field, you have to tap a button to input voice data and read what appears in the rain by hand, so it seems a little less smooth. Is there anything else you can tell us about that audio? I don't know, 9sawa, is that really necessary?
-
I tap the voice input button on the keyboard every time.
- What type of keyboard is this keyboard? NGKW Keyboard
-
I think that people who want to use voice input for e-mail probably envision using it as if they were having a conversation with a human, but in the field, you have to tap every single button to input voice data and read what appears in the rain by hand, so it seems a little less smooth.
- Where is this audio located? Maybe the destination needs to be audio, too. How do you know its audio? I think I heard you say you'd like to see the audio of the response, and you'd like to give an audio answer to that. What does that audio sound like? I guess they want to communicate with other humans as if they were having a voice conversation, which I'm not really interested in. What is it that you value? So, what I value is that the output of text remains for six minutes, and by putting it in through my eyes, it disappears the moment I say it out loud, whereas with voice, it disappears the moment I say it out loud, but with text, it remains in my eyes for two days, and that synergistic effect is beneficial to human intelligence. I think I feel that the synergistic effect is beneficial to human intelligence by the fact that the text stays in my eyes for a while. I think that the synergistic effect of voice input is that it is beneficial to go to the early morning of memory, and that is what I feel the value of. This synergistic effect is beneficial to human intelligence, and I think they feel that it is valuable. What should happen with that voice input? In fact, when a human being coaches with this technique, he or she repeats back what the other person has said, and it functions as a mechanism to input what was said into the head again, rather than cutting it off the moment it was said. There are two functions that you can get. What needs to happen to voice input for that to happen? In fact, when a human coachee uses this technique, he or she repeats back what the other person has said, and this works as a mechanism to input what was said into the head again, rather than cutting it off at the moment it was said, but that is a function that is easily available if you use text rather than sound, unless you have already used voice input. It is a feature that can be easily obtained if you type in text rather than sound. Can that audio do it?
When something like this happens, the flow state is interrupted, and you can't immerse yourself in the object of post-thought, which is a big disadvantage, so I think it's better to say that voice cannot be used in the field. Where does the audio come from? Text input from here. To summarize so far, voice input/output works negatively for the intrinsic value of this tool, because " What should happen to that audio? This is important because "what you output is logged as text, and primary memory is augmented by putting the log on the screen through your eyes," and when a live person actually uses this technique orally, he or she repeats back what the other person says. What needs to happen to the voice for that to happen? During that repeat-back, the utmost care should be taken not to change the words used by the other party. The voice input/output must be able to repeat what the user has said without making mistakes. Can voice input do that? The technology at the moment is not even close. Important keywords like "log" and "short-term memory enhancement" are all over the place when entered by voice. Where is that audio? It's better to just repeat back the recording without applying voice recognition. What is the relationship between its voice and voice input? Yes, that's it. When a user wants to "communicate by voice," what is really needed is not to recognize the voice and convert it into a string. It is more likely to be correct to give feedback as it is spoken, or to find and cut out the keyword part from the inflection and so on as it is spoken. Is that voice the same as voice input? Is it different? I assumed they were the same, but they weren't!
-
Yes, that's it. When a user wants to "communicate by voice," what is really needed is not to recognize the voice and convert it into a string. It is more likely to be correct to give feedback as it is spoken, or to find and cut out the keyword part from the inflection and so on as it is spoken.
- What happens to the voice input when this voice It is not used for the dialogue itself. The difference is whether to "listen back" or "read back" when reading the dialogue log later.
-
Yes, I know that I don't feel the value of voice input for the time being, but what I do feel the value of is that the output as text remains for 6 minutes, and by putting it in through my eyes, it disappears the moment I say it out loud if it's voice, but if it's text, it stays in my eyes for a while for 2 days. I think that the synergistic effect is beneficial for human intelligence by saying that the text stays for a while in the second day. What I feel the value of is that the output of text remains for six minutes, and by putting it in through my eyes, it disappears the moment I speak it out loud if it is voice, but if it is text, it remains for a while on the second day, which makes it possible for human beings to have a synergistic effect on their memory at one o'clock in the early morning. In the early morning, the synergistic effect is beneficial to human intelligence, and you feel the value of it.
- What happens to the voice during this voice input? Converting speech into a string through voice input prevents smooth voice interaction in the current system. If you want to do it by voice, the system should repeat back only by voice, without the user looking at the screen. What is between that voice and voice input? There should be keyword extraction in audio form before audio is turned into strings!
-
Converting speech into a string through voice input prevents smooth voice interaction in the current system. If you want to do it by voice, the system should repeat back only by voice, without the user looking at the screen.
- Where does this voice input come from? I wonder where it comes from, maybe someone is researching keyword extraction in the form of audio. What is the relationship between its audio and repeatability? If you're communicating by voice, you can't see the log screen, so you need to repeat back by voice. Is that audio the same as repeat? Is it different? Different. With text, the log serves the function of repeat back.
-
If you communicate by voice, you can't see the log screen, so you need to repeat back by voice.
- What happens to the repeat during this audio? I'd think it would take longer to get an audio repeat back than it would to look at the log. What happens to the audio during that repeat? You might want to interject and rephrase.
-
I think it would take longer to get an audio repeat back than it would to look at the logs.
- What is between this audio and the repeat? I'd suggest removing filler like "uh" or something like that instead of repeating back the whole recording.
-
I think it would take longer to get an audio repeat back than it would to look at the logs.
- What is the relationship between this audio and text?
This page is auto-translated from /nishio/会話ログ2021-01-30 using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.