I practiced English conversation until I used up the GPT quota yesterday before sleep. Reviewing the chat logs in the morning and organizing them is a good practice for improving my English. It generates English contents and serves as a way to express my thoughts in English. - ChatGPT and English Conversation 2023-12-20

Observed cases of conflict in the process of title translation num collision 54

  • https://gist.github.com/nishio/2a732b8b5ad9291bfaaf7603f362254b (v3)

  • Noooooooo, I don’t want to be identified with this.

    • [Subjectivity]: [idiosyncrasy] / [independence]
  • [Common Sense]: [what is usual] / [natural]

    • I think not.
  • [Community]: [Community] / [community]

    • hmm
  • Two cases left, difficult to do, more to come tomorrow.

    • [Diversity]: [diversity] / [pluralistic]
    • [Empathy]: [empathy] / [the dharma-body (mind, body and spirit)]

To be done on 2023-12-21. Resolution of the remaining two cases

  • ✅ Start translation
  • Maybe the long one is an error.
    • Put on failure list and continue.

2023-12-21

  • Translation system, I’m wondering if it’s not necessary to export JSON since it’s crawling to get links anyway? I’m starting to think so.

Problem of not preserving indentation

  • A key feature of your translation is the preservation of formatting, especially spaces and tabs at the beginning of lines. These are crucial for indicating indentation in bullet lists and must be accurately replicated in your translations.

  • Translation in GPT 3.5, up to the 59th case done.

    • And as expected, an error in an overly long article.

:

20%|███████████████████▎                                                                              | 59/300 [04:55<20:05,  5.00s/it]

openai.BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 4097 tokens. However, your messages resulted in 6138 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}
python -m tasks.translate.from_jsonl  9.23s user 2.91s system 4% cpu 5:01.91 total

that’s so

  • Proposal to go through the article that is too long and work through it anyway.
  • I’ll just go through and translate what I can.
  • The page with the error only records the nature of the error and aims to run to the end for now.
    • Execution started.
    • See you tomorrow for the rest of the story.

A note for my future self

  • Once completed, look at the time and number of pages processed
  • You can see how many pages are logged as errors
  • Check the fees charged.
  • I think most of the errors are pages that are too long.
    • What to do with long pages needs to be considered
    • There are cases that do not need to be translated and cases that should be translated exactly.
    • Simply a case of doing it with a context wide API.
    • Cases of split translation

2023-12-23

  • I checked the morning of 2023-12-22 and it was at 38%, but apparently it hung at that point.
  • 38%|█... | 6668/17569 [11:16:08<7:32:13, 2.49s/it]
  • I just looked at it and it wasn’t progressing. Hmmm… failure.
    • Hangs silently without error
  • image

2023-12-24

  • 47%|█... | 8310/17569 [00:23<00:39, 234.63it/s]
  • return self._sslobj.read(len)
    • I’m blocking it on READ.

This page is auto-translated from /nishio/pEnglish2023-12-20 using DeepL. If you looks something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I’m very happy to spread my thought to non-Japanese readers.