prev pScrapboxAutoTrans2023-04-18
- 2023-04-27
- At the end of the last issue, we confirmed that the once-a-day automatic translation is working stably.
- Previously, we felt it would be better to change the name and tentatively titled it pUnnamed, but we have reverted to the traditional pScrapboxAutoTrans.
- I thought that the point at which this βcontinuous translationβ was achieved was a break in the times.
- pContinuousTranslation
from Continuous translation (DeepL) Continuous Integration and Continuous Delivery share common patterns of automation, frequent updates, rapid feedback, and quality assurance. By applying these patterns to the domain of translation and international communication, we can propose a new concept of βcontinuous translation.
- Continuous Translation (CT), like CI/CD, is a system in which translation and cross-cultural communication are constantly updated, improved, and refined using automated processes. the main components of Continuous Translation are - Automation: Utilizes machine learning and AI-powered translation tools to automatically translate content as it is created and updated, minimizing manual intervention. - done - Frequent updates: Implement a system where translations are continually updated to reflect changes in source content so that international audiences always have access to the latest information. - done - Rapid feedback: The incorporation of feedback mechanisms (e.g., suggestions from users, opinions of professional translators, etc.) allows for quick identification and correction of translation errors and inconsistencies, resulting in higher quality translations over time. - Ah, I see. This is so hard to achieve with Scrapbox. - Quality Assurance Establish a process to automatically test translations and verify accuracy and fluency before delivering them to end users. - I wonder if βautomatically test translationsβ means that the LLM reads them and gives feedback on what he/she doesnβt understand? - Collaboration: Developers, translators, and other stakeholders work together to create an environment that seamlessly integrates the translation process into the content creation workflow. - This is the same developer and translator now, so I wouldnβt worry about it.
- Ongoing translation enables organizations to efficiently deliver up-to-date, accurate, and high-quality translated content to global audiences, improving communication, enhancing user experience, and strengthening international relationships.
- Efficiently deliver up-to-date, accurate, high-quality translated content to a global audienceβ sounds like a good mission.
This of pScrapboxAutoTrans2023-04-18.
-
I wish I could quantitatively determine if itβs getting better.
- Considering that continuous execution is now possible and we are in the quality improvement phase, it is strange that there are no metrics to measure the quality (software engineering context).
2023-04-27
- I wrote [Polis Experience Report: Should we investigate the causes of terrorism?
- I noticed that there are some misalignment of icons and such, so I better go read it and write down what kind of problems and Possible solutions I have.
Title.
Polis Experience Report: Do you want to investigate the causes of terrorism?
- Do you want to do it? Itβs like, βDo you want to do it?
- Japanese tends to omit the subject, especially in titles with a limited number of characters.
Polis Experience Report: Should we investigate the causes of terrorism?
Icon Location
- I guess this βputting an icon at the beginningβ and βputting an icon at the endβ is a kind of syntax.
Link
- why?
- Long links are broken up by spaces in the middle of the link.
[detail https://scrapbox.io/nishio/Polis:%E3%83%86%E3%83%AD%E3%81%AE%E5%8E%9F%E5%9B%A0%E7%A9%B6%E6%98%8E%E3%81%AB%E3%81%A4%E3%81%84%E3%81%A6%E3%81%AE%E8%AD%B0%E8%AB%96#643d39f4aff09e0000dd6581]
translation
These are mistakes that can be understood without reading the English text.
Then view the English text at DeepL
- No? - No. Surprisingly, thatβs not a problem.
- Where I thought it was strange, I checked and the English-Japanese translation failed.
fixed
- run local_batch
- Check the results when you return from English conversation school.
Itβs not over at all. :
9%|ββββββββ | 1279/14848 [2:45:52<21:35:30, 5.73s/it]
503 Server Error: Temporary Error for url: https://api.deepl.com/v2/translate
14%|ββββββββββββ | 2037/14848 [4:19:43<77:17:07, 21.72s/it]
('Connection aborted.', ConnectionResetError(54, 'Connection reset by peer'))
26%|βββββββββββββββββββββββ | 3862/14848 [7:05:29<13:20:20, 4.37s/it]
The translation ran for 3 hours, with an estimate of 15 hours remaining (this was written when I got home).
- (Itβs finally running for 7 hours and showing 13 hours remaining.)
- The whole retranslation is rare and itβs due to the fact that itβs not fast enough, but I donβt know what to do with it. policy
- Either way, at the current pace, weβre going to run into periodic translations, so weβll have to stop for a while.
- Implement parallelization and stop the current translation and replace it.
The code below sends a request to the API of the translation service. The current code below is slow because API access is serialized. Please parallelize it to make it faster. python
for page in pages:
for line in lines:
line["text"] = translate(line["text"])
As you have requested, you can use Pythonβs concurrent.futures library to parallelize and accelerate requests to the translation API. Sample code is shown below. python
import concurrent.futures
def translate_line(line):
return {"text": translate(line["text"])}
# Parallelize translation requests
with concurrent.futures.ThreadPoolExecutor() as executor:
for page in pages:
# Send translation request for each line
translated_lines = list(executor.map(translate_line, lines))
# Restore translated lines to original list
for i, translated_line in enumerate(translated_lines):
lines[i].update(translated_line)
Convenience
:
cache length: 273667
40%|βββββββββββββββββββββββββββββββββββββββββββββββββββββ | 6003/14848 [18:22<4:18:18, 1.75s/it]503 Server Error: Temporary Error for url: https://api.deepl.com/v2/translate
50%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 7461/14848 [38:26<1:00:26, 2.04it/s]503 Server Error: Temporary Error for url: https://api.deepl.com/v2/translate
51%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ | 7606/14848 [41:43<1:25:02, 1.42it/s]503 Server Error: Temporary Error for url: https://api.deepl.com/v2/translate
100%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 14848/14848 [53:34<00:00, 4.62it/s]
total 29370658 no_cache 1581996 ratio 0.05386314463911568
translate: 3218.0509019168094
Oh, the rest was done in less than an hour.
- About 3,000 yen.
next pContinuousTranslation2023-06-01
This page is auto-translated from /nishio/pContinuousTranslation2023-04-27 using DeepL. If you looks something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. Iβm very happy to spread my thought to non-Japanese readers.