Need assistance / ideas - Translating large Japanese scanned PDF files

Akagi · 16 Nov 2023 at 20:39

Hi all,

I have several large scanned PDF files in Japanese that I need to translate into English with the text replaced in place, so that the formatting and imagery are retained.

Google Lens does an excellent job of translating every part of these documents. However, Google Docs Translate claims that it cannot translate scanned documents, which makes no sense because I assume they use the same technology, but never mind...

I tried running the files through Japanese optical character recognition (OCR) software and then using Google Translate on the result. However, the OCR tools I tried have proven to be extremely unreliable and ruin the document formatting...

There are around 1500 pages, so while I could screenshot every single page and use Google Lens on my phone to translate them, I'm sure you can see why that really isn't an option!

Any ideas?

Thanks

Akagi · 16 Nov 2023 at 20:57

Update: I can easily convert the PDF file to images, and then convert the images back to a PDF file.
So services that can only translate images are fine. However, Google Translate only does one image at a time.

I did find one that bulk translates images using Google Translate, but it costs $100 for 250 un-watermarked images! How they think they can charge that for a script that automates a free service is just baffling.
I am now looking for a free / much cheaper one...

NVP · 17 Nov 2023 at 07:39

Write a script to parse them through Google Translate yourself?
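
Something along these lines might be a starting point. It is only a rough sketch of the batch loop, assuming the pages have already been exported as numbered PNG files. The translate_image callable is a hypothetical placeholder for whatever image translation service you end up using (it is not a real Google Translate call), and the folder layout and names are made up for illustration.

```python
import re
from pathlib import Path
from typing import Callable, List


def natural_key(path: Path) -> list:
    """Sort key so that page_2.png comes before page_10.png."""
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r"(\d+)", path.name)]


def translate_pages(src_dir, dst_dir,
                    translate_image: Callable[[bytes], bytes],
                    pattern: str = "*.png") -> List[Path]:
    """Run every page image in src_dir through translate_image.

    translate_image is a placeholder supplied by the caller: it takes the
    raw bytes of one page image and returns the bytes of the translated
    image. Pages that already exist in dst_dir are skipped, so an
    interrupted run can be resumed without redoing finished pages.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for page in sorted(src_dir.glob(pattern), key=natural_key):
        out = dst_dir / page.name
        if not out.exists():
            out.write_bytes(translate_image(page.read_bytes()))
        written.append(out)
    return written  # in page order, ready to be stitched back into a PDF
```

You would call it as translate_pages("pages_jp", "pages_en", my_translate), where my_translate is your own function wrapping the service. The natural sort matters because a plain alphabetical sort would put page_10 before page_2 and scramble the rebuilt PDF.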

Akagi · 17 Nov 2023 at 07:48

NVP said:
Write a script to parse them through Google Translate yourself?

I would if I knew how.

I've since done about 750 pages manually.

At a rate of one every 3 or 4 seconds it's not too painful if I have some music going...

NVP · 17 Nov 2023 at 07:54

Haha nice going

Efour · 17 Nov 2023 at 07:56

Anime or ideas for tattoos?

Akagi · 17 Nov 2023 at 11:10

Efour said:
Anime or ideas for tattoos?

Porn mainly

Begbie · 17 Nov 2023 at 11:16

Can you use ChatGPT somehow? You can upload images and even PDFs now.

NVP · 17 Nov 2023 at 12:11

"Dear ChatGPT, could you Ask Jeeves to find me a free, online, bulk OCR tool for Japanese script? You absolute star. All the best, Acme x"

Mysterae_ · 17 Nov 2023 at 22:53

Efour said:
Anime or ideas for tattoos?

Chicken soup.

kaiowas · 18 Nov 2023 at 08:11

Efour said:
Anime or ideas for tattoos?

Sounds like a workshop manual for his latest JDM purchase to me

Diddums x · 18 Nov 2023 at 08:54

What car are these schematics for?

Akagi · 18 Nov 2023 at 18:33

Diddums said:
What car are these schematics for?

Alto Works

Akagi · 20 Nov 2023 at 14:50

forums.mightycarmods.com/forum/technical/how-to-forum/816855-translated-jdm-alto-works-manuals-ha11s-ha21s-hb11s-hb21s-hc11v-hd11v

Done done...

NVP · 20 Nov 2023 at 14:56

For ulez?

Akagi · 20 Nov 2023 at 16:08

NVP said:
For ulez?

eh what?