Fairly Advanced Dev Question: PDF Manipulation

Hi guys,

So, I'm thinking some kind of server-side processing facility is going to need to be implemented (maybe some web-integrated version of Distiller or something?), but what we need to do is this:

... from a website, be able to pass something a PDF, which could be a mix of vector and raster content and sometimes quite complex, and have it kick back out a greyscale version.

That is the requirement, which sounds simple, but after looking into it for a while it doesn't seem possible to simply "convert to greyscale" with any of the PHP PDF manipulation libraries we've found. We've found ways to do it with images (i.e. strip them out, run them through something like ImageMagick and pop 'em back in), but that's not ideal...

So, how would you guys approach this? I know there are some talented souls on this forum :)
 
I might be going off on a tangent here, but Acrobat Pro can convert to greyscale, and gives the exact results we are looking for... thing is, I have no idea if it's even possible to utilise what is essentially a desktop application to process something for the web. Maybe Acrobat is installed on a Mac server somewhere and, using AppleScript or something, a folder could be monitored: when it sees a new file it chucks it into Acrobat and out comes a greyscale version?

TBH, this sounds really far-fetched, I'm just thinking out loud...
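
For what it's worth, the monitored-folder idea above is simple enough to sketch. This is only a rough Python illustration of the polling part: `scan_once`, `watch` and the `handle` callback are made-up names, and whatever actually does the conversion (Acrobat or anything else) is assumed to live behind `handle`.

```python
import time
from pathlib import Path
from typing import Callable, Set


def scan_once(inbox: Path, seen: Set[Path], handle: Callable[[Path], None]) -> int:
    """Pass every not-yet-seen PDF in `inbox` to `handle`; return how many were new."""
    new_files = sorted(p for p in inbox.glob("*.pdf") if p not in seen)
    for pdf in new_files:
        handle(pdf)      # hypothetical hook: the greyscale conversion goes here
        seen.add(pdf)    # remember the file so it is not processed twice
    return len(new_files)


def watch(inbox: Path, handle: Callable[[Path], None], interval: float = 2.0) -> None:
    """Poll `inbox` forever, sleeping `interval` seconds between scans."""
    seen: Set[Path] = set()
    while True:
        scan_once(inbox, seen, handle)
        time.sleep(interval)
```

One thing a real version would need to deal with is uploads that are still being written when the scan runs; writing to a temporary name and renaming to `.pdf` once complete is one common way round that.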
 
I've done a bit with PDF creation/manipulation using ASPPDF, although nothing to do with greyscaling at all.

I've had a shufty through their object reference and it doesn't look like they have a method to convert an entire PDF to greyscale, just to change image profiles and embed them, or draw vector shapes onto the PDF. Neither is going to help here really.

It does have a PDF-to-image function which could possibly do what you want, but it's a very long-winded way of doing it.

I'm guessing the PDFs need to be greyscaled for OCR or something?
 
Yeah, they have to be greyscaled as it's something that is sent to print, and there are colour and greyscale options...

We are using TCPDF, which sounds like a PHP version of ASPPDF (we were using FPDF but it was quite limited), but greyscaling is certainly not an option...

This is some tricky shizzle. We did consider converting to an image, but the resulting file sizes were unrealistic for bulk transmissions.
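
To put a rough number on the file-size worry, here is a back-of-envelope sketch in Python. The page size (A4), resolution (300 dpi) and 8-bit grey depth are all assumptions for illustration, none of them come from the thread, and the figure is the uncompressed size, so real files would come out smaller once compressed.

```python
def raster_page_bytes(width_mm: float, height_mm: float, dpi: int,
                      bytes_per_pixel: int = 1) -> int:
    """Uncompressed size in bytes of one page rendered as a bitmap."""
    width_px = round(width_mm / 25.4 * dpi)    # 25.4 mm per inch
    height_px = round(height_mm / 25.4 * dpi)
    return width_px * height_px * bytes_per_pixel


# Assumed example: one A4 page (210 x 297 mm) at 300 dpi, 8-bit grey
print(raster_page_bytes(210, 297, 300))  # 8699840 bytes, roughly 8.7 MB
```

Multiply that by the page count and the number of documents in a batch and it is easy to see how rasterising everything could get out of hand.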
 
Why not just convert it with Adobe then print?

How do you receive the PDF? Is it from a file upload on the server? I was thinking you could instead get the files sent to an email address, download them all and batch process.
 
Unfortunately this is no little project where manual intervention is even vaguely viable...
The whole thing needs to be unmanned, you'll see why when we launch this beauty :)

But yes, the source is someone uploading a PDF, which could be colour but may need to be B&W ;)
 
If I remember rightly, the Imagick library supports PDFs, so you should be able to greyscale with that.
Alternatively, render the PDF to an image, use Imagick to greyscale it and recreate the PDF.

Edit - Yup, look at Dracata's post; Imagick will do the trick.
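
As a rough idea of what the render-and-greyscale route looks like in practice, here is a minimal sketch that shells out to the ImageMagick command line from Python instead of going through the PHP Imagick extension. It assumes ImageMagick's `convert` binary is on the PATH (newer installs call it `magick`) along with Ghostscript, which ImageMagick relies on to read PDFs. The function names and the 150 dpi default are placeholders.

```python
import subprocess
from typing import List


def greyscale_command(src: str, dst: str, density: int = 150) -> List[str]:
    """Build an ImageMagick command that rasterises `src` and writes a grey PDF to `dst`."""
    return [
        "convert",
        "-density", str(density),  # render resolution in dpi; must come before the input
        src,
        "-colorspace", "Gray",     # convert every rendered page to greyscale
        dst,
    ]


def to_greyscale(src: str, dst: str, density: int = 150) -> None:
    """Run the conversion; raises CalledProcessError if ImageMagick reports a failure."""
    subprocess.run(greyscale_command(src, dst, density), check=True)
```

Bear in mind that this rasterises every page, so vector artwork and text end up as pixels in the output, which is the same trade-off as the convert-to-image approach mentioned earlier in the thread.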
 