
PDFDocument.Merge reduce combined PDF file size
Jeff Whitlock
Posted: Tuesday, April 18, 2017 3:15:59 PM
Rank: Newbie
Groups: Member

Joined: 6/10/2016
Posts: 8
Currently using the PDFDocument.Merge method to combine multiple existing PDF files into one document. The resulting PDF file size is the sum of the individual sizes of the PDFs being processed. For instance, if I have three PDF files of 200k, 170k, and 130k to combine, the resulting combined PDF file will be around 500k in size.

Does the Merge method (or another PDFDocument method) have the ability to compress the resulting PDF file by de-duplicating all of the redundant elements in the documents? Adobe Acrobat, for instance, has a merge function that will take either multiple PDF files, or a large PDF file and merge redundant elements to bring the file size down.

The PDF files that I'm combining are all single page PDF documents of the same format (invoices). Experimenting with Acrobat was successful in significantly reducing the size of combined files, but I'd like to be able to do it in the app I'm developing using the EO components if possible.

How can the resulting combined file size be reduced?

Thanks!
eo_support
Posted: Tuesday, April 18, 2017 3:36:30 PM
Rank: Administration
Groups: Administration

Joined: 5/27/2007
Posts: 24,080
Hi,

It does do that to a certain degree. Specifically, it tries to de-duplicate font glyph information, which is usually the biggest contributor to large file size. However, to faithfully preserve everything, we have rather strict criteria for which font data we will try to de-duplicate: the two fonts in the two files must be an exact match except for their subsets (a PDF file does not embed the whole font file covering all characters; it embeds only the subset of characters used by the file). In the past we had less strict criteria, which resulted in smaller files but sometimes caused problems when two fonts had the same name but were in fact slightly different.

We do not de-duplicate or re-compress images. Again, this follows the principle of preserving everything as much as possible, since most of the time a loss of quality is not acceptable to our customers.
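To illustrate the general idea of de-duplication discussed here, the following is a minimal sketch of hash-based de-duplication of byte-identical streams (for example, the same logo repeated on every invoice). It is not how EO.Pdf or Acrobat is implemented; the function name `dedupe_streams` and the sample data are hypothetical, and Python is used only for illustration. Because only byte-identical data is shared, nothing is re-compressed and no quality is lost.

```python
import hashlib

def dedupe_streams(documents):
    """Keep each distinct byte stream once; return (store, per-document refs)."""
    store = {}  # digest -> stream bytes, each distinct stream stored once
    refs = []   # for each document, the digests of the streams it uses
    for streams in documents:
        doc_refs = []
        for data in streams:
            digest = hashlib.sha256(data).hexdigest()
            store.setdefault(digest, data)  # only the first copy is kept
            doc_refs.append(digest)
        refs.append(doc_refs)
    return store, refs

# Three hypothetical single-page invoices that share the same logo bytes.
logo = b"\x89LOGO" * 1000
invoices = [
    [logo, b"invoice 1 text"],
    [logo, b"invoice 2 text"],
    [logo, b"invoice 3 text"],
]

store, refs = dedupe_streams(invoices)
naive_size = sum(len(s) for doc in invoices for s in doc)
deduped_size = sum(len(s) for s in store.values())
print(naive_size, deduped_size)
```

In this toy data the logo is stored once instead of three times, which is the kind of saving that would be expected when many same-format invoices are combined.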

Thanks!
Jeff Whitlock
Posted: Tuesday, April 18, 2017 3:51:45 PM
Rank: Newbie
Groups: Member

Joined: 6/10/2016
Posts: 8
Thanks for the info.

I had a situation where combining invoices into one PDF file using PDFDocument.Merge resulted in a 25 MB combined file. When we ran it through Adobe Acrobat Pro's merge option, the size dropped to about 1.4 MB, which indicated there was quite a bit of duplicate information in the combined file.

It would be a nice enhancement to the EO component to be able to get some combined file size reductions like that while we're doing the merge.

Thanks again for the clarification.
eo_support
Posted: Tuesday, April 18, 2017 5:14:20 PM
Rank: Administration
Groups: Administration

Joined: 5/27/2007
Posts: 24,080
Hi,

You can send the files to us and we will be happy to take a look to see what we can find. See here for information on sending test files to us:

https://www.essentialobjects.com/forum/test_project.aspx

Thanks!
Pejman
Posted: Thursday, October 11, 2018 2:31:37 PM
Rank: Newbie
Groups: Member

Joined: 10/11/2018
Posts: 3
Did anything come out of this topic? I have the same issue: merging 1000 pages produces a PDF file of 86 MB, and when I save it in Adobe, its size is reduced to 26 MB!
Jeff Whitlock
Posted: Thursday, October 11, 2018 3:14:12 PM
Rank: Newbie
Groups: Member

Joined: 6/10/2016
Posts: 8
We ended up using a separate utility to do the compression on the EO file after we combined it.
Pejman
Posted: Thursday, October 11, 2018 3:45:11 PM
Rank: Newbie
Groups: Member

Joined: 10/11/2018
Posts: 3
Could you please tell me what you used? Is it a .NET library? How fast is it?
Jeff Whitlock
Posted: Thursday, October 11, 2018 3:53:37 PM
Rank: Newbie
Groups: Member

Joined: 6/10/2016
Posts: 8
We used the PDF Compress utility from 4dots. You should be able to Google it. Works fairly well for us.
eo_support
Posted: Thursday, October 11, 2018 4:18:38 PM
Rank: Administration
Groups: Administration

Joined: 5/27/2007
Posts: 24,080
Hi,

Adobe Acrobat Pro creates a linearized PDF, which basically compresses the whole PDF file, and this process can significantly reduce the file size. We do not support this feature yet. Hopefully we will be able to support it in the future. This does not have much to do with merge: our merge does automatically remove redundancies, but it's the final compression pass that can make quite a difference.
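As a rough illustration of why a final compression pass can matter, the sketch below compresses many small, similar objects together with zlib (the Deflate algorithm that PDF's FlateDecode filter uses). It is only an analogy under stated assumptions: the object text is a hypothetical stand-in, not parsed PDF, and this is not Acrobat's actual process. In the PDF format, savings of this kind typically come from packing objects into compressed object streams (PDF 1.5 and later) rather than from linearization as such.

```python
import zlib

# Hypothetical stand-ins for many small, similar PDF page objects.
objects = [
    (
        f"{n} 0 obj\n"
        f"<< /Type /Page /Parent 1 0 R /Resources 2 0 R /Contents {n + 1} 0 R >>\n"
        "endobj\n"
    ).encode()
    for n in range(3, 2003, 2)
]

raw = b"".join(objects)
uncompressed = len(raw)                # size when every object is stored as plain text
packed = len(zlib.compress(raw, 9))    # size after one compression pass over all of them
print(uncompressed, packed)
```

Because the objects are nearly identical, the compressor can encode the repetition compactly, so the packed size is a small fraction of the plain-text size.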

Thanks!
Pejman
Posted: Tuesday, October 30, 2018 5:25:07 PM
Rank: Newbie
Groups: Member

Joined: 10/11/2018
Posts: 3
It would help us a lot if you did this. Is there any time estimate for it?
eo_support
Posted: Thursday, November 1, 2018 8:34:12 AM
Rank: Administration
Groups: Administration

Joined: 5/27/2007
Posts: 24,080
Unfortunately we do not have an exact time frame on this feature yet. Hopefully we will be able to implement it in the 2019 release cycle.

