
File size
Michal Mikielski
Posted: Monday, July 30, 2018 6:37:11 AM
Rank: Newbie
Groups: Member

Joined: 7/30/2018
Posts: 2
Hello

I am generating a PDF from HTML markup using the following code:


Code: C#
HtmlToPdfOptions options = new HtmlToPdfOptions();
options.JpegQualityLevel = 50;
options.OutputArea = new RectangleF(0.5f, 0.5f, 7.5f, 9.25f);

PdfDocument doc = new PdfDocument();

for (int i = 0; i < pages.Length; i++)
{
	string pageHtml = pages[i];

	if (i == 0)
	{
		// first page
		HtmlToPdf.ConvertHtml(pageHtml, doc, options);
	}
	else
	{
		PdfPage pdfPage = doc.Pages.Add();
		HtmlToPdf.ConvertHtml(pageHtml, pdfPage);
	}
}



The output file has 10 fairly simple pages. If the HTML contains no img elements, the total file size is 262kB, which is acceptable.
However, if the HTML does contain img elements, the file size increases significantly (e.g. to 13MB, while the images total only 2MB).
I tried changing JpegQualityLevel to 1 or 0, but that barely affects the file size (it varies between 13MB and 13.5MB).

With other images the PDF grows to as much as 45MB (where the image files total much less, e.g. 5MB).

How can I reduce the impact of images on the total file size?
eo_support
Posted: Monday, July 30, 2018 4:19:51 PM
Rank: Administration
Groups: Administration

Joined: 5/27/2007
Posts: 20,523
Hi,

In your code, the conversion options (including JpegQualityLevel) are passed only to the first ConvertHtml call, so they are not applied to the other pages. You may want to change that.

Additionally, if the same image appears more than once in your HTML, the large size can be the result of that image being stored in the resulting PDF file multiple times (once per page). You can use the following strategy to avoid this:

Code: C#
//Convert each page into a separate PdfDocument object
PdfDocument[] docs = new PdfDocument[pages.Length];
for (int i = 0; i < pages.Length; i++)
{
    //Each array slot starts out null, so create the document first
    docs[i] = new PdfDocument();
    HtmlToPdf.ConvertHtml(pages[i], docs[i]);
}

//Merge them into a single PdfDocument
PdfDocument result = PdfDocument.Merge(docs);

//Save the result
result.Save(file_pdf_file_name);


Please let us know if this reduces the file size for you.

Thanks!
Michal Mikielski
Posted: Tuesday, July 31, 2018 4:42:09 AM
Rank: Newbie
Groups: Member

Joined: 7/30/2018
Posts: 2
Thanks a lot.
Both applying the options to each page individually and merging separate docs give a great size reduction, and the resulting sizes are similar (3MB in this case).
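
For later readers: the two fixes confirmed above can also be combined in one loop. The following is only a minimal sketch that reuses the calls already shown in this thread (the ConvertHtml overload taking a PdfDocument and options, PdfDocument.Merge and Save). It assumes the EO.Pdf assembly is referenced. The class name PdfBuilder, the method BuildPdf and the parameter outputFileName are hypothetical names used for illustration.

```csharp
using System.Drawing;
using EO.Pdf;

// Hypothetical helper class; the name is not part of EO.Pdf.
static class PdfBuilder
{
    // Converts every HTML page with the same options, then merges the results.
    public static void BuildPdf(string[] pages, string outputFileName)
    {
        // Same option values as in the original question
        HtmlToPdfOptions options = new HtmlToPdfOptions();
        options.JpegQualityLevel = 50;
        options.OutputArea = new RectangleF(0.5f, 0.5f, 7.5f, 9.25f);

        PdfDocument[] docs = new PdfDocument[pages.Length];
        for (int i = 0; i < pages.Length; i++)
        {
            docs[i] = new PdfDocument();

            // Pass the options on every call, not only on the first page
            HtmlToPdf.ConvertHtml(pages[i], docs[i], options);
        }

        // Merge the per-page documents into one and save it
        PdfDocument result = PdfDocument.Merge(docs);
        result.Save(outputFileName);
    }
}
```

Whether this ends up smaller than either fix on its own was not measured in this thread, so treat it as a starting point for your own comparison.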
eo_support
Posted: Tuesday, July 31, 2018 2:56:24 PM
Rank: Administration
Groups: Administration

Joined: 5/27/2007
Posts: 20,523
Great. Glad to hear that it works for you! Please feel free to let us know if you have any more questions.
nikfe
Posted: Monday, August 6, 2018 12:13:02 PM
Rank: Member
Groups: Member

Joined: 5/25/2015
Posts: 20
Btw, this only optimizes image compression. If the images are large in terms of resolution, it's a different case.

I have an issue where a customer is complaining that an identical document created with MS Word is about 4-5x smaller, and the resulting PDFs are therefore too big to be sent as email attachments.

With an external PDF program I confirmed that ~99% of the size comes from images (the compression rate is already high), and after reducing the resolution to 150ppi I was able to get that 4-5x reduction in size. Previously there was an option to automatically reduce image sizes (which was unusable in practice since the ppi wasn't parameterized).
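
A reduction of that order is plausible from pixel arithmetic alone, because the pixel count scales with the square of the resolution. The sketch below is plain C# with no EO.Pdf calls. The class ImageDownsampling and both method names are hypothetical, and the 300ppi source resolution and 5 inch height in the usage note are assumed values, since the actual source resolution is not stated in this thread.

```csharp
using System;

// Hypothetical helper; not part of EO.Pdf.
static class ImageDownsampling
{
    // Pixel dimensions needed to show an image at the given
    // physical size (in inches) and resolution (pixels per inch).
    public static (int Width, int Height) TargetPixels(
        double widthInches, double heightInches, double ppi)
    {
        return ((int)Math.Ceiling(widthInches * ppi),
                (int)Math.Ceiling(heightInches * ppi));
    }

    // Factor by which the pixel count shrinks when going from
    // sourcePpi to targetPpi at the same physical size.
    public static double PixelReduction(double sourcePpi, double targetPpi)
    {
        double ratio = sourcePpi / targetPpi;
        return ratio * ratio;
    }
}
```

For example, an image spanning the 7.5 inch output width used earlier in this thread and an assumed 5 inch height needs 1125 x 750 pixels at 150ppi (TargetPixels(7.5, 5.0, 150)), and going from an assumed 300ppi to 150ppi cuts the pixel count by a factor of 4 (PixelReduction(300, 150)). Compressed file size only roughly follows the pixel count, so this is an estimate rather than a guarantee.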

Post-processing the resolution could be as simple as:
Quote:

void PostProcessImages(object sender, PdfPageEventArgs args)
{
    var images = args.Page.Contents
        .Flatten(c => c.Contents)
        .OfType<PdfImageContent>();

    foreach (var image in images)
    {
        image.AutoScale(); //Or what ever custom logic with image.Image
    }
}

...
options.AfterRenderPage += PostProcessImages;
...


...but since EO treats PDFs as write-only and Page.Contents always returns only raw content, this won't work.

Is there currently any way to scale images, or to actually read PDFs?
eo_support
Posted: Monday, August 6, 2018 4:34:12 PM
Rank: Administration
Groups: Administration

Joined: 5/27/2007
Posts: 20,523
Hi,

There is no way to do image compression on existing files for now. Hopefully we can implement this feature in the future.

Thanks!

