Converting HTML to PDF fails when running in docker container Win 2019
System Monitor
Posted: Friday, October 25, 2019 11:58:09 AM
Rank: Newbie
Groups: Member

Joined: 4/27/2018
Posts: 6
We have verified that the latest version (2019.2.69.0) works fine on Win 2019 outside a container. However, it doesn't work inside a Docker container.

Exception Details:
Exception Type: HtmlToPdfException
Message: Conversion failed. This WebView either has already been destroyed or is being destroyed.

StackTrace:
at EO.Internal.aj9.a[a](aox A_0)
at EO.Pdf.HtmlToPdf.ConvertHtml(String html, PdfDocument doc, HtmlToPdfOptions options)


We tried to enable the EO worker process (https://www.essentialobjects.com/doc/common/eowp.aspx) but still get the same error.

Has anybody made it work in a Docker environment with Windows 2019?

Would EO support docker environments?

Thanks.
eo_support
Posted: Friday, October 25, 2019 4:12:56 PM
Rank: Administration
Groups: Administration

Joined: 5/27/2007
Posts: 22,273
Hi,

We currently do not test our product in a Docker environment. We will see if we can support this in our 2020 release cycle.

Thanks!
System Monitor
Posted: Tuesday, January 7, 2020 9:48:42 AM
Hi,

When is your next release for 2020, and do you know if docker support is/will be included in it?

Thanks.
eo_support
Posted: Tuesday, January 7, 2020 10:43:04 AM
Hi,

Our initial 2020 build should be out in about two weeks. However, the initial build will not be officially supported or tested in a Docker environment. After the initial release, we will look into Docker and see if it is possible for us to support it. So we should have a definite answer on this issue in a month or two.

Thanks!
System Monitor
Posted: Tuesday, January 7, 2020 8:04:49 PM
Thank you for the reply. Looking forward to your Jan release, and we can test for you whether docker works ;-)
roger reynolds
Posted: Sunday, February 16, 2020 6:24:55 PM
Rank: Advanced Member
Groups: Member

Joined: 3/11/2014
Posts: 57
Hi, we're hitting the same problem trying to generate a PDF from a docker container using the 20.0.53.0 build, and also when using the InitWorkerProcessExecutable method to use eowp instead of rundll.


This is a pretty big problem for us. Any update on if/when we might expect this to be resolved?

Thanks
roger
eo_support
Posted: Tuesday, February 18, 2020 6:41:51 PM
Sorry about the delay. We have investigated this issue. The issue is not related to docker itself, but rather to the base OS image used by the docker container.

Specifically, EO.Pdf should work if the docker container is based on the full Windows image. However, you will run into issues if the base image is:

1. Nano Server. This is the default base image for .NET Core applications. Nano Server lacks many core features needed to run the browser engine (which is what EO.Pdf uses to render HTML). So it is not possible to run EO.Pdf on Nano Server;

2. Windows Server Core. This is the default base image for .NET Framework applications. Technically EO.Pdf does run on Windows Server Core, since it has almost everything EO.Pdf needs except for one thing: it only has one font and it does not seem to allow installing fonts. Lacking font support will cause EO.Pdf to fail to render text. Thus, for example, if you call ConvertUrl to convert Google's home page, pretty much all you will get in the result file is Google's logo. Obviously this makes it useless;

In theory you can use the full Windows image as the base image. But that probably defeats the purpose of using docker in the first place. So that leaves us in the unfortunate position of not having a good solution for this.
roger reynolds
Posted: Wednesday, February 19, 2020 12:08:09 PM
Thanks for that response. It mostly confirms what we've observed, which is that HtmlToPdf.ConvertUrl works with some container images and doesn't work on others.

Our observation is that it works on containers built from windows version 1607 (windows version 10.0.14393), even though the container image is microsoft/windowsservercore.

Other images we've tried that fail are mcr.microsoft.com/windows/servercore:1809 (windows version 10.0.17763.914), mcr.microsoft.com/dotnet/framework/runtime:4.8 (also an 1809 variant), and mcr.microsoft.com/windows/servercore:1903.

There are two ways we generate the PDF. One is to use the HtmlToPdf.ConvertUrl method. The other involves using a WebView to render the page, using its GetHtml method to capture the html for the page, and then using HtmlToPdf.ConvertHtml to generate the PDF. Thing is, the webview.GetHtml is where we lose. It comes back empty in those containers where ConvertUrl would fail, and works in those containers where ConvertUrl would succeed. So, the point is, it appears not to be the PDF conversion that fails, but the HTML rendering process. Does that sound right?

Can you speak to how the conversion is expected to fail? We've seen varying results. Sometimes a "web view is destroyed". Sometimes an object ref or other exception is thrown. Most times though, there are no exceptions but the PDF (or HTML in the case of webview.GetHtml) just comes back empty with no indication that anything went wrong.

Also, can you speak to whether the use of HtmlToPdf.ConvertUrl is thread safe and can be called concurrently by multiple threads? It seems to work, but i'd like to confirm that it should be OK and i'm not just getting lucky.

For example, one solution i'm working on is to provide a web service that accepts a url and some PDF options and then does the ConvertUrl to capture the PDF and stream it back to the requester. I limit the total number of concurrent conversions to some number, say 5, in order to keep memory usage from blowing up. In general, should this approach be expected to work?

The service is deployed to an azure web site running windows version 6.2.9200 (Windows 8 / Server 2012, build 9200), which should be a supported OS version based on your response. But even in that environment, i am finding that ConvertUrl works sometimes, and other times does that "works but empty" thing for the same page. This is even when i limit the number of concurrent requests to 1. I need to do a little more follow-up here to make sure i don't have some other issue, but i'd like to know if i should generally expect this approach to work or not.

Thanks
roger
eo_support
Posted: Wednesday, February 19, 2020 2:00:15 PM
Hi,

Yes. Your observation is correct. Our HTML to PDF converter works on earlier builds of the Windows Server Core container image with no problem at all. However, somewhere along the way, Microsoft decided to remove all non-crucial fonts from the Server Core container image:

https://blogs.windows.com/windowsexperience/2018/05/29/announcing-windows-server-2019-insider-preview-build-17677/

After that, it no longer works on the Windows Server Core container image. It still works fine on regular Windows Server Core, since fonts can be installed on the regular version.

The root of the problem is a crash in the rendering process. At some point in the page's life cycle, the render process will try to load the necessary fonts. This can be triggered either by a rendering request (think HTML to PDF) or by a background layout request (when the browser engine has just parsed the HTML it received and tries to build the layout of the page). The second case is the most common, since the HTML layout cannot be properly established without the proper font data. When the render process is not able to find the correct font, it will try to use Windows's font substitution mechanism to find a suitable fallback font. This process will work on any system that has a UI. However, on the Windows Server Core container image, this process will fail, causing a check crash. Obviously this crash can occur at any time since layout is done in the background all the time, so it is not limited to HTML to PDF. It is technically possible to avoid this check crash by explicitly supplying all the font data using the @font-face CSS rule, but obviously that's not a practical solution for most users.

As to ConvertUrl, yes, you can call it from multiple threads. This method is specifically designed to be thread safe. In fact, the static HtmlToPdf.Options object is thread static, so every thread has its own copy.

Hope this helps. Please feel free to let us know if you have any more questions.

Thanks!

roger reynolds
Posted: Wednesday, February 19, 2020 4:26:41 PM
Thanks for the detailed response. That is very helpful.
One thing that isn't thread safe on HtmlToPdf is the DebugConsole. I tried to use it for multiple concurrent requests and quickly discovered that it is global.

In my particular situation, fully specifying fonts using @font-face might actually be doable. But, to be clear, would that allow the page to render with the requested font, or would it fall back to the one fixed width font that is present on the core image? If the former, this might be something i would pursue. If the latter, it probably wouldn't be worth it. Either way, what would help immensely would be if, when it failed, it produced some diagnostic information like the specific font that was not found. Even better if it knew the css rule and location that referenced the failing font.

Thanks
roger

roger reynolds
Posted: Thursday, February 20, 2020 11:58:53 AM
I have good news.

Simply copying c:\windows\fonts\*.ttf to my container image appears to be enough to let both HtmlToPdf.ConvertUrl and webview.GetHtml work in the server core images where they were failing previously.

So, if what has been said so far is really true - that the only thing preventing this from working was the missing fonts, then i should be all good.
Time will tell on that...

i should say, the version of EO that i have working now is an older version of 2019, not the latest 2020.
I am not expecting any surprises when we upgrade to that, right?:)


Thanks again for the detailed responses. They were very helpful to us in narrowing in on a solution here.
But, to reiterate - the one thing that would have brought this to a resolution much faster, without requiring a support interaction, would have been if, rather than silently generating an empty file, the ConvertUrl method had thrown some sort of exception regarding the inability to resolve the font.


Anyway, hope the tip about copying the ttf files to the docker image helps somebody else out there.

roger


eo_support
Posted: Thursday, February 20, 2020 2:42:02 PM
Thanks for sharing the tip. Everything on the Internet says fonts cannot be added to the Windows Server Core container image. I guess you just proved the Internet wrong. :)

We did test the 2020 version by copying all fonts to the container and it does work.

As to your suggestion about throwing an exception about the missing font, this would be nice but is not practical to implement. The issue originates from a check failure inside the Chromium browser engine, and there are numerous such checks throughout Chromium's more than 30 million lines of code. We were able to pinpoint the exact point of failure by debugging into Chromium's source code. But obviously it is not doable for us to investigate every possible check failure. Nevertheless, your feedback is still extremely appreciated.

roger reynolds
Posted: Thursday, February 20, 2020 2:59:56 PM
it may very well be the case that once you are running in a container, there is no way to add fonts.
my solution involves adding on to the base image so the fonts are there when the container starts.
this may or may not be appropriate for some use cases, and it may or may not work for everybody.

i understand what you're saying about the chromium situation. i'm just saying that's why it took me 4 days and this support interaction to figure out what the problem even was. Once i knew that, the solution was easy.

I found eo_support to be very helpful and responsive, so no complaints there.

One thing i strongly recommend, and maybe that can be addressed by EO, is to provide a thread-safe DebugConsole mechanism on HtmlToPdf, as i mentioned in an earlier message.

thanks again
roger
eo_support
Posted: Thursday, February 20, 2020 3:32:21 PM
Our solution was similar. We use the COPY directive inside the Dockerfile to copy the fonts over, and that works too.
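For readers who want to try this, a minimal sketch of the font-copy approach might look like the fragment below. It is not the exact Dockerfile either party used. It assumes a hypothetical "fonts" folder in the build context that holds .ttf files taken from c:\windows\fonts on a full Windows installation, and it uses one of the Server Core tags mentioned earlier in this thread as the base image.

```dockerfile
# Sketch only, not the Dockerfile used in this thread.
# Assumption: the build context contains a "fonts" folder with .ttf files
# copied from c:\windows\fonts on a full Windows installation.
FROM mcr.microsoft.com/windows/servercore:1809

# Add the fonts at image build time so they are already present
# when the container starts.
COPY fonts/*.ttf C:/Windows/Fonts/
```

Whether a plain file copy is sufficient on every Server Core tag is not confirmed here; the thread only reports that it worked for the images that were tested.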

As to the Chromium situation, I hear what you are saying. I wish there was a faster way. But this is in fact what we do every day. Numerous times we spend days and nights debugging into the Chromium engine, only to find something so simple. As we spend more and more time debugging the browser engine, we do get better and better at navigating it and narrowing down the problems more quickly. We appreciate you very much for bearing with us in this process and also for working on this and sharing your findings with us.

We will look into the DebugConsole issue. At present we are not planning to have a separate DebugConsole for each thread, because generally a debug console catches output from all threads (think of Visual Studio's debug console). So if we switch it to a per-thread model, then we will have other users asking us to output all debug info to a single console. But we may be able to add some kind of locking/filtering mechanism so that you only get what you need.

