
Filling in a form field with Swedish characters (or how to set the encoding?)
Maxim
Posted: Monday, August 17, 2015 11:55:07 AM
Rank: Advanced Member
Groups: Member

Joined: 12/18/2013
Posts: 67
Hi,

I've found strange behavior when setting a form field value. When I set an English value on the form field, everything works. But if I try to set a value with Swedish characters, they are not added correctly.

This is the way I add the value:

Code: C#
var pdfPath = @"D:\...\FileToBeFilledIn.pdf";
var eoPdf = new EO.Pdf.PdfDocument(pdfPath);
eoPdf.Fields["Field name"].Value = "Då Dä Kö Råd Föl Yes!";
eoPdf.Save(@"D:\...\FilledInFile.pdf");


What I get is the following. When I open the document with Adobe Reader, I see this text in the field: "D D K Rd Fl Yes!". See how the Swedish characters are not shown? If I click in the field to change it, I see "D[] D[] K[] R[]d Föl Yes!", where [] means a square. So it looks like there is some problem with encoding.

Now we come to the interesting part. If I select "D[] D[] K[] R[]d Föl Yes!" text using mouse, copy it and then paste into another field in this document, Swedish characters ARE shown there.

If I open the document using Foxit PhantomPDF, the field has this value: "D D K Rd Fl Yes!". If I click on the field, the text changes to "Då Dä Kö Råd Föl Yes!" (the correct text), but when the field loses focus (I click somewhere else), the Swedish characters disappear again.

And if I open the document in MS Edge, then it also doesn't show Swedish characters in the field, only "D D K Rd Fl Yes!".

What is the problem? Maybe somehow the encoding needs to be set? I tried PdfSharp library on the very same document and the field gets Swedish characters and shows them without problems. But we use your library in the project, so I don't want to switch to another solution.

Could you tell me what's wrong in my code?

Thanks in advance,
Maxim
Maxim
Posted: Tuesday, August 18, 2015 1:49:04 AM
I sorted it out myself. For those of you who have the same problem and don't know how to fix it, here is the solution:

Code: C#
var field = pdf.Fields[name];
field.Value = value;
field.Font.Name = "Arial"; // <== fixes the problem
eo_support
Posted: Tuesday, August 18, 2015 6:00:11 PM
Rank: Administration
Groups: Administration

Joined: 5/27/2007
Posts: 24,071
Thank you very much for the update! That does make sense. By default, Windows uses font substitution for characters that the selected font has no data for. However, the same substitution logic does not apply inside a PDF. The "Arial" font covers a wide range of characters, including the Swedish ones, so setting the font name to "Arial" would fix the problem. Thanks again for sharing!
Maxim
Posted: Monday, February 26, 2018 10:07:03 AM
Hello.

Unfortunately, setting the font to Arial fixes the issue but introduces another problem in the default Google Chrome PDF viewer. Almost all our clients use Google Chrome, so we can't tell them to use another browser.

Please read the issue description below. I hope there is a fix for this *without* a product update, because with newer versions of the library we get another issue that is not directly related to EO.Pdf (and no, it's not because we don't want to upgrade the license; we did that several months ago, so it's not a problem). But if the only fix is to use a new version of EO.Pdf, then we'll have to update.

We use version 17.2.43.0. The latest version, 18.0.70, behaves the same with respect to this issue.

I found this topic about the EO.Pdf charset where you had to release a new build to fix an issue with Portuguese and Greek characters; maybe the same fix is needed for Swedish.

As you can read in the previous posts in this topic, I originally ran into this issue when I tried to insert Swedish characters into a field. Setting the Arial font on the field fixed that (the characters started to be shown in Google Chrome).

But it turned out that if you then type or paste Swedish characters into this field in Google Chrome (for example, our clients want to change the PDF after we have generated it), all Swedish characters (å, ä, ö) and some English characters (e.g. q, y, u, k, z, x) are either not shown at all (replaced with a space) or replaced with a different character (for example, "W" is replaced with "t").

I created a PDF file which you can fill in using this code (for example in LinqPad) and see the issue, you can download the file from this link. If you want to get it by email, please let me know.

Code: C#
var file = @"C:\Dev\Swedish characters problem.pdf";

var pdf = new PdfDocument(file);

var name1Field = pdf.Fields["Client name 1"];
name1Field.Value = "Åsa Dägström";

var name2Field = pdf.Fields["Client name 2"];
name2Field.Font.Name = "Arial";
name2Field.Value = "Åsa Dägström";

pdf.Save(@"C:\Dev\Filled in.pdf");


Then you can try to edit the original PDF file in Google Chrome (before it was filled in with EO.Pdf) and you'll see that there are no issues which happen in the PDF field after EO.Pdf changes their font to Arial. So after EO.Pdf sets the font to Arial, Google Chrome can't fill in this field without issues.

So in our case, changing the font to Arial fixed one issue but added another :(

Could you please have a look at this? Is there a setting I can apply in code during generation to work around it, or is there something you need to fix in the next build?

Thank you,
Best regards,
Maxim
eo_support
Posted: Monday, February 26, 2018 10:37:23 AM
Hi,

We investigated this issue after your original post, and it is a known limitation of PDF files. Please try one of these fonts and see if it works: Times-Roman, Helvetica or Courier.

The root of the problem is font subsetting. To make a PDF file portable, font data for all characters used in the file is embedded directly in it. So, for example, if a PDF file contains only the three letters "A", "B" and "C" and the font is "Arial", then the font data for these three letters is embedded in the PDF file, and even if the target computer does not have the "Arial" font, the file can still be displayed correctly. This also means that the PDF file has no font data for the letter "D".

This poses a problem for input fields: the user can enter characters that weren't already in the PDF file. When that occurs, the newly entered characters won't be displayed correctly because there is no font data for them in the PDF file. Some PDF viewers will ask the system to provide the font data in such cases and still display the characters correctly, but others won't, since the PDF specification does not require this.
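To make the subsetting idea concrete, here is a toy model in plain C#. It is only a sketch of the reasoning above, not EO.Pdf code and not how a real PDF writer stores glyphs; the class and method names (`FontSubsetSketch`, `BuildSubset`, `MissingGlyphs`) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FontSubsetSketch
{
    // Toy model: the "embedded subset" is simply the set of characters
    // that appeared in the text when the file was generated.
    public static HashSet<char> BuildSubset(string embeddedText) =>
        new HashSet<char>(embeddedText);

    // Characters the user types later for which the subset has no glyph data.
    public static string MissingGlyphs(HashSet<char> subset, string typed) =>
        new string(typed.Where(c => !subset.Contains(c)).Distinct().ToArray());
}
```

With a subset built from "ABC", typing "ABCD" leaves "D" without glyph data, which mirrors the "A", "B", "C" example above. A viewer that does not fall back to a system font would have nothing to draw for that character.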

One exception to the font-embedding rule is the so-called "standard 14" fonts. These are fonts that the PDF specification requires every PDF viewer to display correctly even if no font data has been embedded in the file at all. The standard 14 fonts are in fact just three text font families (the ones listed at the beginning of this post), because each variation counts as a separate font (for example, "Courier" and "Courier Bold" are two different fonts) and the list also includes symbol fonts. As such, it is recommended to set input fields to those fonts.
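For reference, the fourteen names can be kept in a small lookup so that code can check a font name before assigning it to a field. This is a minimal sketch: the `Standard14Fonts` class is hypothetical, the names are the PostScript names from the PDF specification, and whether a given library accepts every one of these exact strings is an assumption (this thread only mentions Times-Roman, Helvetica and Courier).

```csharp
using System;
using System.Collections.Generic;

public static class Standard14Fonts
{
    // PostScript names of the standard 14 fonts as listed in the PDF specification.
    private static readonly HashSet<string> Names = new HashSet<string>(StringComparer.Ordinal)
    {
        "Times-Roman", "Times-Bold", "Times-Italic", "Times-BoldItalic",
        "Helvetica", "Helvetica-Bold", "Helvetica-Oblique", "Helvetica-BoldOblique",
        "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
        "Symbol", "ZapfDingbats"
    };

    // True when fontName is one of the standard 14 (case-sensitive match).
    public static bool Contains(string fontName) =>
        fontName != null && Names.Contains(fontName);
}
```

For example, `Standard14Fonts.Contains("Helvetica")` is true while `Standard14Fonts.Contains("Arial")` is false, which is why Arial needs embedded font data and Helvetica does not.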

Please let us know if this resolves the issue for you.

Thanks!
Maxim
Posted: Monday, February 26, 2018 11:42:50 AM
Hi,

Thank you for the explanation!

Unfortunately, using Times-Roman, Helvetica or Courier doesn't help us: the Swedish characters are then not shown in the input field UNTIL you put focus on the field.

If I set any of the fonts you mentioned, set the field value using EO.Pdf and then open the PDF, I don't see the Swedish characters in the field. But if I put focus on this field (click into it), the Swedish characters appear. And if I change the text in any way in Google Chrome (even just adding a space) and then move focus to another field, the Swedish characters added by EO.Pdf stay visible. So I would say the issue is in the initial view. After that, I can add other Swedish characters without any issues.

You can open this PDF file in Google Chrome to see what I describe above. Try to click into the first 2 fields from top with the values, the value will show the Swedish characters. And if you add a space or any other letter in a field and then click somewhere else, then right text will remain.

These 2 fields are filled in with this code:

Quote:
var pdf = new EO.Pdf.PdfDocument(file);

var name1Field = pdf.Fields["Client name 1"];
name1Field.Font.Name = "Helvetica";
name1Field.Value = "Åsa Dägström 1";

var name2Field = pdf.Fields["Client name 2"];
name2Field.Font.Name = "Courier";
name2Field.Value = "Åsa Dägström 2";

pdf.Save(filledInFile);


I guess it's not allowed to post code that uses another PDF library on this forum, but if I fill in these fields with that library without setting any font, I don't have any issues. If you would like to have a look at it, please tell me where I should send it.

So maybe it's something that can be fixed in EO.Pdf? If not, could you please suggest what else I can try?
eo_support
Posted: Monday, February 26, 2018 5:03:57 PM
Hi,

You can send us the test file generated by the other PDF library and we will look into it to see what the difference is. See here for how to send test files to us:

https://www.essentialobjects.com/forum/test_project.aspx

Thanks!
Maxim
Posted: Tuesday, February 27, 2018 3:22:35 AM
Hi,

I emailed the files, please have a look!

Thank you,
Maxim
eo_support
Posted: Tuesday, February 27, 2018 10:40:19 AM
Thanks for the test files. I believe we have found the root of the problem and we will be working on an updated build. We will reply here again as soon as the new build is posted.
Maxim
Posted: Tuesday, February 27, 2018 10:53:40 AM
Hi,

Sounds good! Waiting for an update from you then!
eo_support
Posted: Tuesday, February 27, 2018 3:57:33 PM
Hi,

This is just to let you know that we have posted a new build that should fix this problem. You can download the new build from our download page. Please take a look and let us know how it goes.

Thanks!
Maxim
Posted: Wednesday, February 28, 2018 6:34:20 AM
Hi,

Thank you for the new build.

It fixed the second issue I wrote about in email (which you called "Fixed setting the value of a form field that is not explicitly associated to a page in the PDF file has no effect issue" in your change log), but the first issue (the one this topic is about, you called it "Fixed using non-standard characters in form field value causing those characters not properly displayed issue" in the change log) is not fixed. I see some work was done (now Swedish characters are shown, but without spaces it seems), but it doesn't fix the issue completely.

Please see this code:

Quote:
var pdf = new EO.Pdf.PdfDocument(file);

var name1Field = pdf.Fields["Client name 1"];
name1Field.Font.Name = "Helvetica";
name1Field.Value = "Åsa Dägström 1 (with Helvetica font set in code) Depå utanför försäkring";

var name2Field = pdf.Fields["Client name 2"];
name2Field.Font.Name = "Courier";
name2Field.Value = "Åsa Dägström 2 (with Courier font set in code) Depå utanför försäkring";

var comment1Field = pdf.Fields["Comment Helvetica"];
comment1Field.Font.Name = "Arial";
comment1Field.Value = "Åsa Dägström 3 (with Arial font set in code)";

var comment2Field = pdf.Fields["Comment Arial"];
comment2Field.Value = "Åsa Dägström 4 (no font set in code)";

pdf.Save(filledInFile);


The result is that the texts "Åsa Dägström 1 (with Helvetica font set in code) Depå utanför försäkring" and "Åsa Dägström 2 (with Courier font set in code) Depå utanför försäkring" are missing many spaces UNTIL the field receives focus. Please have a look at this image or the PDF file. Google Chrome, Foxit PhantomPDF and Adobe Acrobat Reader all show these values with the same issue.

Now after your explanation above I understand that I must have Helvetica font for Swedish files (and this is why Swedish PDF files usually have it set by default).

Could you please have a look at this again?

Thanks,
Maxim
eo_support
Posted: Wednesday, February 28, 2018 9:14:08 AM
Hi,

We tested your new code and we do not see any problems here with the resulting PDF file. Can you email us a few screenshots along with the test files so that we can investigate further?

Thanks!
Maxim
Posted: Wednesday, February 28, 2018 9:18:10 AM
Maxim wrote:
Have a look at this image please or the PDF file. Google Chrome and Foxit PhantomPDF and Adobe Acrobat Reader show these values with the same issue.


Is Dropbox blocked in your environment? I gave a link to the screenshot in the previous post (the _this image_ text is a link) and a link to the PDF (the _PDF file_ text is a link).

I'll of course send these files (images and PDF) via email now.

Thank you for trying to solve the issue!
eo_support
Posted: Wednesday, February 28, 2018 9:39:30 AM
Yes. We see the problem now. Sorry that we missed them in your previous post. We are looking into this and will reply again as soon as we find anything.
Maxim
Posted: Wednesday, February 28, 2018 9:52:40 AM
I've also just sent a video (less than 3 minutes) showing the whole process of building the file and the issue to your support email.
eo_support
Posted: Wednesday, February 28, 2018 7:23:20 PM
Hi,

We tested your code here and everything seems to work fine. There might be an issue with the space character in your source code. There are many different space characters in Unicode:

http://jkorpela.fi/chars/spaces.html

The one you have in your source code may be one that is not recognized by the font. To see if this is the problem, try deleting the space in question and pasting in a "good" space character from somewhere else (for example, from the value of the third or fourth field), and see if that works.
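A quick way to check a string for such characters without retyping it is to scan it and report every space-like character that is not the ordinary U+0020 space. This is a minimal sketch in plain C#; `SpaceInspector` and `FindUnusualSpaces` are hypothetical names, and treating Unicode format characters (such as the zero-width space) as "space-like" is an assumption made here for convenience.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class SpaceInspector
{
    // Returns the index and code point of every whitespace or invisible
    // format character in the text that is not the ordinary space U+0020.
    public static List<(int Index, string CodePoint)> FindUnusualSpaces(string text)
    {
        var result = new List<(int Index, string CodePoint)>();
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            bool spaceLike = char.IsWhiteSpace(c)
                || CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.Format;
            if (spaceLike && c != ' ')
            {
                result.Add((i, $"U+{(int)c:X4}"));
            }
        }
        return result;
    }
}
```

An empty result for the field value means every space in it is a plain U+0020, which rules this explanation out; a non-empty result points at the exact positions to replace.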

If the problem continues, please try to create a complete test project and send the test project to us in zip format. We will try to run that here again to see if we can find anything.

Thanks!
Maxim
Posted: Thursday, March 1, 2018 3:18:26 AM
Hi,

I created a console app which generates the file with EO.Pdf and another lib I wrote you about by email. I've just sent it to your support email.

I now put the SAME value into ALL 4 fields in the PDF template. And the SAME space is used between the words in the value which is then inserted.

The space is then NOT shown next to words that contain Swedish letters, but IS shown after words with English letters only. This happens in the fields with the Helvetica/Courier fonts, which, as I now know, must be used in Swedish files. The field with the Arial font doesn't have this problem. But Arial can't be used, because it is not one of the "standard 14" fonts.

I also sent a new video of the issue (19MB) via the second mail. If you watch it, it's 100% clear that there is an issue. I show there that the same space is used in the value which is inserted then into the fields. And some spaces are shown and some are not. Another lib which is used in the test project as well inserts the value correctly.

Please have a look at this.

Thank you!
eo_support
Posted: Thursday, March 1, 2018 1:14:30 PM
Thanks for the test file. I believe we have found the root cause and we will be sending a test build to you either tomorrow or early next week for you to verify it on your end.
Maxim
Posted: Thursday, March 1, 2018 1:35:17 PM
Hi,

Sounds great! Thank you!

