
PDF Accessibility, round 2
David A
Posted: Tuesday, August 8, 2023 3:39:47 PM
Rank: Newbie
Groups: Member

Joined: 2/23/2022
Posts: 9
Good day,

The last time we addressed this topic, you made some excellent improvements to support tagging in PDFs and to get the output to pass Adobe's accessibility checker. Now I have been instructed to meet the standards as defined in the "official spec". To do this, I downloaded the PAC 2021 tool from https://pdfua.foundation/en/pac-download/ and tested again. While I tried to solve the issues in HTML, it does seem that some additional work needs to be done.

In summary, the few issues I've found are primarily that borders, bullets, and underlines aren't tagged. Two other issues are that images aren't wrapped in a bounding box (BBox?) and the hyperlink title isn't treated as alternative text.

I downloaded and slightly modified a test page from a GitHub project that illustrates the different elements that need to be corrected. I will post the HTML at the end of the message. Thank you for the care you've put into adding this functionality, and I look forward to hearing your response.

da

Example HTML:
<html lang="EN-US">
<head>
<title>All-in-one PDF/UA Testcase</title>
<meta name="subject" content="PDF/UA all-in-one" />
<meta name="author" content="openhtmltopdf.com team" />
<meta name="description" content="An example containing everything for easy testing" />

<bookmarks>
<bookmark name="Simple Paragraphs" href="#para" />
<bookmark name="Lists" href="#lists">
<bookmark name="Ordered" href="#ordered" />
<bookmark name="Unordered" href="#unordered" />
</bookmark>
<bookmark name="Images" href="#images" />
<bookmark name="Links" href="#links" />
<bookmark name="Tables" href="#tables" />
<bookmark name="Backgrounds" href="#backgrounds" />
<bookmark name="Conclusion" href="#conclusion" />
</bookmarks>

<style>
body {
margin: 20px;
font-family: 'arial';
/* Font provided with builder. */
font-size: 15px;
}
</style>
</head>

<body>
<h1 id="title">All-in-one accessible (PDF/UA, Section 508, WCAG) PDF example</h1>

<h2 id="para">Simple paragraphs</h2>

<p>Paragraph one. Some text that goes over multiple lines. OK, this is getting to the required length. Need another
sentence to get there in the end.</p>
<p>Paragraph two. Some text that goes over multiple lines. OK, this is getting to the required length. Need another
sentence to get there in the end.</p>
<p>Paragraph three. <span style="font-weight: bold;">Some text in a span that's bold that goes over multiple lines.</span> OK, this is getting to the required length. Need another
sentence to get there in the end.</p>

<div>
Some text in a div with spans in the middle. <span style="font-weight: bold;">Some spans will be bold</span>, <span style="text-decoration: underline;">some underline</span>, <span><em>and some italic with an em tag</em>.</span>
</div>

<h2 id="lists">Lists</h2>

<h3 id="ordered">Ordered</h3>
<ol>
<li>One</li>
<li>Two</li>
<li>Three</li>
</ol>

<h3 id="unordered">Unordered</h3>
<ul>
<li>Bullet item one</li>
<li>And two</li>
<li>And three</li>
</ul>

<h2 id="images">Images</h2>
<img src="https://s.yimg.com/rz/p/yahoo_homepage_en-US_s_f_p_bestfit_homepage.png" title="Logo" />

<h2 id="links">Links</h2>
<p>This is an external link to the project <a title="The homepage"
href="https://openhtmltopdf.com">homepage</a>.</p>
<p>This is an internal link to the <a title="Go to top" href="#title">top</a> of the document.</p>

<h2 id="tables">Tables</h2>
<table>
<caption>Simple table example with fake data</caption>

<thead>
<tr>
<th>Col One</th>
<th>Col Two</th>
</tr>
</thead>

<tbody>
<tr>
<td>One</td>
<td>Two</td>
</tr>
<tr>
<td>Three</td>
<td>Four</td>
</tr>
<tr>
<td>Five</td>
<td>Six</td>
</tr>
</tbody>

<tfoot>
<tr>
<td>Footer1</td>
<td>Footer2</td>
</tr>
<tr>
<td>Footer3</td>
<td>Footer4</td>
</tr>
</tfoot>
</table>

<h2 id="borders">Borders</h2>
<div style="padding:10px;border-top:1px solid black">
Top border
</div>

<div style="padding:10px;border-left:1px solid black;height: 50px;">
Left border
</div>


<h2 id="backgrounds">Backgrounds</h2>
<div style="background-color: lightgreen; height: 40px; border-radius: 10px; border: 1px solid gray;">
<p>Some text on a background. Remember to use a good contrast if using background colors.</p>
</div>

<h2 id="conclusion">Conclusion</h2>
<p>Remember to keep it simple for PDF/UA compliance.</p>

</body>
</html>
eo_support
Posted: Monday, August 21, 2023 9:18:26 AM
Rank: Administration
Groups: Administration

Joined: 5/27/2007
Posts: 24,423
Hi,

Sorry that it took a while for us to respond. We were trying to work through all the issues, and we have been able to resolve all of them except one. In our next build the following issues will be resolved:

1. Path objects for link underlines not tagged;
2. Nesting of link annotations inside link structure elements;
3. Figures missing a bounding box;
4. Path objects generated from CSS borders not tagged;
5. Natural language of "Contents" entries in annotations.

We have not been able to find a reliable way to resolve the issues related to list numbers and bullets. The root of the problem is that a unique internal "node ID" is needed to properly mark a segment of the output. However, list numbers and bullets are not real nodes; they are pseudo nodes created by the rendering engine and thus do not have a unique node ID associated with them. Generating node IDs for them would involve significant changes to the browser rendering engine, which makes it not a viable solution. We will continue to work on this issue to see if we can find a workable solution.

Thanks
eo_support
Posted: Wednesday, August 23, 2023 9:53:04 AM
Rank: Administration
Groups: Administration

Joined: 5/27/2007
Posts: 24,423
Hi,

This is just to let you know that we have posted a new build (23.3.4) with the above changes.

Thanks!
David A
Posted: Tuesday, September 5, 2023 9:18:20 PM
Rank: Newbie
Groups: Member

Joined: 2/23/2022
Posts: 9
Good evening.

Sorry it took me a bit to get to testing this, but I have downloaded the latest version and I got it to pass on the test document, with the exception of the list items.

The plan right now is to go ahead with this and we can manually do some work to insert &bull; and 1., 2., etc. as needed.

I tried a couple of hacks to automate the process. One didn't work and the other sort of worked. I don't know if you want to try the ideas and see what, if anything, you can implement, or maybe they can help others who need this functionality.

The first idea was using CSS to generate the bullet items. A few style rules like:

Code: CSS
ol.accessible,ul.accessible{
  list-style-type: none;
  counter-reset: listIndex;
}
ul.accessible li::before{
  content:"\2022\00a0\00a0";
}
ol.accessible li::before{
  counter-increment: listIndex;
  content: counter(listIndex) ".\00a0\00a0";
}


And then you would just need to add the class to the lists like:
Code: HTML/ASPX
<ol class="accessible">
<ul class="accessible">


This renders in HTML just fine, but has the same problem of not being tagged in the PDF. It makes me wonder if any CSS content: additions would have the same problem.

Failing that, I turned to jQuery and inserted the bullets with code:

Code: JavaScript
<script src="https://code.jquery.com/jquery-3.7.1.js"></script>
<script>
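  // Run once the DOM is ready so every list on the page already exists.
  $(document).ready(function(){
    // Ordered lists: hide the generated numbers and prepend real text instead.
    $("ol").each(function(){
      var i=0;
      $(this).css("list-style-type","none");
      $(this).find("li").each(function(){
        i+=1;
        $(this).prepend(i + ".&nbsp;&nbsp;");
      });
    });

    // Unordered lists: hide the generated bullets and prepend a real bullet character.
    $("ul").each(function(){
      $(this).css("list-style-type","none");
      $(this).find("li").each(function(){
        $(this).prepend("&#x25cf;&nbsp;&nbsp;");
      });
    });
  });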
</script>


This works in both HTML and converted to PDF. The downside is that it changes everything on the page. I could modify the selectors to look for a class, but the intent was to be able to paste it into any page and have it work with no other refactoring.
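
As a rough sketch of the class-based variant mentioned above, the following uses plain DOM calls rather than jQuery and only touches lists that opt in. The class name `accessible` is borrowed from the CSS attempt earlier in the thread, the function names `markerText` and `addRealMarkers` are made up for illustration, and it has not been run through the converter or PAC, so treat it as a starting point only. It also walks direct `li` children only, on the assumption that nested lists should be numbered independently.

```javascript
// Hypothetical opt-in class; the script above targets every list on the page.
const OPT_IN_CLASS = "accessible";

// Build the marker text for one item: "1.", "2.", ... for <ol>, a bullet for <ul>.
function markerText(listTagName, index) {
  const gap = "\u00a0\u00a0"; // two non-breaking spaces, as in the script above
  return listTagName.toLowerCase() === "ol"
    ? (index + 1) + "." + gap
    : "\u25cf" + gap;
}

// Replace the generated markers of opted-in lists with real text nodes.
function addRealMarkers(doc) {
  const selector = "ol." + OPT_IN_CLASS + ", ul." + OPT_IN_CLASS;
  doc.querySelectorAll(selector).forEach(function (list) {
    list.style.listStyleType = "none";
    // Direct children only, so a nested list does not disturb the numbering.
    Array.from(list.children)
      .filter(function (child) { return child.tagName === "LI"; })
      .forEach(function (item, index) {
        const marker = doc.createTextNode(markerText(list.tagName, index));
        item.insertBefore(marker, item.firstChild);
      });
  });
}

// Only wire up the handler when running inside a page.
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", function () {
    addRealMarkers(document);
  });
}
```

Lists without the class keep their normal rendering, so this can be pasted into a page and enabled one list at a time.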

But anyway, thank you for your attention to this feature. We're going to move ahead.
eo_support
Posted: Wednesday, September 6, 2023 9:27:00 AM
Rank: Administration
Groups: Administration

Joined: 5/27/2007
Posts: 24,423
Thanks for sharing. That makes perfect sense. Content generated by the CSS "content" property also consists of "pseudo nodes", which are not visible through JavaScript. On the other hand, the nodes that you explicitly inserted with JavaScript are real nodes. Based on the current implementation, only a real node can have accessibility information associated with it and be tagged.
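
Since generated content is what ends up untagged, it may help to locate it before converting. The sketch below is one possible way to list elements whose `::before` or `::after` pseudo-element has a computed `content` value. The function names are hypothetical, list markers themselves are not covered, and whether each reported item actually fails a PDF/UA check is an assumption to verify with PAC.

```javascript
// True when a computed `content` value means the pseudo-element renders something.
function hasGeneratedContent(contentValue) {
  return contentValue !== "" && contentValue !== "none" && contentValue !== "normal";
}

// Collect elements whose ::before or ::after pseudo-element carries generated content.
// `win` and `doc` are passed in so the function can be exercised outside a browser.
function findGeneratedContent(win, doc) {
  const hits = [];
  doc.querySelectorAll("*").forEach(function (el) {
    ["::before", "::after"].forEach(function (pseudo) {
      const content = win.getComputedStyle(el, pseudo).content;
      if (hasGeneratedContent(content)) {
        hits.push({ element: el, pseudo: pseudo, content: content });
      }
    });
  });
  return hits;
}

// Inside a page, log every hit once styles have loaded.
if (typeof window !== "undefined" && typeof document !== "undefined") {
  window.addEventListener("load", function () {
    findGeneratedContent(window, document).forEach(function (hit) {
      console.log(hit.pseudo, hit.content, hit.element);
    });
  });
}
```

Each reported element is a candidate for replacing the CSS rule with a real text node, as in the script-based approach above.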

