It’s time for scholarly publishing to get accessible, writes Bill Kasdorf
The case for making publications accessible is so obvious and has been made so often that I won’t waste time here setting out those arguments. You know that accessibility is the right thing to do.
What you may not know is that making a publication accessible has recently become a whole lot more straightforward, and that your publications today are closer to being properly accessible than you realise.
The time has come for scholarly journal articles, my focus in this essay, to get accessible. This is important content!
It is particularly regrettable that scholarly journal articles are rarely published in properly accessible form. Although the audience for a given article is generally small because of its focus on a highly specialised topic, the article is not, like more ephemeral or entertaining literature, incidental or optional. The researchers, educators and students who need a given article really need it: it’s not a take-it-or-leave-it proposition. No substitutes are acceptable.
This becomes particularly apparent in the context of higher education. Most colleges and universities have what are typically called disabled student services (DSS) offices. These are small teams, often composed of a tiny staff supplemented by interns, whose job it is to provide an accessible version of resources assigned in a given course when a given student can’t access the standard print or online version that all the other students use. The institutions are legally obligated to do this. And they’re prevented from sharing the results.
It’s called remediation, and it’s a shockingly manual and laborious process because the files the DSS office has to work with are so inadequate. I recently asked Jamie Axelrod, head of the DSS office of Northern Arizona University and the president of the association of such offices in the United States, how significant a problem journal articles are. It’s huge, he said: about half of the publications they have to remediate are journal articles. I asked him how much work it would save them if those articles came in the form of EPUB 3. Again, huge: compared to what they usually have to work with, namely a PDF, getting a properly tagged EPUB 3, even if not fully accessible, would reduce their work by 80 per cent for most articles, and 50 per cent even for STM articles.
What? PDFs aren’t what’s needed?
Well, truth be told, DSS offices typically do want to get PDFs because the alternative is usually print, which they have to scan, clean up, and tag by hand. At least getting a PDF generally gives them usable text. This falls into the “better than nothing” category. Publishers don’t realise how much remediation can be required of their PDFs.
The current baseline standard format for the interchange of accessible content is EPUB 3, as specified by EPUB Accessibility 1.0, issued in January 2017.
The key requirements:
• It’s based on web accessibility specifications, e.g., Web Content Accessibility Guidelines (WCAG), which is the foundation for most accessibility specifications globally, and WAI-ARIA, which enables the addition of semantics to content to guide assistive technology;
• An accessible EPUB is a fully conformant EPUB 3, the current e-publication standard;
• It requires proper structural semantics, namely HTML5 markup; this generally provides most of the required semantics, minimising the need for additional ARIA markup;
• It adds features required for publications, which are often more complex than web pages (e.g., they contain multiple content documents and other resources), such as a default reading order and the ability to navigate between documents;
• It requires image descriptions;
• Math should be provided as MathML; and
• There’s a bit of metadata required to document the accessibility.
These features enable assistive technology to properly render the publication for print disabled users. Obviously there are other requirements for more complex content; educational platforms, for example, need to deal with complicated multimedia and interactive content.
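To make a few of the requirements above concrete, here is a minimal sketch in Python of the kind of check a production team might run on an EPUB 3 content document and its package file. It is illustrative only, not a conformance checker. The function names and sample files are hypothetical, and the four schema.org property names are my reading of what EPUB Accessibility 1.0 asks for, so check them against the specification itself.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
MATHML = "http://www.w3.org/1998/Math/MathML"
OPF = "http://www.idpf.org/2007/opf"

# Assumption: the schema.org properties treated here as required metadata.
REQUIRED_META = (
    "schema:accessMode",
    "schema:accessibilityFeature",
    "schema:accessibilityHazard",
    "schema:accessibilitySummary",
)


def images_missing_alt(xhtml_text):
    """Return the src of every <img> with no alt text, for human review.

    Simplification: purely decorative images may legitimately carry an
    empty alt attribute, so this is a list to look at, not a list of errors.
    """
    root = ET.fromstring(xhtml_text)
    return [
        img.get("src", "")
        for img in root.iter(f"{{{XHTML}}}img")
        if not (img.get("alt") or "").strip()
    ]


def count_mathml(xhtml_text):
    """Count equations that are delivered as MathML <math> elements."""
    root = ET.fromstring(xhtml_text)
    return sum(1 for _ in root.iter(f"{{{MATHML}}}math"))


def missing_accessibility_metadata(opf_text):
    """Return the expected schema.org properties absent from the package file."""
    root = ET.fromstring(opf_text)
    present = {meta.get("property") for meta in root.iter(f"{{{OPF}}}meta")}
    return [name for name in REQUIRED_META if name not in present]


# Hypothetical sample files, kept tiny for illustration.
SAMPLE_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <section>
      <h1>Results</h1>
      <img src="fig1.png" alt="Line chart: yield rises with temperature"/>
      <img src="fig2.png"/>
      <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi></math>
    </section>
  </body>
</html>"""

SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata>
    <meta property="schema:accessMode">textual</meta>
    <meta property="schema:accessibilityFeature">MathML</meta>
  </metadata>
</package>"""

if __name__ == "__main__":
    print(images_missing_alt(SAMPLE_XHTML))
    print(count_mathml(SAMPLE_XHTML))
    print(missing_accessibility_metadata(SAMPLE_OPF))
```

A check like this only finds what is absent. Whether an image description actually conveys what the image is meant to convey still needs a person who knows the subject.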
But scholarly journal articles are actually relatively simple. They have a very straightforward and well-known structure. They have rich metadata. They usually consist of only a single document. They’re typically produced by an XML-based workflow that enables the structural components of the article to be easily distinguished and tagged, and for those that have math, the vast majority of the equations are MathML at some point in the workflow.
This is mostly what we do already! Accessibility currently suffers from the misconception that it requires special workflows, special formats, special expertise and tons of extra effort. Those things used to be true.
But today, the requirements for accessibility are based on standards and specifications that are commonly used in the publishing ecosystem in general. The biggest example is the format known as ‘DAISY’, a unique format and mark-up scheme developed by the DAISY Consortium to facilitate the use of assistive technology, which has been replaced by EPUB. The DAISY Consortium now recommends EPUB 3, as described above, instead.
What this means is that conforming to those requirements is suddenly much easier. The vendors who produce the vast majority of scholarly journal content are familiar with HTML markup (and with how to convert to HTML from the JATS XML most journals use). They are also familiar with MathML; most of them use MathML as the basis for how they create math, even if they don’t deliver the MathML to their customers. And those same vendors also routinely produce EPUB 3 files by the thousands.
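As a rough illustration of how close JATS already is to accessible HTML, the sketch below maps a tiny JATS fragment to HTML5 structural elements using only the Python standard library. It is not any vendor’s actual conversion: the function name and sample article are hypothetical, only a handful of JATS elements are handled, and inline markup is flattened to plain text for brevity. It does rely on JATS providing an `<alt-text>` element for figures, which is where an image description can travel.

```python
import xml.etree.ElementTree as ET

# ElementTree spells namespaced attributes as "{namespace}localname".
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def jats_body_to_html(jats_text):
    """Convert <sec> elements in a JATS <body> to HTML5 <section> elements.

    Simplifications (assumptions of this sketch): only top-level sections,
    titles, paragraphs and figures are handled, and inline markup such as
    <italic> is reduced to its text.
    """
    article = ET.fromstring(jats_text)
    html_body = ET.Element("body")
    for sec in article.iterfind("./body/sec"):
        section = ET.SubElement(html_body, "section")
        for child in sec:
            if child.tag == "title":
                ET.SubElement(section, "h2").text = "".join(child.itertext())
            elif child.tag == "p":
                ET.SubElement(section, "p").text = "".join(child.itertext())
            elif child.tag == "fig":
                figure = ET.SubElement(section, "figure")
                graphic = child.find("graphic")
                img = ET.SubElement(figure, "img")
                img.set("src", graphic.get(XLINK_HREF, "") if graphic is not None else "")
                # The JATS image description becomes the HTML alt attribute.
                img.set("alt", (child.findtext(".//alt-text") or "").strip())
                caption = child.find("caption")
                if caption is not None:
                    figcaption = ET.SubElement(figure, "figcaption")
                    figcaption.text = "".join(caption.itertext()).strip()
    return ET.tostring(html_body, encoding="unicode")


# Hypothetical sample article, kept tiny for illustration.
SAMPLE_JATS = """<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <body>
    <sec>
      <title>Results</title>
      <p>Yield rose with <italic>temperature</italic>.</p>
      <fig id="f1">
        <caption><p>Yield by temperature.</p></caption>
        <alt-text>Line chart showing yield rising with temperature</alt-text>
        <graphic xlink:href="fig1.png"/>
      </fig>
    </sec>
  </body>
</article>"""

if __name__ == "__main__":
    print(jats_body_to_html(SAMPLE_JATS))
```

The structural elements carry over almost one for one, which is why the semantics assistive technology needs are largely present once the content is in HTML5.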
But not for journal articles. While EPUBs are now routinely produced for books, and while most educational content is now produced by or delivered by platforms that are designed to align with EPUB 3, journal articles have been stuck in PDF or plain HTML.
This is about to change. A case in point: Atypon, one of the major journal hosting services (its Literatum platform is also the basis for many publishers’ platforms), is integrating a modern EPUB reader into the next release of Literatum, and plans to have journal articles that conform to its standard requirements (its JATS XML spec) automatically generated as EPUB as an available deliverable. Those EPUBs are even expected to include MathML, because all the math in Literatum is available as MathML. This means that potentially millions of journal articles will soon be able to be made available as EPUB 3.
This was not done in the interest of accessibility. I should say at this point that I have no connection with Atypon; they are not a client of mine. But when I heard what they were up to, I was thrilled, and told them so. I pointed out to them that this meant that many of those articles would now be accessible, and others would need just a few more steps to become properly accessible. They hadn’t realised that. When I told Jamie Axelrod about it, he was thrilled. Just getting those EPUBs, instead of PDFs, will make a huge difference.
Accessibility should be mainstream – and it almost is
Why is it important that this development from Atypon amounts to almost accidental accessibility? Because it indicates how mainstream accessible publishing should be. How mainstream it is.
Aside from some simple metadata, image descriptions are the main missing aspect. This is a big issue in general, because creating image descriptions has not been part of normal editorial and production workflows. It turns out that there’s more to it than there seems to be: subject matter expertise is often required, because the image description shouldn’t just say what the image is, it should say what it is trying to convey in the publication.
For most types of publishing, this is still a challenge. But it should not be such a challenge for scholarly journals. Why? Because the requirements for submission of articles for publication in scholarly journals are far more comprehensive and systematic, and require much more of authors, than for almost any other sector of publishing. It should not be difficult for journals to add the requirement that authors submit an image description for each image. As a bonus, those descriptions can make the images and the article more discoverable.
And especially for a scholarly or research article, who knows better what that image is intended to convey? I would expect that authors would insist on having control of image descriptions. While the descriptions will no doubt require some copyediting, and while editorial staff will need to understand what constitutes a good image description, getting the image descriptions from the authors in the first place is key. Let’s do it.
I don’t think the day is too far off when providing image descriptions will be routine for scholarly journal articles, when manuscript submission and peer review systems will incorporate them routinely, when funders will routinely require them.
We’ve already got the MathML. We’ve already got the rigorous structure and rich metadata. We know how to make EPUBs. We already know how to do all this. So let’s do it.
There is no reason anymore why scholarly journal articles, already ‘born digital’, can’t be ‘born accessible’.
Bill Kasdorf is principal at Kasdorf & Associates