Eliminating Footnotes Makes Philosophy More Accessible

It’s 2019. Computers can drive cars, operate stores, and outperform humans in sophisticated games. However, computers cannot correctly read a PDF with footnotes. Alas, many people have to rely on their computers to read PDF papers. So, many people face significant obstacles while trying to consume written research. That seems bad. Insofar as we can prevent this bad outcome without making matters worse, we should. 

1. What it’s Like to Listen to a PDF

For those who are not familiar with auditory reading, imagine that you want to read a paper, but you cannot—for whatever reason—read the paper visually. You can, however, listen to it. So, you open the document on your computer, smartphone, or tablet and begin using standard text-to-speech features. As it reads the abstract, you think, “The software voice is a bit robotic, but I’m comprehending everything. This is great!” However, as the device begins reading the hook of the paper, it stops (mid-sentence) and starts reading something else entirely:

Accepted for publication on Such-and-such. Funding from So-and-so. Page Such-and-such. This content downloaded from 13.421.41.147 on Such-and-such a day at Such-and-such a time. All use of this document is subject to https://journal.org/terms. Journal of Such-and-such. Volume this. Issue that. Footnote 1. The authors would like to thank a long list of people…

1.1  What just happened?

Text-to-speech software reads all of the text on a page before it moves on to the next page. So, when a sentence spans from one page to the next page, the text-to-speech stops—often mid-sentence—to read the not-yet-read text in the margins, header, and footer (Figures 1 and 2).

Figure 1
Figure 2

1.2  Back to the Imaginary Case

When the text-to-speech software resumes reading the sentence that spanned the first and second page, you no longer remember the first part of the sentence. While you try to remember it, the text-to-speech software is still reading. So, when you finally remember what was going on prior to the interruption, the text-to-speech is most of the way through the second page of the paper. You desperately try to imagine what you missed from the second page of the paper while also trying to pay attention to what you are currently hearing. Then, another page break interrupts you to tell you about another copyright notice, paper title, footnote, page number, etc.

Arrgh!

That is what it is like to listen to PDF documents using the text-to-speech software available to the average consumer: it’s cognitively taxing and likely to cause gaps in comprehension.

2.  The PDF Problem

Some books are available in digital form. And some digital books are amenable to auditory reading — e.g., EPUB, Amazon’s Kindle format, Apple’s iBooks format, etc. And some academic books are accompanied by a proper audiobook — e.g., Amazon’s audiobooks, Apple’s audiobooks, etc. These digital book formats are great for auditory reading (even if they make it difficult to cite direct quotes).

2.1  Articles vs. Books

However, research articles and chapters are rarely available in formats that are amenable to auditory reading. They are almost exclusively available in portable document format (PDF). And PDF documents are more difficult for software to read correctly. Even devices that are known for their accessibility features—like Apple’s devices—cannot reliably read a PDF, start-to-finish, without some kind of problem.

2.2  Scanned PDF documents

When you scan a book or paper, the scanned file usually lacks text encoding. It’s just a picture of each page. So even though you can see and read the symbols on the PDF document, your device cannot. To allow your device to see and read the text, you need optical character recognition (OCR) software. OCR software finds the symbols in the PDF image of each page and encodes the text information into the PDF file. (Alas, good OCR software is expensive — e.g., Adobe Acrobat Pro comes with OCR, but Adobe products are infamously expensive. And affordable OCR tools are comically bad. So unless you have lots of scanned PDF documents, OCR software is probably not worth the investment.)

3.   The Non-Body Text Problem

Lots of academic PDF documents have marginal text around the body text. E.g., many academic PDF documents contain copyright information in the margins, journal information in the header, and paper information in the footer.

All of this non-body text is confusing to text-to-speech software. Text-to-speech software just reads the text it finds. It cannot tell the difference between body text, margin text, header text, and footer text. So, text-to-speech software just reads all of the text on each page in some order or another. This is why text-to-speech often jumps back and forth between main body text and non-body text—sometimes mid-sentence.
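The behavior can be illustrated with a toy sketch. (The block data and y-coordinates below are invented for illustration; a real PDF extractor’s output varies by document. The point is only that extracted text blocks carry positions, not roles.)

```python
# A minimal sketch: a text-to-speech engine receives a page's text blocks
# with positions but no role labels, so "read the page" means reading
# headers, footnotes, and footers too. All data here is invented.
blocks = [
    (720, "Journal of Such-and-such, Vol. 1"),       # header text
    (400, "...and so the argument continues on"),    # body text
    (60,  "1. The authors would like to thank..."),  # footnote text
    (30,  "Page 7"),                                 # footer text
]

# Reading top to bottom, with no way to tell body from non-body:
spoken = [text for _, text in sorted(blocks, key=lambda b: -b[0])]
# The header is spoken first, and the footnote and page number are spoken
# before the interrupted sentence resumes on the next page.
```

Nothing in the block list marks which entries are body text, so any page-by-page reader interleaves them exactly as described above.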

4.  A Quasi-Solution Hack

I have found only one developer that has created consumer-ready software that can ignore header and footer text. Their Mac app is vBookz PDF Voice Reader and their iOS app is Voyzer Voice Reader. (After asking many colleagues for years, I still do not know of Windows or Android apps that can do this.) Alas, these apps ignore headers and footers only when users manually indicate the size of the document’s headers and footers (Figures 3, 4, and 5).

Figure 3
Figure 4
Figure 5

While this header- and footer-cropping hack is helpful, it does not overcome the problems posed by marginal text or footnotes.

5.  The Footnote Problem

Footnotes are just one kind of non-body text. However, footnotes pose a unique problem for text-to-speech software. This is because footnotes tend not to be of equal length and tend not to be equally distributed throughout papers. As a result, the size of the footers varies from page to page in most papers with footnotes (Figure 6). So, even if one manually crops out some portion of each page in apps like vBookz and Voyzer, one will erroneously crop out some body text on some pages and fail to crop out some footnote text on others.

Figure 6: footnotes resulting in a different-size footer on each page
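To see why a single fixed crop fails, consider a toy sketch. (The two pages and their coordinates are invented for illustration; the point is that one fixed cutoff cannot fit pages whose footers differ in height.)

```python
# Each hypothetical page is a list of (y_from_bottom, text) blocks.
# Page 1 ends with a tall footnote; page 2's body text runs lower.
page1 = [(400, "Body text of page 1."), (120, "1. A long, tall footnote...")]
page2 = [(400, "Body text of page 2."), (90, "A final body sentence.")]

CUTOFF = 100  # one fixed footer cutoff applied to the whole document

def crop_footer(page, cutoff):
    """Keep only blocks above the cutoff, as a fixed-margin crop would."""
    return [text for y, text in page if y > cutoff]

# Page 1: the footnote (y=120) survives the crop and gets read aloud anyway.
# Page 2: real body text (y=90) is cropped away and never read.
```

Raising the cutoff loses more body text; lowering it lets more footnote text through. With variable-height footers there is no single correct value.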

Of course, footnotes are not the only noting method. There are also endnotes. Importantly, endnotes do not confuse text-to-speech software because endnotes do not live in variable-sized footers under a paper’s body text. Rather, endnotes live at the end of the paper. So, text-to-speech software can read a paper without being distracted by endnotes. Further, text-to-speech software can read endnoted PDF papers without users needing to manually crop out footers for every PDF paper that they read!

6.  Why This Matters: Accessibility

Lots of people need auditory reading. Obviously, some of these people are unsighted people who simply cannot read visually. Perhaps less obviously, lots of sighted people also need auditory reading. These people’s lives might involve a great deal of child-caring, commuting, errand-running, home repairing, etc. Such tasks make it nearly impossible to stare at pages of text for extended periods of time.

An Accessibility Argument

Given what I’ve said so far, we now have the basis for an accessibility argument for moving away from PDF documents that contain marginal text and footnotes toward their alternatives—other document formats, endnotes, or—wait for it—no notes. The accessibility argument could go something like this:

1. Publicly funded research should be more accessible, where possible.

2. Non-PDF documents without marginal text and with no notes (or endnotes) are more accessible than PDF documents with footnotes or marginal text.

3. It is possible to publish publicly funded research in non-PDF formats without marginal text or notes (or with endnotes).

Conclusion: Therefore, publicly funded research should be published in non-PDF formats without marginal text or notes (or with endnotes) rather than in PDF with marginal text or footnotes.

In my experience, most people will grant premise 1. I’ve given some reasons to accept premise 2. The aforementioned existence of non-PDF documents without marginal text or footnotes seems to establish premise 3. So, until it is clear that premise 2 is wrong, something like the above conclusion seems to follow.

7. Concerns

I’ve encountered a few reasonable concerns about this proposal. I imagine that these concerns will be shared by some of the people reading this. So, I will address some of these concerns here.

7.1  Actually, footnotes are easier

Some are resistant to abandoning footnotes. They appreciate how footnotes put footnoted information on the same page on which the footnote is referenced. Thus, footnotes are easier to reference than endnotes. (Thanks to Danielle M. Wenner for making a point like this.) In reality, footnotes are easier to reference only for visual reading. So, if everyone had equal opportunity to read visually, then this point would seem to undermine the accessibility argument against PDF, marginal text, and footnotes. However, it is not at all clear that there is equal opportunity for visual reading. Further, it is not obvious that the convenience of footnotes for visual reading outweighs the inconvenience of footnotes for auditory reading. (E.g., how do we weigh the convenience, or lack thereof, of visual reading and auditory reading? What are the base rates of visual and auditory reading, respectively?) So, one needs to do more than merely point out the contingent convenience of footnotes to defeat the aforementioned accessibility argument.

7.2  Why not just improve text-to-speech software?

As I mentioned at the outset, it’s 2019. Software should be able to handle the problems posed by PDF documents, marginal text, and footnotes. It is not the responsibility of writers, editors, and publishers to upend their practices because software engineers have yet to optimize auditory reading. (Thanks again to Danielle M. Wenner for making a point like this.) In reality, the software solution does not preclude the need for the solution I’ve proposed. After all, even if we develop text-to-speech software that overcomes the problems posed by PDF, marginal text, and footnotes, there is no guarantee that the software will be widely available to the average research consumer. So, I hope that both software-side and publishing-side solutions are attempted simultaneously.

7.3  This doesn’t solve larger accessibility problems

There are many (many!) problems with the accessibility of academic publications. For instance, tons of publicly funded research is neither accessible for auditory reading nor accessible for visual reading for the average person. Rather, it’s behind paywalls. So, even if we grant the problems posed by PDF, marginal text, and footnotes, it is not clear that we should prioritize them over the larger problem(s). (Thanks to Wesley Buckwalter for making a point like this.) This seems right. But let’s be careful here. As with some of the larger accessibility problems with academic publications, progress can be made at the individual level (e.g., writers and editors) as well as at the institutional level (e.g., journals, publishing companies, etc.). The existence of larger problems does not preclude the possibility of (or need for) making progress as individuals and institutions.

Nick Byrd

Nick Byrd studies the cognitive science of philosophy and the philosophy of cognitive science. He is particularly interested in how differences in reasoning (like intuition vs. reflection) relate to differences in judgment and decision-making. You can follow this work on the major social media platforms.

4 COMMENTS

  1. You are raising significant issues worth considering in this post. However, you’re not yet close to anything like a workable proposal. I’m sure there are many more issues at play here, but I’ll mention two:

    1) Archiving. PDF is one of the few formats around that is a plausible archival format. Ebook formats are veering towards variations on HTML but still have private and transient markup features. Accessibility in the general sense is also a temporal issue: no one is served if documents aren’t understood after a decade or two.

    2) Citation. Some authors don’t use sections, or don’t use them often. At present the most common convention by a wide margin is to cite by page number. There are some ad hoc conventions to work around this problem for specific formats, but as yet those have low adoption. Merely switching to other formats doesn’t solve this problem. And offering alternative formats does not address the problem of accessibility to citation.

  2. So I have a lot of personal experience with these problems. I am a philosophy graduate student with fairly severe dyslexia. As such, I need to have my computer or phone read almost all the philosophy I consume aloud to me, and the footnote issue is a very large problem. There are ways around it (I have long since bought expensive OCR software that I can use to process PDFs in a way that lets me somewhat quickly remove footnotes/other page markings), but they are tedious and I end up having to spend quite a bit of time making each document accessible.

    However, I think it’s worth noting that footnotes are not the only problem with auditory accessibility, and the solution Nick proposes (while it would be much nicer than the status quo) creates problems of its own.

    First, there are issues with in-text citations. If I’m listening to a document and come across the following: “While such theories are increasingly influential (Shanton and Goldman, 2010; De Brigard, 2014a; Michaelian, 2016b), the older causal theory of memory Martin and Deutscher (1966) remains popular (e.g., Bernecker, 2010; Klein, 2015; Cheng and Werning, 2016; Debus, Forthcoming),” it becomes extremely difficult to follow what is going on, even while reading along. For articles which have a lot of in-text citations, I will often use regular expressions to remove everything in parentheses, which obviously creates problems of its own.
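(A minimal sketch, for readers unfamiliar with the technique, of the kind of regular expression the commenter describes; the sample sentence is invented in the style of the quote above. Note that it blindly strips non-citation parentheticals too, which is the side effect the commenter mentions.)

```python
import re

# Invented sample sentence with parenthetical citations.
sentence = ("While such theories are increasingly influential "
            "(Shanton and Goldman, 2010; De Brigard, 2014a), "
            "the older causal theory remains popular (e.g., Bernecker, 2010).")

# Remove every parenthetical (and the space before it) before sending the
# text to a speech engine. Non-citation parentheticals are removed too.
stripped = re.sub(r"\s*\([^()]*\)", "", sentence)
```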

    Second, there are issues with symbols, formalism, paragraph structures, text emphasis, etc. which computers have a difficult time reading aloud. Computers often won’t distinguish a-subscript from a, so sentences which use tools like that to make contrasts are extraordinarily confusing to listen to. They will also have difficulty with Greek or Latin letters, quotes in other languages, and cases where the author is using paragraph breaks to clarify structure (such as when premises are only differentiated in list form and not by independent numbering). Computers also don’t emphasize things authors put in bold or italics, which often changes the meaning of a sentence, etc.

    Third, when I do go through and remove footnotes, it makes it difficult to then be confident I am getting everything I need from the text. As such, I often need to go back through and listen to just the footnotes, which requires switching between documents as I listen.

    If switching to endnotes is all the philosophical community can do, I’ll take it, don’t get me wrong. But, in my experience, there are two proposals which I think would do a lot more for helping accessibility.

    The ideal situation would be if journals provided downloadable audio recordings along with journal articles (or if authors had recordings on personal websites, or if journals had podcasts with the articles you could subscribe to, etc.). Listening to an author read their own paper would be hugely helpful. Authors would know what content to skip (such as in-text citations) and what content not to skip. They would be able to say things like “as an aside” to indicate when they are actually going to read a footnote, put the aside at the end of the sentence the footnote applies to rather than at the end of the page, and use pauses and emphasis in helpful ways, etc.
    I would MUCH rather listen to a poor-quality recording of an author reading their own paper (even if in the middle of the recording I hear the author get tongue-tied and need to restart a sentence) than listen to my computer read the paper aloud to me.

    I don’t actually know how reasonable that would be. I’m inclined to think the amount of time it would take to make a quick recording of an article you have written would be a marginal increase in the time it takes to write and edit a paper, and departments might even be able to find money to pay grad students to do recordings if it were burdensome. But, I also recognize this would require quite a bit more total work for every article published, and it would not actually be helpful for all that many people in philosophy.

    The less ambitious solution, which I would still find wonderfully helpful, is if we could demarcate content that is not to be read in a systematic way. If a journal had all in-text citations, footnotes that need not be read, etc. demarcated (so suppose in-text citations look like /s/(…)/e/ and footnotes look like /s/1…/e/), then it would be very easy to use regular expressions to tell my computer to skip any text between /s/ and /e/. Obviously we could have two versions of the paper online: one with footnotes and without weird markings, and one designed for having your computer read aloud. I would not think that this change would require very much work at all, but even that would help a lot.
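(A minimal sketch of how such demarcation could be skipped, using the commenter’s hypothetical /s/…/e/ markers; the marked-up sample text is invented.)

```python
import re

# Invented text marked up with the hypothetical /s/.../e/ convention.
marked = ("The argument is plausible./s/1. Thanks to so-and-so for "
          "helpful discussion./e/ Now consider an objection."
          "/s/(Smith, 2010)/e/ ...")

# Drop everything between /s/ and /e/ (non-greedy, across lines) before
# sending the text to the speech engine.
spoken = re.sub(r"/s/.*?/e/", "", marked, flags=re.DOTALL)
```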

  3. I recently noticed that Taylor and Francis had added a ‘Listen’ button at the top of each article in their journals (on the HTML version of the page). After reading this post, I decided to test it out on one of my recent articles (https://www.tandfonline.com/doi/full/10.1080/09608788.2018.1509294). It turns out that the thing just stupidly reads the full content of the page (including navigation links, etc.) and interrupts the main text to read the footnotes. Given that the text-to-speech is processing an HTML version of the page, it ought to be really easy for them to just read out the main body text, but for some reason they haven’t done this.

    I have to admit, this problem had not occurred to me before.
