Dear industry… how to help grad students learn


Saving the environment is ‘in’ these days, so I’ve been trying to make an effort to avoid unnecessary printing by reading more papers on my computer. I suspect that PDFs are becoming the standard for reading papers online, and I’ve been really impressed by the ‘hypertex’ features that allow hyperlinks within the document and to the arXiv. Such links are built into some LaTeX document classes (such as JHEP), requiring nearly no extra effort from the author. The resulting PDF is pleasantly clickable and makes it easy to pull up references; links even open in a new tab automatically if you’re already viewing the PDF in a browser.

However, as convenient as these features are, they don’t quite overcome some of the inherent inconveniences of reading on a computer versus reading a printout. Below are a few ideas about minor updates that could make a big difference for “paper-less paper-reading.” I seriously doubt that I’m the only person to think these would be neat features, but I’ve learned that there’s no harm in asking. (You know the reference distances that are now at the bottom left of Google Maps? I e-mailed Google about that a couple of years ago.) Anyway, without further ado, some neat features that would encourage people to read papers online:

1. Commenting/highlighting. (Client-side software, pdf standards.) By far the best reason to print out papers is to be able to scribble notes on them. In an ideal scenario, pdf readers would also allow users to insert collapsible notes (TeX/hyperlink enabled, of course) and highlighter marks. This could be useful, for example, in filling out steps of a calculation, linking to other papers/websites, or inserting one’s own thoughts. One could save the layer of comments in a separate text file so that readers can share comments easily via e-mail: one could then open up multiple layers of comments on a single paper. Commenting and highlighting are available in the paid version of Adobe Acrobat, and similar features for multiple-user editing are available in major word processing programs.
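To make the “layers of comments” idea a little more concrete, here is a rough sketch of what a shareable comment layer might look like. Everything in it is an assumption for illustration only: the `Note` fields, the JSON format, and the function names are made up, and no existing pdf reader stores its annotations this way.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Note:
    """One hypothetical margin note attached to a spot in a paper."""
    page: int
    anchor: str   # what the note points at, e.g. "eq. (2.16)"
    author: str
    text: str     # free text; could contain TeX or a URL


def save_layer(notes):
    """Serialize one reader's notes to plain text that can be e-mailed around."""
    return json.dumps([asdict(n) for n in notes], indent=2)


def load_layer(text):
    """Rebuild a list of notes from the text produced by save_layer."""
    return [Note(**d) for d in json.loads(text)]


def merge_layers(*layers):
    """Overlay several layers on one paper, ordered by page.

    sorted() is stable, so notes on the same page keep the order
    in which their layers were passed in.
    """
    merged = [note for layer in layers for note in layer]
    return sorted(merged, key=lambda n: n.page)


mine = [Note(3, "eq. (2.16)", "me", r"missing step: use $\partial_\mu j^\mu = 0$")]
theirs = [
    Note(1, "abstract", "officemate", "compare with ref. [23]"),
    Note(3, "eq. (2.16)", "officemate", "is the sign right?"),
]
for note in merge_layers(mine, load_layer(save_layer(theirs))):
    print(note.page, note.author, note.text)
```

Keeping each layer as its own small text file is what would make sharing easy: the pdf itself never changes, and a reader could switch individual layers on and off.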

2. Mouse Hover References. (Client-side software, TeX/pdf standards.) ‘Hypertex’-links are nice, allowing readers to click on references to equation numbers or citations to redirect Acrobat Reader to a different part of the document or to open up a new browser window to the arXiv. The problem? You lose your place! On a paper printout, one can easily flip back and forth between different pages to keep up with equation number (2.16) or reference number [23]. In Acrobat, it’s a pain to click on such a link only to be unable to find your way back. The solution? Snap-style “previews” that appear when your mouse hovers over a link without clicking. A few examples:

Moving one’s mouse over a link brings up bibliographic data.

Moving one’s mouse over a previous equation number will bring up the equation. No need to awkwardly go back and forth looking for that page and then trying to find your way back.

Implementation for this is twofold: (1) the pdf standard must be updated to accommodate mouse-hover pop-ups, and (2) ‘hypertex’ document classes must be revised to include this feature. The upside: this requires no additional work by manuscript authors. TeX files would be exactly the same, and old files could just be recompiled to yield the spiffy new pdfs.

3. Abstracts. (BibTeX styles.) Most TeX document classes have an eye for publication, resulting in bibliographies which give just enough information for a student to find the document online. Often this search is trivial, but it still requires a reader to divert attention from the actual document being read. That is to say, if one is doing a literature review on a subject, one has to shift back and forth between a document and the arXiv to keep a running list of relevant references. It sounds like a minor chore, but this sort of thing becomes annoying rather quickly when one loses the original paper one was reading amidst several tabbed browser windows of arXiv abstracts.

The solution? Add more information to the original file. At the very least, bibliographic styles could include paper titles instead of just journal references. This can be done almost trivially in a document style sheet. Adding a bit more, one could also include full abstracts; this would couple well with the ‘mouse hover’ references mentioned above. For users connected to the Internet, an ‘intelligent’ Acrobat reader could pull abstracts off the arXiv and insert them ‘dynamically’ into a pdf when one hovers over a reference.
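As a rough illustration of the ‘dynamic’ lookup, here is a minimal sketch of how a reader application might turn an arXiv identifier into a title and abstract. It assumes the arXiv’s public query interface at export.arxiv.org, which returns an Atom feed; the function names are hypothetical, and the sample feed is a made-up placeholder rather than a real paper, so treat this as a sketch rather than a tested client.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # XML namespace used by Atom feeds


def parse_abstract(atom_xml):
    """Return (title, abstract) from the first <entry> of an Atom feed, or None."""
    entry = ET.fromstring(atom_xml).find(ATOM + "entry")
    if entry is None:
        return None
    # Collapse line breaks and runs of spaces so the text fits in a pop-up.
    title = " ".join(entry.findtext(ATOM + "title", "").split())
    abstract = " ".join(entry.findtext(ATOM + "summary", "").split())
    return title, abstract


def fetch_abstract(arxiv_id):
    """Look up one paper online (assumed endpoint; needs a network connection)."""
    query = urllib.parse.urlencode({"id_list": arxiv_id})
    url = "http://export.arxiv.org/api/query?" + query
    with urllib.request.urlopen(url, timeout=10) as response:
        return parse_abstract(response.read())


# Made-up feed for offline illustration; not a real arXiv record.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>A placeholder
      title</title>
    <summary>  A placeholder abstract,
      wrapped over two lines. </summary>
  </entry>
</feed>"""

print(parse_abstract(SAMPLE))
```

A reader program would call something like `fetch_abstract` only when the mouse hovers over a citation, and could cache the result so that the paper stays readable offline.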

4. Tabbed Browsing. (Client-side software.) Okay, maybe I’ve gone a bit off the deep end suggesting changes to Acrobat that are very research-paper specific. Here’s an alternate solution to the problem of being navigated off a page every time one clicks on a pdf hyper(tex)-link: tabbed browsing. What is now the standard for Internet browsers can be implemented in pdf readers. Sure, clicking on an external link in a pdf will open up a new browser tab. What about internal links? This would be akin to the ‘split window’ feature that used to be more popular in word processing programs. Let users keep their place by having internal links open up in a new tab or new sub-window where they can reference something and easily return to the exact point of the document they departed from.

5. Fancy stuff. (Embedded applets.) I must also note that Professor Peskin at Stanford has some very neat ideas about packaging Java applets with eprints as ‘active figures.’ These active figures would allow readers to input different parameters into relevant plots associated with the paper.

6. Killer app (look it up). (Hardware.) An orthogonal suggestion… pdf readers. The other big issue with reading pdfs rather than printouts is the fact that one needs to be at one’s computer. “eBook” readers have been out for some time now, but have not really found a significant market. Most are rather pricey and otherwise uninteresting because of limited options for inserting one’s own notes. There is some hope for pocket-sized (or at least book-sized) pdf readers. Some PDAs (e.g. Palm Pilot) can read pdf files and allow users to scribble in thoughts, and in principle it shouldn’t be difficult for a Wi-Fi enabled device to allow some of the features mentioned above. Of note, I may never be cool (or uncool) enough to ever own a PDA.

7. Text-to-Speech. (Software.) I am, however, now cool enough to own an iPod Shuffle. There aren’t many grad-level podcasts available out there, but it may be worth considering an ePrint-to-speech program that generates podcasts out of the arXiv. Yes, I’m sure you’ve had the experience of the professor who seemed to ‘read straight out of the book/notes’ … but maybe audio lectures generated out of papers by a synthetic speech program could be useful. It will probably never replace actually sitting down with papers and sifting through the details, but in the aim of allowing grad students to ‘work’ even when they’re not working, it might be a welcome addition. Imagine being able to listen to the day’s new hep-ph abstracts as you take your morning jog. As an added bonus, because iPods are cool and hip, people will automatically think you’re cool and hip even though you’re listening to something geeky.

(This is like the story of the physics student who would sit in the quad with a physics textbook. In order to avoid looking like a geek, he keeps a naughty magazine, slightly larger than the textbook, inside the book so people will think the physics book is just a facade. In fact, inside the naughty magazine is another physics book, from which he is actually reading.)

Anyway, AT&T (or Cingular, or whatever they’re called now) seems to have developed text-to-speech technology significantly. The only question would be equations-to-speech (“TeX-to-speech”), which is tricky even when it is an experienced lecturer doing the conversion.

I should close by noting that all of the above is somewhat playful and speculative. I’m not sure how many people actually read papers online versus on paper, but with the volume of research papers appearing on a daily basis it may be nice to start making provisions for at least a partial shift from the latter to the former.


4 Responses to “Dear industry… how to help grad students learn”

  1. Clearly one of the main benefits of paper printouts, as you say, is that you can write on them. There are, however, a few programs which allow you to write on PDFs. I haven’t played with them for a couple of years, but there were a few that allowed for the addition of sort of post-it notes on top of a normal PDF. An alternative is simply to download the LaTeX file and change it yourself.

    I’ve been playing around recently with file tagging programs which allow you to add metadata to all files. I’ll be writing about this in some detail soon, but the files-within-folders system is fast losing its appeal to me as the number of PDFs on my computer becomes more and more unwieldy. A system whereby you can tag by author, title, subject and keywords on your own computer is very useful, but if this were the norm on the arXiv for submitting files I think that would make life that much easier.


  2. 2 Sujit

    You know, I really like your 2nd point (mouse hover references). Hopefully someone will implement it, because it’s a great idea.

  3. 3 David

    Most of your ideas are very good, but Adobe does not have a very good record when it comes to PDAs. The version of the Acrobat Reader for the Palm is old (2003) and lousy. Apart from pagination problems, it cannot cope with equations, as I found out when I purchased a thermodynamics textbook in pdf format specifically to read on my Palm.

  1. 1 If Amazon ran the arXiv « An American Physics Student in England
