If Digg (or Reddit) ran the arXiv
Bee, a stalwart for arXiv evolution, left a nice comment on a previous post. Since that post generated a blip of interest, I thought I’d share some of my [less flippant] thoughts on the future of the arXiv. This will be the first of three posts on the topic. (And then I have some posts on actual physics I wanted to get to.)
Digg doesn’t run the arXiv. (Cornell does.) But as one of the main Web 2.0 link aggregation hubs, it offers an important lesson regarding discussions of how to augment the arXiv.
The arXiv is delicate
A point that has been brought up a few times is that we can improve the arXiv by allowing comments and ratings for e-prints. Before we even get into any pragmatic discussions, there’s a big issue that needs to be made explicit:
The arXiv is a scientific resource and must be manifestly unbiased.
This is important. This is part of the Baconian underpinnings of the scientific method. It’s like a Hippocratic oath for scientists. Just about every physicist uses the arXiv: any `political’ bias would have serious moral and practical repercussions.
So however we propose to improve the arXiv, we absolutely must respect its impartiality.
The danger of introducing ratings and comments to the arXiv is that you open up this system to people who have agendas: self-promoting authors who must publish-or-perish, nepotist reviewers, crackpots, or even well-meaning-but-ill-informed scientists. The big question, then, is how to implement a system that takes [biased] user input to distinguish between papers without introducing bias into the system.
How Digg Works (heuristic)
- A user finds a neat site and recommends it under one of Digg’s categories
- Other users see the link and either digg it (rate it positively) or bury it (rate it negatively). They can offer comments in their judgements.
- After a critical mass of positive reviews, links appear on the main page (I assume this is also some function of rate-of-positive-reviews)
- Users are also able to digg/bury comments, so that irrelevant comments/judgements are weighted less (and are also banished from the comments page)
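The heuristic above can be sketched in a few lines of Python. This is only a toy model of the steps as I've described them, not Digg's actual algorithm (which I don't know): the names `Link`, `Comment`, `FRONT_PAGE_THRESHOLD` and `HIDE_COMMENT_BELOW` are made up for illustration, and the thresholds are arbitrary.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds, chosen only for illustration.
FRONT_PAGE_THRESHOLD = 5   # net positive votes needed to reach the main page
HIDE_COMMENT_BELOW = -2    # comments scoring below this are hidden


@dataclass
class Comment:
    text: str
    score: int = 0  # diggs minus buries on the comment itself


@dataclass
class Link:
    url: str
    category: str
    diggs: int = 0
    buries: int = 0
    comments: list = field(default_factory=list)

    def net_score(self) -> int:
        """Positive ratings minus negative ratings."""
        return self.diggs - self.buries

    def on_front_page(self) -> bool:
        """A link is promoted once it reaches a critical mass of net votes."""
        return self.net_score() >= FRONT_PAGE_THRESHOLD

    def visible_comments(self) -> list:
        """Heavily buried comments are dropped from the comments page."""
        return [c for c in self.comments if c.score >= HIDE_COMMENT_BELOW]
```

A more faithful model would presumably also track the rate of positive votes, as guessed above; this sketch only counts totals.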
Who critiques the critics?
So here’s the take-home lesson from Digg: rate the raters. (cf. Quis custodiet ipsos custodes? or something like that.) If you open up the arXiv to comments and ratings, then empower the community to keep it fair. By allowing users to vote down poor comments (and commenters) while voting up those which make reasonable points, a sufficiently active community of researchers can minimize the effect of crackpots and self-promoters.
This is somewhat analogous to a system whose quantum corrections (effective action) protect it from spontaneous symmetry breaking. Users are biased and will introduce their own bias in their individual assessments (some more than others), but the community as a whole counteracts this by pushing for balanced and fair assessment.
This is non-trivial. The moment you set up rules (and you’ll need a lot of them), you have to assume people will try to get around them. You will need user authentication, a weighting system to determine which users have the largest influence (perhaps correlated with publication record), and all sorts of checks for people who display an `agenda.’ This should also be completely automated, since the point was to be independent of small sets of potentially-biased users.
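To make the `rate the raters’ idea concrete, here is a minimal sketch of one possible weighting scheme, assuming each user carries a non-negative reputation weight that the community nudges up or down by voting on that user's comments. The function names, the linear update rule and the `step` size are all my own assumptions; a real system would need far more safeguards than this.

```python
def weighted_rating(votes, reputation, default_weight=1.0):
    """Reputation-weighted average of +1/-1 votes on a paper.

    votes: list of (user, value) pairs, with value +1 or -1.
    reputation: dict mapping user -> non-negative weight (hypothetical).
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for user, value in votes:
        weight = max(reputation.get(user, default_weight), 0.0)
        total_weight += weight
        weighted_sum += weight * value
    return weighted_sum / total_weight if total_weight > 0 else 0.0


def update_reputation(reputation, commenter, comment_votes,
                      step=0.1, default_weight=1.0):
    """Shift a commenter's weight by the net community vote on their comment.

    The weight is clamped at zero, so a persistently buried commenter
    ends up with no influence rather than negative influence.
    """
    net = sum(comment_votes)
    new_weight = reputation.get(commenter, default_weight) + step * net
    reputation[commenter] = max(new_weight, 0.0)
    return reputation[commenter]
```

With weights like these, a user whose comments are repeatedly voted down contributes less and less to a paper's rating, which is the kind of self-correction the community is supposed to provide.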
On the one hand, that’s a pain. I’m not even sure if there exists an acceptable set of rules that would generate a working system. On the other hand, it is—at least in theory—a step towards what we wanted: a meritocratic system that identifies hot papers while they’re still somewhat cutting edge.
Forced Democracy (`vote or die’)
The first major observation is that a critical mass of users is prerequisite for any kind of system like this. This is why Scirate does not work: most papers don’t get comments, and those that do usually only get a statistically insignificant handful.
Let’s think about the numbers involved. I estimated in a previous post that the hep-ph community is about 1000 researchers, with approximately one third of them publishing a paper per month and the rest effectively not publishing papers. This means every month there are around 333 papers that should get some sort of feedback, even if it’s just a yay/nay vote. If we say that, on the average, we need 12 votes per paper to have a meaningful ranking (allowing controversial/hot papers to get more and vanilla papers to get fewer), then that is roughly 4000 votes per month spread over 1000 researchers, which means that each researcher will have to read and vote on about one paper per week.
That’s not bad. But that’s assuming 100% participation of the entire hep-ph community. If you take a more realistic view of only 50% participation, this doubles to two papers per week. We’ve also been assuming participation every week. If we assume that with teaching and conferences, the average researcher is actively checking the arXiv only 26 out of the 52 weeks every year, then each researcher has to read and vote on four papers per week. Now this is turning into a journal club. That’s a lot of responsibility!
This is a very rough estimate, but the main point is that this level of participation is necessary for the system to work. Otherwise, the system isn’t robust enough to protect itself against opportunistic crackpots and self-promotion.
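For anyone who wants to play with the assumptions, the back-of-the-envelope estimate above can be reproduced with a short Python function. The default numbers are the ones quoted in this post; the function name and parameter names are just my own labels.

```python
def papers_per_active_week(researchers=1000,
                           publishing_fraction=1 / 3,
                           votes_per_paper=12,
                           participation=1.0,
                           active_weeks_per_year=52):
    """Papers each participating researcher must vote on per active week."""
    papers_per_month = researchers * publishing_fraction
    votes_needed_per_year = papers_per_month * 12 * votes_per_paper
    voters = researchers * participation
    return votes_needed_per_year / (voters * active_weeks_per_year)


# Full participation, every week of the year: roughly one paper per week.
full = papers_per_active_week()

# Only half the community takes part: roughly two papers per week.
half = papers_per_active_week(participation=0.5)

# Half the community, each active only 26 weeks a year: roughly four.
realistic = papers_per_active_week(participation=0.5, active_weeks_per_year=26)
```

The load scales inversely with both the participation fraction and the number of active weeks, which is why the estimate climbs so quickly.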
In the parlance of renormalization group flow (following Bee), this level of participation is a critical point. If the system is tuned to less participation, then the `best system to adopt’ is to have no comments. If the system is tuned to a higher level of participation, then the `best system to adopt’ is a Digg-model.
A second observation about Digg: it helps that one can categorize one’s submission to a predefined category. (See my mock-up illustration above or below.) Instead of just hep-ph, why not specify hep-ph: model-building? Or why not go further and specify SUSY, leptogenesis, etc. etc.?
One can go even further and evolve beyond categories to incorporate dynamic tagging, one of the quiet revolutionary features of the Web 2.0. But this will be reserved for a future post where I can draw on better examples.
What’s really nice about categories is that one can form a separate category for review articles. This way people looking for pedagogical literature can have a separate section where ranking is based primarily on the pedagogy rather than the citations. (This is part of my campaign to make life a little easier for grad students.)
A third observation: whether or not a link is highly rated on Digg is only sensitive to large timescales. Digg is like the New York Times bestselling book list. It’s unlikely that a top book will still be around one year later when everyone’s already read it, but a #1 book will stick in the top 10 longer than a #25 book will stick in the top 50.
I’m not sure if this `timelessness’ is a positive or a negative feature of the Digg-system. On the one hand, it tells you which papers are hot. It does this in a way that’s still after-the-fact (i.e. you’ve missed the gravy train because everyone else is already working on it). However, one would at least be able to identify a hot paper before it became highly-cited on SPIRES since one would be judging `hotness’ based on comments rather than citations.
Peer-review?
A fourth comment: I’ve deliberately said nothing about the extent to which these comments constitute peer review. Comments could include questions, errata, complaints about citations (in which case they are just like peer review); but at no point did we make any requirements on the specific content. That judgement is left to the community at large.
However, peer review is based on a specialized community judging your work. You don’t want a graduate student working in braneworld theories to `peer review’ your paper on experimental constraints on flavour physics. You might still want his or her comment, which may turn out to be very valuable, but he or she is not an expert qualified to judge your paper. Thus one has to be careful about the extent to which a Digg-like `democratic’ system can be considered `peer review.’
An aside: Remember guestbooks? I grew up back when people were figuring out how to make web pages from HTML. Back then the `cool’ thing to do was to add a CGI script to one’s homepage to include a guestbook where visitors can comment on one’s wonderful animated-gifs and tacky midi background music.
Well, guestbooks went out of style with flannel in the early 90s, when both the web and I were young. 🙂
What I wonder is whether a Digg-style commenting system would also go out of style. The danger is that if people stop using it, there has to be a phase transition (see above) where the entire system suddenly becomes worthless to everyone.
Issues and caveats
- Such an arXiv would require users to sign-up. Thus it automatically becomes a pain-in-the-butt.
- The precise formulae used to determine a commenter’s reputability and a paper’s `hotness’ are subjective.
- Responsible physicists check every item of the relevant arXiv categories. A Digg-style ranking system should not replace this habit, but instead augment it.
- Such a system is still not immune to trends and fads.
A larger mock-up, just for fun.