by Doctor Science
Last Tuesday, information activist Aaron Swartz was arrested for downloading almost the entire contents of JSTOR. This arrest is prompting a lot of discussion about the massive fail that is current academic publishing.
JSTOR is a non-profit academic archive, holding journals for both non-profit and for-profit publishers in a one-stop shop. Libraries and other institutions pay a start-up and then a yearly fee for access: these fees are on the order of $1000-$10K or more per year. Once an institution has paid the fee, users there can access JSTOR. There are no individual subscriptions. JSTOR waives or reduces its fees for institutions in developing countries, including the entire continent of Africa.
Presumably, Swartz intended to put the JSTOR archive up on some kind of free online server. After he was caught, he gave the hard drives with the downloaded material to JSTOR, and they decided not to pursue the matter further. Swartz's arrest and charges come from the US Attorney's office, which has decided to throw the book at him.
The NY Times notes:
Mr. Swartz recently completed a 10-month fellowship at the Edmond J. Safra Center for Ethics at Harvard. "Aaron has never done anything in this context for personal gain -- this isn't a hacking case, in the sense of someone trying to steal credit cards," said Lawrence Lessig, the center's director. "That's something JSTOR saw, and the government obviously didn't."
The fact that the new US Solicitor General is a former RIAA lawyer, one of 5 appointed to the Obama Justice Dept., IMHO is *not* coincidental:
The "messy debate" Henry talks about is because the state of academic publishing in the fields of science, technology and medicine is *profoundly* messed up.
I wonder just how long the academic journal model can continue in its current form.
There is a lot of messy debate here that is ready to explode. I wouldn't be surprised if one of the reasons that JSTOR didn't want to go ahead with this was precisely because it feared having a cause celebre, with an articulate, intellectually attractive and selfless defendant, explode out of it. And I confidently predict that there is going to be one very unhappy prosecutor who has no idea of the major political shitstorm that she is kicking up by doing this. Aaron is very rightly beloved by a whole lot of people – he's spent the last several years providing unpaid help for a variety of good causes.
If you're an author of a paper in STM who wants to publish in a peer-reviewed setting, these are your basic choices:
- A for-profit publisher
  - won't charge you money
  - will charge online readers, possibly reducing your paper's impact -- and impact factor, for all its many many faults, is the Holy Grail for tenure-track academics.
  - will expect you to sign over your copyright to "ensure the widest possible dissemination of information" (Cell, e.g.). Partly this means it's much easier for the paper to be reprinted for course packets or compendia (author copyright is a royal pain compared to publisher copyright), but also it means the publisher is more likely to stay in business and keep your paper safe.
- A not-for-profit, JSTOR-access publisher
  - will charge you or your institution a publication fee (hundreds of dollars or more)
  - will charge readers for access
  - won't acquire copyright, so you'll have to deal with course packet requests
- An "open access" publisher
  - will charge you or your institution a whopping publication fee (hundreds to thousands of dollars)
  - will not charge readers for access, so distribution will be as wide as possible
  - won't acquire copyright
Convergence (2000-2001), by American artist Benjamin Edwards.
Right now in academic publishing, what you have is basically a lot of donor- and government-financed nonprofit organizations taking outputs with near-zero distribution costs (electronic journal archives) and selling them to each other. For any one institution, this kind of makes sense. A publisher doesn't want to give up his fees, which are valuable in meeting the costs of producing scholarship. But on net, it's a mix of pointless and pernicious. Sale of access to journals helps finance scholarship, but it also raises the cost of scholarship. If everything was distributed for free, the whole exact same enterprise could be undertaken with no net financial loss. But there would be huge potential gains. A precocious 17 year-old could have free access to scholarship. So could a researcher living and working in a poor country. Or even an earnest political reporter who's working on an issue and curious about what political science has to say about it. When I, personally, come across an article I'd like to read but can't get free access to, my standard practice is to tweet about it and then someone affiliated with a university sends it to me. That's good for me and, I think, good for the world. But there's no reason curious people should need to amass thousands of twitter followers before they're able to gain access to information that's been produced by non-profit institutions that are supposed to be serving the public interest.
I knew a lot about this already, of course, but I hadn't realized how egregious the publication fees for online-only journals are.
For instance, the flagship open-access scientific enterprise is the Public Library of Science, PLOS, especially its rapid-publication journal, PLoS One. PLOS publication fees are currently on the order of $1300-$3000 per article -- though they will waive them for authors who aren't associated with a wealthy institution. These charges are not out of line for the field, either.
I was taken aback and even shocked by these costs. In the first place, I couldn't help noticing that such fees -- though usual and customary for STM journals -- somehow are not so necessary for journals in the humanities and many social sciences. For example, most open-access Genetics journals charge a fee, but few OA Linguistics journals do.
I'm also very surprised at these fees because I am intimately familiar with a completely different, *very* open-source publishing initiative, The Organization for Transformative Works. The OTW is also a 501(c), and runs Archive of Our Own (AO3), which Alexa.com says gets just about the same traffic as plosone.org.
The OTW owns our servers, and the cost of running them is around $10,000 per year -- technically-skilled volunteers provide the labor. We also have a peer-reviewed journal that appears twice a year.
To be honest, I have *no* idea why online scientific publication should be so heinously expensive, when scholars in the humanities -- or a well-organized collection of fangirls -- can do it for so much less. Phillip Lord suggests that:
The reason that PLoS one costs so much is because, despite being an innovative idea, they have used an old fashioned publication process -- they still page layout articles, even though a web browser can do the same job for free.
One scientist wondered (pers. comm.):
Could this be an opportunistic phenomenon running wild, one caused by the fact that so much science and engineering research draws funds from external sources?
This explanation seems intuitively plausible to me, given that PLOS, for instance, refers to costs "including those of peer review, of journal production, and of online hosting and archiving" -- when the work of peer review is done for free, online journal production *should* be largely done by the software, and hosting and archiving even PLOS *should* be costing on the order of tens of thousands of dollars per year. Not to mention that prices keep going up even as volume goes up -- as though economies of scale aren't operating, and neither is Moore's Law.
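To put rough numbers on the hosting argument, here is a back-of-envelope sketch in Python. It uses only figures quoted in this post: the OTW's roughly $10,000 per year server bill and the $1300-$3000 PLOS fee range. Treating the OTW's bill as a stand-in for a journal's hosting cost is an assumption for illustration, not a PLOS number, and the function name `articles_to_cover` is made up for this example.

```python
import math

# Figures quoted in this post. Using the OTW's server bill as a proxy
# for a journal's hosting cost is an assumption, not a PLOS figure.
HOSTING_COST_PER_YEAR = 10_000      # OTW servers, dollars per year
FEE_LOW, FEE_HIGH = 1_300, 3_000    # PLOS fee range, dollars per article


def articles_to_cover(cost: int, fee: int) -> int:
    """Smallest number of article fees whose total is at least `cost`."""
    return math.ceil(cost / fee)


if __name__ == "__main__":
    for fee in (FEE_LOW, FEE_HIGH):
        n = articles_to_cover(HOSTING_COST_PER_YEAR, fee)
        print(f"At ${fee} per article, {n} fees cover ${HOSTING_COST_PER_YEAR} of hosting")
```

Under that assumption, somewhere between four and eight article fees would pay for a year of servers, which suggests that hosting alone is unlikely to be what the rest of the fees are paying for.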
The *only* truly large cost I can think of that might be legitimately driving STM journals' costs so high is herding peer-review cats. As of late 2010, PLoS One had over 35,000 peer reviewers available. Matching up the peer reviewers to the articles, making sure they can do the work, chivvying them (nicely!) until it's done -- I can see how this would take a full-time editor if the paper is to get published within 4 to 6 weeks. Note, though, that the Academic and Section Editors donate their time and expertise, as do the peer reviewers, and that PLOS One does not copy-edit. The pro editors seem to be cat-herders, not text-wranglers -- and cat-herding (where the cats all have professorial egos, but I repeat myself) will naturally pay better than wrangling mere text, which any liberal arts major could do.
I also wonder if PLOS and the other OA journals can't have cheaper staff because they're competing for employees with for-profit STM publishers. For-profit STM publishing is *extremely* profitable -- I've seen estimates of 15-35% net profits. The cost of for-profit journals has increased much faster than that of non-profits, all of which means that for-profit publishers should be able to pay their own cat-herders and other editors extremely well, which drives up everyone's estimates of the usual and customary costs of cat-herding.
To get back to JSTOR, where we began. One of the things I find truly reprehensible about JSTOR is that they charge for online access to material they hold that is in the public domain.
For instance, JSTOR has digitized and put online the complete run of the Philosophical Transactions of the Royal Society, all the way back to 1665. Great! But if you want to read, say, "On the Air-Engine", by J.P. Joule (1852), JSTOR's version is behind the paywall. This particular article is available on archive.org, but this is unusual -- in most cases, if you want to read an unquestionably public-domain article like this one, you have to pay.
Painting of the Royal Society, on display *in* the Royal Society. I can't find the name of the artist.
I'm not the only one who's been annoyed by this, and after Swartz's arrest almost 19,000 public-domain articles from the Phil. Trans. were posted on piratebay. The uploader, "Greg Maxwell", said:
I've had these files for a long time, but I've been afraid that if I published them I would be subject to unjust legal harassment by those who profit from controlling access to these works.
Since the vast majority of all scientific publication is still under copyright, I think publishers (and JSTOR) should throw in digitization and hosting of public-domain material for free -- they can call it a public service if they like. Very few journals have runs going back before the mid-1920s, so it shouldn't be a big deal to digitize the few that do, and fold it into the cost of the greater project. And it would make them look good, and thorough, and like people who are caretakers of knowledge -- not just out to make a buck.
I now feel that I've been making the wrong decision.
Those with the most power to change the system--the long-tenured luminary scholars whose works give legitimacy and prestige to the journals, rather than the other way around--are the least impacted by its failures. They are supported by institutions who invisibly provide access to all of the resources they need. And as the journals depend on them, they may ask for alterations to the standard contract without risking their career on the loss of a publication offer. Many don't even realize the extent to which academic work is inaccessible to the general public, nor do they realize what sort of work is being done outside universities that would benefit by it.
UPDATE: I posted the above while pushing my brain through a swamp of phlegm, courtesy of what happens to my sinuses when I go from 100 degree weather to air conditioning and back again multiple times a day. Here are some additional squeezings:
I was eliding over a couple of steps. Most for-profit publishers and many non-profit (but important) publishers (learned societies, especially) have their own websites with their own paywalls. JSTOR is merely the largest of the one-stop shop sites, where all kinds of publishers can put up their content without the bother of maintaining an archive and paywall.
To illustrate, here's the result of a Google Scholar search for a randomly-chosen topic, fish evolution:
Some papers are only available behind one or another paywall; some are behind a publisher's paywall, but also on the author's academic website (generally with 6 months to a year delay after publication); some are at JSTOR; etc. If it weren't for Google Scholar putting the free access links out in a separate column, we'd all run mad, mad! I tell you.
You'll also note that the US National Institutes of Health, some other government funders, and some major-league private funders (such as the UK's Wellcome Trust) now require (instead of just asking nicely, as they used to) that papers they've paid for go into a public OA archive. In the US this is PubMed Central. Again, there's a delay of 6 or 12 months during which the publisher gets exclusive distribution rights.