First of all, I wanted to take a moment to say that this is the one-year anniversary of this blog. I’ve been posting every week, (almost always) on Friday, since I first was motivated to start blogging back in November 2012. It’s been a fun ride, through ups and downs, Ars Technica and Amplituhedra, and I hope it’s been fun for you, the reader, as well!
The word arXiv is pronounced much like the ordinary word archive; just think of the capital X as the Greek letter chi.
Much as the name would suggest, arXiv is an archive, specifically a preprint archive. A preprint is in a sense a paper before it becomes a paper; more accurately, it is a scientific paper that has not yet been published in a journal. In the past, such preprints would be kept by individual universities, or passed between interested individuals. Now arXiv, for an increasing range of fields (first physics and mathematics, now also computer science, quantitative biology, quantitative finance, and statistics), puts all of the preprints in one easily accessible, free-to-access place.
Different fields have different conventions when it comes to using arXiv. As a theoretical physicist, I can only really speak to how we use the system.
When theoretical physicists write a paper, it is often not immediately clear which journal we should submit it to. Different journals have different standards, and a paper that will gather more interest can be published in a more prestigious journal. In order to gauge how much interest a paper will raise, most theoretical physicists will put their papers up on arXiv as preprints first, letting them sit there for a few months to drum up attention and get feedback before formally submitting the paper to a journal.
The arXiv isn’t just for preprints, though. Once a paper is published in a journal, a copy of the paper remains on arXiv. Often, the copy on arXiv will be updated when the paper is, reformatted to the journal’s preferred style and labeled with the correct journal reference. So arXiv, ultimately, contains almost all of the papers published in theoretical physics in the last decade or two, all free to read.
But it’s not just papers! The digital format of arXiv makes it much easier to post other files alongside a paper, so many people upload not just their results but also the computer code they used to generate them, or their raw data in long files. You can also post papers too long or unwieldy to publish in a journal, making arXiv an excellent dropping-off point for information in whatever format you think is best.
We stand at the edge of a new age of freely accessible science. As more and more disciplines start to use arXiv and similar services, we’ll have more flexibility to get more information to more people, while still keeping the advantage of peer review for publication in actual journals. It’s going to be very interesting to see where things go from here.