Category Archives: Life as a Physicist

Should You Read What You Cite? That Depends

When arXiv announced it would ban people for hallucinated citations, that is, citations of papers that don’t exist, the discussion online got sidetracked by the question of whether academics actually read the papers they cite. Some people proudly insisted that any good scholar always reads every paper they reference; others argued that was ridiculous.

As always, the answer is never that simple. In certain fields, it is enormously important to read the papers you cite if you want to do solid, careful, scholarly work. In others, it’s entirely irrelevant.

It mostly comes down to what citations are for. And luckily, I’ve already written a post about that.

So let’s go through the citation motivations I mention in that post.

First, some citations are about respecting priority, feeding the system by which academics get credit for having an idea first. The incentive system of academia depends on getting this more or less right, but that doesn’t mean every academic has to check things at every step of the way. Besides, if you get this wrong, you’ll find out quickly. Submit a paper to a preprint server like arXiv, and you’ll be sure to get emails telling you that some obscure Soviet researcher figured it all out first.

Other citations are about substantiating claims. These are the most important to get right. Here, you really ought to have read, if not the whole paper, at least the full justification for the claim you’re making. You can have some leeway if the methods are unfamiliar enough, for example a complicated experiment you can’t understand all the details of. Science and technology do require some trust. But you should have at least a sense of where things could go wrong, and why.

Citations to provide context are a different beast. Here, you’re trying to tell a reader where your ideas come from. You can’t show them the conversations you have with your colleagues, the things they value and get you excited about. So you have to show them papers instead. But the papers aren’t the thing you read, they’re just a convenient proxy.

Finally, citations do sometimes just exist to follow social conventions. And yeah, you don’t have to read these, just like you don’t have to say how you’re doing when someone asks you how you’re doing. They’re the academic equivalent of social white lies, and should be taken roughly as seriously, both by their supporters and detractors.

Doing Things Well Is an International Activity

In the US, funding agencies seem to be increasingly opposed to an often inevitable feature of good science: international collaboration. Scientists have been told by officials at the National Institutes of Health that they need to remove mention of foreign collaborators from progress reports, or that they need to avoid such collaborations to begin with. At NASA, officials have told scientists that rather than just avoiding funding work in China, they should actively avoid collaborating with Chinese researchers. And a recently introduced bill would make that restriction more explicit.

I have a general policy against discussing concrete political issues on this blog, so I’m not going to dig into the details of who’s doing what here, how far it’s going or how novel it is. That policy extends to the comments. If you mention specific laws, politicians, or political parties, I will delete your comment.

I do want to say something more general, though. I think people often underestimate just how important international collaboration is.

I’ve talked before about how scientific specialization spreads scientists around the world. Scientists want to work with people who work on their specific interests, and there are often only a few people that fit that description. So people move across the world, creating centers of expertise.

More than that, though, essentially any activity, done well, is done internationally. The better you want to perform, the more likely it is that the best collaborator will be someone in another country.

People don’t notice this as much as they could, because they’re used to the exceptions. Popular art is often siloed by language and cultural references. Sports are intentionally set up as competitions between regions and nations, and militaries compete as a practical necessity. But without those exceptions, international competition wins out. The best doctor, the best classical musician, and the best businessperson for a job can’t be expected to come from one country or another. Those fields, like science, are international.

When that internationalism is weak, it’s a warning sign. Without that drive to succeed on an international stage, scientists get lazy. There are countries with a history of academic cronyism, where universities were run more on interpersonal politics than scholarly merit, cozy fiefdoms where prominent academics dole out positions. To combat this, policymakers work to make their research systems more international. They explicitly ask about international collaborations and participation in international conferences in grant applications, not to discourage them, but to encourage them: to reward academics who show merit on the international stage and break up lazy patronage networks.

It worries me that it sounds like some US policymakers want to do the opposite. People are increasingly worried about bias and groupthink in the sciences, and increasingly mad that scientists could be wasting the public’s money to maintain a cushy lifestyle. International collaboration is how you hold scientists to account, how you force them to compete and show their merit. If you drop that, academia is going to get a whole lot worse.

ArXiv Will Ban You for Hallucinated References

Thomas Dietterich, Chair of the Computer Science section of the preprint server arXiv.org, recently clarified the site’s policies towards “hallucinated” citations and other signs of careless use of AI in a post on X. If your paper contains a citation to a paper that doesn’t actually exist, or shows other signs that you didn’t read it before posting, like leftover commentary (the example he gave was “here is a 200 word summary; would you like me to make any changes?”), then you can get banned from the arXiv for one year. Even after that year you’d be on a kind of “probation”, and would need to show that your next few papers had been accepted by peer-reviewed journals before posting them.

At the risk of saying the obvious, this is a good idea! arXiv isn’t peer review, it isn’t meant to judge the value of the papers it hosts. But it still needs to be a useful place for scientists to post their papers, which is why they try to keep spam and irrelevant content to a minimum. If you don’t actually endorse the content of a paper, you shouldn’t post it in the first place.

That said, the whole existence of hallucinated citations on arXiv feels a little silly. It makes sense for academic journals and preprint servers in other fields. But arXiv was the first site of its kind for a reason. Its users, physicists, mathematicians, and computer scientists, don’t need much hand-holding when it comes to computers. Papers submitted to arXiv aren’t typically written in Word, they’re written in a document-writing language called LaTeX, which lets users make decently-formatted papers without help from a journal. Physicist-written code may be terrible by any reasonable criteria…but it exists, much more universally than, for example, biologist-written code.

This extends to citations. In my old field, there is a database called INSPIRE that updates automatically from arXiv. Click on a paper, and a handy “cite” link gives you standardized citations in several formats, ready to copy and paste into your LaTeX code. Nearly every citation in my papers is copied from there. The ones that aren’t are either from other fields where I didn’t know of that style of database, or things that haven’t been published (this can be manuscripts in preparation, or personal communications).

All of this, though, feels like a lot less than what the field could be doing. In a world where almost everyone posts their papers to the same website, and almost everyone has at least a rudimentary understanding of programming…why are people still writing citations in free-form text in the first place? Why aren’t citations built in to the submitted papers on arXiv, automatically linked to the papers they cite? Why don’t we have a setup where, except for a small number of “special” citations, every citation is built so that it automatically goes to a real paper, and gives a clear error message if it doesn’t? In short, why are hallucinated citations even possible?
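To make the idea a bit more concrete, here is a minimal Python sketch of citations that either resolve or fail loudly. Everything in it is hypothetical: `KNOWN_PAPERS` is a stand-in for a real index like arXiv or INSPIRE, its entries are placeholders, and `resolve_citations`, `UnresolvedCitationError`, and the `special:` prefix are names made up for this example rather than part of any existing tool.

```python
# Hypothetical sketch: every citation key must resolve to a known paper,
# unless it is explicitly marked as a "special" citation.


class UnresolvedCitationError(Exception):
    """Raised when a citation key matches no known paper."""


# Stand-in for a real index (e.g. arXiv or INSPIRE); the entries are placeholders.
KNOWN_PAPERS = {
    "example:0001": "Placeholder paper A",
    "example:0002": "Placeholder paper B",
}

# Marks citations with no paper to link to, such as manuscripts in
# preparation or personal communications.
SPECIAL_PREFIX = "special:"


def resolve_citations(keys, index=KNOWN_PAPERS):
    """Return {key: title} for every key, or raise one clear error."""
    resolved = {}
    missing = []
    for key in keys:
        if key.startswith(SPECIAL_PREFIX):
            # Special citations pass through, flagged as unverified.
            resolved[key] = "(unverified special citation)"
        elif key in index:
            resolved[key] = index[key]
        else:
            missing.append(key)
    if missing:
        raise UnresolvedCitationError(
            "No known paper for citation key(s): " + ", ".join(missing)
        )
    return resolved
```

With a check like this in the submission pipeline, a hallucinated reference would show up as an error naming the offending key, instead of quietly making it into the posted paper.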

Look, I’m naive, I get that. I believe in automation, not in the modern context of LLMs and other heuristics, but in setting clear procedures and building clear rules. The world doesn’t work that way! The clear rules are always more contentious than you expect, the fuzzy human-led version always the only choice people can agree on.

But still. Citations. There has to be a better system, right?

ArXiv to Leave Cornell

Yes, I’m late to the party on this one.

A few weeks ago, arXiv.org announced that it will be leaving Cornell, the university that currently manages it, and establishing its own nonprofit.

arXiv is a crucial part of the infrastructure for physics, mathematics, computer science, and a few related fields. Researchers post papers to arXiv as what are called “preprints” before the papers are submitted to a journal. In practice, nobody ends up reading the journal versions: the arXiv is free to access, and typically reflects better what the paper’s authors want the paper to look like. So in practice, arXiv is how researchers in these fields communicate, which makes its role enormously important.

If you’re from another field, you might wonder how something like arXiv is financially sustainable. The answer is that it works better than you’d think, but not perfectly. They’ve been supported by philanthropy, in addition to Cornell, and while there have apparently been budget shortfalls and drama behind the scenes, arXiv has nonetheless stayed in continuous operation since 1991.

The move to an independent nonprofit is supposed to make it easier for arXiv to get philanthropic funding, which otherwise needed to be filtered through Cornell in ways that were sometimes opaque or didn’t give donors the control they wanted.

While it wasn’t mentioned in the announcements, I suspect another motivation is security. Universities are fixed in place, and that makes them easier to pressure. For an organization that wants to process scientific output in an unbiased way, the link to Cornell represented a vulnerability. It’s not a vulnerability that has mattered yet, and likely didn’t seem like it would ever matter. But it wouldn’t surprise me if they’re more worried now that someone might try to pressure Cornell in order to change how arXiv operates. For critical scientific infrastructure, it’s important to be as independent of those kinds of pressure as possible.

The Twitter of Physics

The paper I talked about last week was frustratingly short. That’s not because the authors were trying to hide anything, or because they were lazy. It’s just that these days, that’s how the game is played.

Twitter started out with a fun gimmick: all posts had to be under 140 characters. The restriction inspired some great comedy, trying to pack as much humor as possible into a bite-sized format. Then, Twitter somehow became the place for journalists to discuss the news, tech people to discuss the industry, and politicians to discuss politics. Now, the length limit fuels conflict, an endless scroll of strong opinions without space for nuance.

Physics has something like this too.

In the 1950s, it was hard for scientists to get the word out quickly about important results. The journal Physical Review had a trick: instead of normal papers, they’d accept breaking news in the form of letters to the editor, which they could publish more quickly than the average paper. In 1958, editor Samuel Goudsmit founded a new journal, Physical Review Letters (or PRL for short), that would publish those letters all in one place, enforcing a length limit to make them faster to process.

The new journal was a hit, and soon played host to a series of breakthrough results, as scientists chose it as a way to get their work out fast. That popularity created a problem, though. As PRL’s reputation grew, physicists started trying to publish there not because their results needed to get out fast, but because just by publishing in PRL, their papers would be associated with all of the famous breakthroughs the journal had covered. Goudsmit wrote editorials trying to slow this trend, but to no avail.

Now, PRL is arguably the most prestigious journal in physics, hosting over a quarter of Nobel prize-winning work. Its original motivation is no longer particularly relevant: the journal is not all that much faster than other journals in its area, if at all, and is substantially slower than the preprint server arXiv, which is where physicists actually read papers in practice.

The length limit has changed over the years, but not dramatically. It now sits at 3,750 words, typically allowing a five-or-six page article in tight two-column text.

If you see a physics paper on arXiv.org that fits the format, it’s almost certainly aimed at PRL, or one of the journals with similar policies that it inspired. It means the authors think their work is cool enough to hang out with a quarter of all Nobel-winning results, or at least would like it to be.

And that, in turn, means that anyone who wants to claim that prestige has to be concise. They have to leave out details (often, saving them for a later publication in a less-renowned journal). The results have to lean, by the journal’s nature, more to physicist-clickbait and a cleaned-up story than to anything their colleagues can actually replicate.

Is it fun? Yeah, I had some PRLs in my day. It’s a rush, shining up your work as far as it can go, trimming down complexities into six pages of essentials.

But I’m not sure it’s good for the field.

Most Academics Don’t Choose Their Specialty

It’s there in every biography, and many interviews: the moment the scientist falls in love with an idea. It can be a kid watching ants in the backyard, a teen peering through a telescope, or an undergrad seeing a heart cell beat on a slide. It’s a story so common that it forms the heart of the public idea of a scientist: not just someone smart enough to understand the world, but someone passionate enough to dive into their one particular area above all else. It’s easy to think of it as a kind of passion most people never get to experience.

And it does happen, sometimes. But it’s a lot less common than you’d think.

I first started to suspect this as a PhD student. In the US, getting accepted into a PhD program doesn’t guarantee you an advisor to work with. You have to impress a professor to get them to spend limited time and research funding on you. In practice, the result was the academic analog of the dating scene. Students looked for who they might have a chance with, based partly on interest but mostly on availability and luck and rapport, and some bounced off many potential mentors before finding one that would stick.

Then, for those who continued to postdoctoral positions, the same story happened all over again. Now, they were applying for jobs, looking for positions where they were qualified enough and might have some useful contacts, with interest in the specific research topic at best a distant third.

Working in the EU, I’ve seen the same patterns, but offset a bit. Students do a Master’s thesis, and the search for a mentor there is messy and arbitrary in similar ways. Then for a PhD, they apply for specific projects elsewhere, and as each project is its own funded position the same job search dynamics apply.

The picture only really clicked for me, though, when I started doing journalism.

Nowadays, I don’t do science, I interview people about it. The people I interview are by and large survivors: people who got through the process of applying again and again and now are sitting tight in an in-principle permanent position. They’re people with a lot of freedom to choose what to do.

And so I often ask for that reason, that passion, that scientific love at first sight moment: why do you study what you do? It’s a story that audiences love, and thus that editors love, and it’s always a great way to begin a piece.

But surprisingly often, I get an unromantic answer. Why study this? Because it was available. Because in the Master’s, that professor taught the intro course. Because in college, their advisor had contacts with that lab to arrange a study project. Because that program accepted people from that country.

And I’ve noticed how even the romantic answers tend to be built on the unromantic ones. The professors who know how to weave a story, to self-promote and talk like a politician, they’ll be able to tell you about falling in love with something, sure. But if you read between the lines, you’ll notice where their anecdotes fall, how they trace a line through the same career steps that less adroit communicators admit were the real motivation.

There have been times I’ve thought that my problem was a lack of passion, that I wasn’t in love the same way other scientists were in love. I’ve even felt guilty, that I took resources and positions from people who were. There is still some truth in that guilt: I don’t think I had the same passion for my science as most of my colleagues.

But I appreciate more now that that passion is in part a story. We don’t choose our specialty, making some grand agentic move. Life chooses for us. And the romance comes in how you tell that story, after the fact.

How Much Academic Attrition Is Too Much?

Have you seen “population pyramids”? They’re diagrams that show snapshots of a population, how many people there are of each age. They can give you an intuition for how a population is changing, and where the biggest hurdles are to survival.

I wonder what population pyramids would look like for academia. In each field and subfield, how many people are PhD students, postdocs, and faculty?

If every PhD student was guaranteed to become faculty, and the number of faculty stayed fixed, you could roughly estimate what this pyramid would have to look like. An estimate for the US might take an average 7-year PhD, two postdoc positions at 3 years each, followed by a 30-year career as faculty, and estimate the proportions of each stage based on proportions of each scholar’s life. So you’d have roughly one PhD student per four faculty, and one postdoc per five. In Europe, with three-year PhDs, the proportion of PhD students decreases further, and in a world where people are still doing at least two postdocs you expect significantly more postdocs than PhDs.
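The arithmetic behind those ratios is just the length of each career stage divided by the length of a faculty career. A small Python sketch, using the rough stage lengths above (the function name `stage_ratios` is made up here, and keeping the European postdoc and faculty lengths the same as the US ones is an extra assumption):

```python
# Steady-state headcount ratios if nobody ever left academia and the
# number of faculty stayed fixed. Stage lengths are rough estimates in years.


def stage_ratios(phd_years, postdoc_years, faculty_years):
    """Return (PhD students per faculty member, postdocs per faculty member)."""
    return phd_years / faculty_years, postdoc_years / faculty_years


# US-style estimate: 7-year PhD, two 3-year postdocs, 30 years as faculty.
us_phd, us_postdoc = stage_ratios(7, 2 * 3, 30)
print(f"US: {us_phd:.2f} PhD students and {us_postdoc:.2f} postdocs per faculty member")

# European-style estimate: 3-year PhD, with postdoc and faculty lengths
# assumed to be the same as in the US estimate.
eu_phd, eu_postdoc = stage_ratios(3, 2 * 3, 30)
print(f"Europe: {eu_phd:.2f} PhD students and {eu_postdoc:.2f} postdocs per faculty member")
```

The US numbers come out to about 0.23 PhD students and 0.20 postdocs per faculty member, which is where “one per four” and “one per five” come from. In the European version there are twice as many postdocs as PhD students.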

Of course, the world doesn’t look like that at all, because the assumptions are wrong.

The number of faculty doesn’t stay fixed, for one. When population is growing in the wider world, new universities open in new population centers, and existing universities find ways to expand. When population falls, enrollments shrink, and universities cut back.

But this is a minor perturbation compared to the much more obvious difference: most PhD students do not stay in academia. A single professor may mentor many PhDs at the same time, and potentially several postdocs. Most of those people aren’t staying.

You can imagine someone trying to fix this by fiat, setting down a fixed ratio between PhD students, postdocs, and faculty. I’ve seen partial attempts at this. When I applied for grants at the University of Copenhagen, I was told I had to budget at least half of my hires as PhD students, not postdocs, which makes me wonder if they were trying to force careers to default to one postdoc position, rather than two. More likely, they hadn’t thought about it.

Zero attrition doesn’t really make sense, anyway. Some people are genuinely better off leaving: they made a mistake when they started, or they changed over time. Sometimes new professions arise, and the best way in is from an unexpected direction. I’ve talked to people who started data science work in the early days, before there really were degrees in it, who felt a physics PhD had been the best route possible to that world. Similarly, some move into policy, or academic administration, or found a startup. And if we think there are actually criteria to choose better or worse academics (which I’m a bit skeptical of), then presumably some people are simply not good enough, and trying to filter them out earlier is irresponsible when they still don’t have enough of a track record to really judge.

How much attrition there should be is the big question, and one I don’t have an answer for. In academia, when so many of these decisions are made by just a few organizations, it seems like a question that someone should have a well-considered answer to. But so far, it’s unclear to me that anyone does.

It also makes me think, a bit, about how these population pyramids work in industry. There, there is no overall control. Instead, there’s a web of incentives, many of them decades-delayed from the behavior they’re meant to influence, leaving each individual to try to predict as well as they can. If companies only hire senior engineers, no-one gets a chance to start a career, and the population of senior engineers dries up. Eventually, those companies have to settle for junior engineers. (Or, I guess, ex-academics.) It sounds like it should lead to the kind of behavior biologists model in predators and prey, wild swings in population modeled by a differential equation. But maybe there’s something that tamps down those wild swings.
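For reference, the standard textbook version of that predator-prey model is the Lotka-Volterra system. Here x is the prey population, y is the predator population, and α, β, γ, δ are positive constants. How, or whether, junior and senior engineers map onto those roles is not something this sketch settles.

```latex
\begin{aligned}
\frac{dx}{dt} &= \alpha x - \beta x y \\
\frac{dy}{dt} &= \delta x y - \gamma y
\end{aligned}
```

In this model the two populations cycle indefinitely rather than settling down, which is the kind of wild swing described above.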

A Paper With a Bluesky Account

People make social media accounts for their pets. Why not a scientific paper?

Anthropologist Ed Hagen made a Bluesky account for his recent preprint, “Menopause averted a midlife energetic crisis with help from older children and parents: A simulation study.” The paper’s topic itself is interesting (menopause is surprisingly rare among mammals, and he has a plausible account as to why), but not really the kind of thing I cover here.

Rather, it’s his motivation that’s interesting. Hagen didn’t make the account out of pure self-promotion or vanity. Instead, he’s promoting it as a novel approach to scientific publishing. Unlike Twitter, Bluesky is based on an open, decentralized protocol. Anyone can host an account compatible with Bluesky on their own computer, and anyone with the programming know-how can build a computer program that reads Bluesky posts. That means that nothing actually depends on Bluesky, in principle: the users have ultimate control.

Hagen’s idea, then, is that this could be a way to fulfill the role of scientific journals without channeling money and power to for-profit publishers. If each paper is hosted on a scientist’s own site, the papers can link to each other via following each other. Scientists on Bluesky can follow or like the paper, or comment on and discuss it, creating a way to measure interest from the scientific community and aggregate reviews, two things journals are supposed to cover.

I must admit, I’m skeptical. The interface really seems poorly-suited for this. Hagen’s paper’s account is called @menopause-preprint.edhagen.net. What happens when he publishes another paper on menopause? What will he call it? How is he planning to keep track of interactions from other scientists with an account for every single paper? Won’t swapping between fifteen Bluesky accounts every morning get tedious? Or will he just do this with papers he wants to promote?

I applaud the general idea. Decentralized hosting seems like a great way to get around some of the problems of academic publishing. But this will definitely take a lot more work, if it’s ever going to be viable on a useful scale.

Still, I’ll keep an eye on it, and see if others give it a try. Stranger things have happened.

Academia Tracks Priority, Not Provenance

A recent Correspondence piece in Nature Machine Intelligence points at an issue with using LLMs to write journal articles. LLMs are trained on enormous amounts of scholarly output, but the result is quite opaque: it is usually impossible to tell which sources influence a specific LLM-written text. That means that when a scholar uses an LLM, they may get a result that depends on another scholar’s work, without realizing it or documenting it. The ideas’ provenance gets lost, and the piece argues this is damaging, depriving scholars of credit and setting back progress.

It’s a good point. Provenance matters. If we want to prioritize funding for scholars whose ideas have the most impact, we need a way to track where ideas arise.

However, current publishing norms make essentially no effort to do this. Academic citations are not used to track provenance, and they are not typically thought of as tracking provenance. Academic citations track priority.

Priority is a central value in scholarship, with a long history. We give special respect to the first person to come up with an idea, make an observation, or do a calculation, and more specifically, the first person to formally publish it. We do this even if the person’s influence was limited, and even if the idea was rediscovered independently later on. In an academic context, being first matters.

In a paper, one is thus expected to cite the sources that have priority, that came up with an idea first. Someone who fails to do so will get citation request emails, and reviewers may request revisions to the paper to add in those missing citations.

One may also cite papers that were helpful, even if they didn’t come first. Tracking provenance in this way can be nice, a way to give direct credit to those who helped and point people to useful resources. But it isn’t mandatory in the same way. If you leave out a secondary source and your paper doesn’t use anything original to that source (like new notation), you’re much less likely to get citation request emails, or revision requests from reviewers. Provenance is just much lower priority.

In practice, academics track provenance in much less formal ways. Before citations, a paper will typically have an Acknowledgements section, where the authors thank those who made the paper possible. This includes formal thanks to funding agencies, but also informal thanks for “helpful discussions” that don’t meet the threshold of authorship.

If we cared about tracking provenance, those acknowledgements would be crucial information, an account of whose ideas directly influenced the ideas in the paper. But they’re not treated that way. No-one lists the number of times they’ve been thanked for helpful discussions on their CV, or in a grant application, no-one considers these discussions for hiring or promotion. You can’t look them up on an academic profile or easily graph them in a metascience paper. Unlike citations, unlike priority, there is essentially no attempt to measure these tracks of provenance in any organized way.

Instead, provenance is often the realm of historians or history-minded scholars, writing long after the fact. For academics, the fact that Yang and Mills published their theory first is enough, we call it Yang-Mills theory. For those studying the history, the story is murkier: it looks like Pauli came up with the idea first, and did most of the key calculations, but didn’t publish when it looked to him like the theory couldn’t describe the real world. What’s more, there is evidence suggesting that Yang knew about Pauli’s result, that he had read a letter from him on the topic, that the idea’s provenance goes back to Pauli. But Yang published, Pauli didn’t. And in the way academia has worked over the last 75 years, that claim of priority is what actually mattered.

Should we try to track provenance? Maybe. Maybe the emerging ubiquity of LLMs should be a wakeup call, a demand to improve our tracking of ideas, both in artificial and human neural networks. Maybe we need to demand interpretability from our research tools, to insist that we can track every conclusion back to its evidence for every method we employ, to set a civilizational technological priority on the accurate valuation of information.

What we shouldn’t do, though, is pretend that we just need to go back to what we were doing before.

Energy Is That Which Is Conserved

In school, kids learn about different types of energy. They learn about solar energy and wind energy, nuclear energy and chemical energy, electrical energy and mechanical energy, and potential energy and kinetic energy. They learn that energy is conserved, that it can never be created or destroyed, but only change form. They learn that energy makes things happen, that you can use energy to do work, that energy is different from matter.

Some, between good teaching and good students, manage to impose order on the jumble of concepts and terms. Others end up envisioning the whole story a bit like Pokemon, with different types of some shared “stuff”.

Energy isn’t “stuff”, though. So what is it? What relates all these different types of things?

Energy is something which is conserved.

The mathematician Emmy Noether showed that, when the laws of physics are symmetrical, they come with a conserved quantity. For example, because the laws of physics are the same from place to place, momentum is conserved. Similarly, because the laws of physics are the same from one time to another, Noether’s theorem states that there must be some quantity related to time, some number we can calculate, that is conserved, even as other things change. We call that number energy.
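For readers who want to see the time case worked out, here is a short sketch in Lagrangian mechanics. The setup is an assumption that goes beyond the text above: a single coordinate q, described by a Lagrangian L(q, \dot q, t) that obeys the Euler-Lagrange equation.

```latex
\begin{aligned}
E &\equiv \dot q\,\frac{\partial L}{\partial \dot q} - L, \\
\frac{dE}{dt} &= \dot q\left(\frac{d}{dt}\frac{\partial L}{\partial \dot q}
  - \frac{\partial L}{\partial q}\right) - \frac{\partial L}{\partial t}
  = -\frac{\partial L}{\partial t}.
\end{aligned}
```

The term in parentheses vanishes by the Euler-Lagrange equation. So if L has no explicit dependence on time, which is what it means for the laws to be the same from one time to another, then dE/dt = 0 and E is conserved.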

If energy is that simple, why are there all those types?

Energy is a number we can calculate. It’s a number we can calculate for different things. If you have a detailed description of how something in physics works, you can use that description to calculate that thing’s energy. In school, you memorize formulas like $\frac{1}{2}mv^2$ and $mgh$. These are all formulas that, with a bit more knowledge, you could derive. They are the things that, for something that meets the conditions, are conserved. They are things that, according to Noether’s theorem, stay the same.
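As a quick check of what “stays the same” means, here is a small Python sketch that adds up the two school formulas for a ball in free fall. It assumes uniform gravity and no air resistance, and the mass, starting height, and starting velocity are arbitrary illustrative numbers.

```python
# Check that kinetic plus potential energy stays constant for a ball in free fall.
# Assumes uniform gravity and no air resistance; the numbers are illustrative.

G = 9.81  # gravitational acceleration in m/s^2


def total_energy(mass, height, velocity):
    """Kinetic energy (1/2 m v^2) plus gravitational potential energy (m g h)."""
    return 0.5 * mass * velocity**2 + mass * G * height


def free_fall_state(h0, v0, t):
    """Exact (height, velocity) at time t, starting at height h0 with upward velocity v0."""
    height = h0 + v0 * t - 0.5 * G * t**2
    velocity = v0 - G * t
    return height, velocity


mass, h0, v0 = 2.0, 10.0, 3.0
energies = [total_energy(mass, *free_fall_state(h0, v0, t)) for t in (0.0, 0.5, 1.0)]
print(energies)  # the same value at each time, up to floating-point rounding
```

The height and the velocity both change from moment to moment, but the sum of the two formulas does not.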

Because of this, you shouldn’t think of energy as a substance, or a fuel. Energy is something we can do: we physicists, and we students of physics. We can take a physical system, and see what about it ought to be conserved. Energy is an action, a calculation, a conceptual tool that can be used to make predictions.

Most things are, in the end.