The Timeline for Replacing Theorists Is Not Technological

Quanta Magazine recently published a reflection by Natalie Wolchover on the state of fundamental particle physics. The discussion covers a lot of ground, but one particular paragraph has gotten the lion’s share of the attention. Wolchover talked to Jared Kaplan, the ex-theoretical physicist turned co-founder of Anthropic, one of the foremost AI companies today.

Kaplan was one of Nima Arkani-Hamed’s PhD students, which adds an extra little punch.

There’s a lot to contest here. Is AI technology anywhere close to generating papers as good as the top physicists, or is that relegated to the sci-fi future? Does Kaplan really believe this, or is he just hyping up his company?

I don’t have any special insight into those questions, about the technology and Kaplan’s motivations. But I think that, even if we trusted him on the claim that AI could be generating Witten- or Nima-level papers in three years, that doesn’t mean it will replace theoretical physicists. That part of the argument isn’t a claim about the technology, but about society.

So let’s take the technological claims as given, and make them a bit more specific. Since we don’t have any objective way of judging the quality of scientific papers, let’s stick to the subjective. Today, there are a lot of people who get excited when Witten posts a new paper. They enjoy reading them, they find the insights inspiring, they love the clarity of the writing and their tendency to clear up murky ideas. They also find them reliable: the papers very rarely have mistakes, and don’t leave important questions unanswered.

Let’s use that as our baseline, then. Suppose that Anthropic had an AI workflow that could reliably write papers that were just as appealing to physicists as Witten’s papers are, for the same reasons. What happens to physicists?

Witten himself is retired, which for an academic means you do pretty much the same thing you were doing before, but now paid out of things like retirement savings and pension funds, not an institute budget. Nobody is going to fire Witten, there’s no salary to fire him from. And unless he finds these developments intensely depressing and demoralizing (possible, but very much depends on how this is presented), he’s not going to stop writing papers. Witten isn’t getting replaced.

More generally, though, I don’t think this directly results in anyone getting fired, or in universities trimming positions. The people making funding decisions aren’t just sitting on a pot of money, trying to maximize research output. They’ve got money to be spent on hires, and different pools of money to be spent on equipment, and the hires get distributed based on what current researchers at the institutes think is promising. Universities want to hire people who can get grants, to help fund the university, and absent rules about AI personhood, the AIs won’t be applying for grants.

Funding cuts might be argued for based on AI, but that will happen long before AI is performing at the Witten level. We already see this happening in other industries or government agencies, where groups that already want to cut funding are getting think tanks and consultants to write estimates that justify cutting positions, without actually caring whether those estimates are performed carefully enough to justify their conclusions. That can happen now, and doesn’t depend on technological progress.

AI could also replace theoretical physicists in another sense: the physicists themselves might use AI to do most of their work. That’s more plausible, but here adoption still heavily depends on social factors. Will people feel like they are being assessed on whether they can produce these Witten-level papers, and that only those who make them get hired, or funded? Maybe. But it will propagate unevenly, from subfield to subfield. Some areas will make their own rules forbidding AI content, and there will be battles and scandals and embarrassments aplenty. It won’t be a single switch, with the technology alone setting the timeline.

Finally, AI could replace theoretical physicists in another way, by people outside of academia filling the field so much that theoretical physicists have nothing more that they want to do. Some non-physicists are very passionate about physics, and some of those people have a lot of money. I’ve done writing work for one such person, whose foundation is now attempting to build an AI Physicist. If these AI Physicists get to Witten-level quality, they might start writing compelling paper after compelling paper. Those papers, though, will be specialized because of their origins. Much as philanthropists mostly fund the subfields they’ve heard of, philanthropist-funded AI will mostly target topics the people running the AI have heard are important. Much like physicists themselves adopting the technology, there will be uneven progress from subfield to subfield, inch by socially-determined inch.

In a hard-to-quantify area like progress in science, that’s all you can hope for. I suspect Kaplan got a bit of a distorted picture of how progress and merit work in theoretical physics. He studied with Nima Arkani-Hamed, who is undeniably exceptionally brilliant but also undeniably exceptionally charismatic. It must feel to a student of Nima’s that academia simply hires the best people, that it does whatever it takes to accomplish the obviously best research. But the best research is not obvious.

I think some of these people imagine a more direct replacement process, not arranged by topic and tastes, but by goals. They picture AI sweeping in and doing what theoretical physics was always “meant to do”: solve quantum gravity, and proceed to shower us with teleporters and antigravity machines. I don’t think there’s any reason to expect that to happen. If you just asked a machine to come up with the most useful model of the universe for a near-term goal, then in all likelihood it wouldn’t consider theoretical high-energy physics at all. If you see your AI as a tool to navigate between utopia and dystopia, theoretical physics might matter at some point: when your AI has devoured the inner solar system, is about to spread beyond the few light-minutes within which it can signal itself in real time, and has to commit to a strategy. But as long as the inner solar system remains un-devoured, I don’t think you’ll see an obviously successful theory of fundamental physics.

How Much Academic Attrition Is Too Much?

Have you seen “population pyramids”? They’re diagrams that show snapshots of a population, how many people there are of each age. They can give you an intuition for how a population is changing, and where the biggest hurdles are to survival.

I wonder what population pyramids would look like for academia. In each field and subfield, how many people are PhD students, postdocs, and faculty?

If every PhD student was guaranteed to become faculty, and the number of faculty stayed fixed, you could roughly estimate what this pyramid would have to look like. An estimate for the US might take an average 7-year PhD, two postdoc positions at 3 years each, followed by a 30-year career as faculty, and estimate the proportions of each stage based on proportions of each scholar’s life. So you’d have roughly one PhD student per four faculty, and one postdoc per five. In Europe, with three-year PhDs, the proportion of PhD students decreases further, and in a world where people are still doing at least two postdocs you expect significantly more postdocs than PhDs.
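The estimate above can be checked with a few lines of arithmetic. This is a minimal sketch under the same assumptions as the paragraph (zero attrition, a fixed number of faculty, and the career-stage lengths quoted there); the function name `steady_state_ratios` is just an illustrative label.

```python
from fractions import Fraction

def steady_state_ratios(phd_years, postdoc_years, faculty_years):
    """Return (PhD students per faculty member, postdocs per faculty member),
    assuming zero attrition and a fixed number of faculty, so that headcount
    at each stage is proportional to the years spent in that stage."""
    return (Fraction(phd_years, faculty_years),
            Fraction(postdoc_years, faculty_years))

# US-style numbers from the text: 7-year PhD, two 3-year postdocs, 30-year career.
us_phd, us_postdoc = steady_state_ratios(7, 2 * 3, 30)
# European-style numbers: 3-year PhD, same postdoc and faculty assumptions.
eu_phd, eu_postdoc = steady_state_ratios(3, 2 * 3, 30)

print(f"US: one PhD student per {float(1 / us_phd):.1f} faculty, "
      f"one postdoc per {float(1 / us_postdoc):.1f} faculty")
print(f"Europe: {float(eu_postdoc / eu_phd):.0f} postdocs per PhD student")
```

Changing the stage lengths shows how sensitive the shape of the pyramid is to those inputs, even before attrition enters the picture.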

Of course, the world doesn’t look like that at all, because the assumptions are wrong.

The number of faculty doesn’t stay fixed, for one. When population is growing in the wider world, new universities open in new population centers, and existing universities find ways to expand. When population falls, enrollments shrink, and universities cut back.

But this is a minor perturbation compared to the much more obvious difference: most PhD students do not stay in academia. A single professor may mentor many PhDs at the same time, and potentially several postdocs. Most of those people aren’t staying.

You can imagine someone trying to fix this by fiat, setting down a fixed ratio between PhD students, postdocs, and faculty. I’ve seen partial attempts at this. When I applied for grants at the University of Copenhagen, I was told I had to budget at least half of my hires as PhD students, not postdocs, which makes me wonder if they were trying to force careers to default to one postdoc position, rather than two. More likely, they hadn’t thought about it.

Zero attrition doesn’t really make sense, anyway. Some people are genuinely better off leaving: they made a mistake when they started, or they changed over time. Sometimes new professions arise, and the best way in is from an unexpected direction. I’ve talked to people who started data science work in the early days, before there really were degrees in it, who felt a physics PhD had been the best route possible to that world. Similarly, some move into policy, or academic administration, or found a startup. And if we think there are actually criteria to choose better or worse academics (which I’m a bit skeptical of), then presumably some people are simply not good enough, and trying to filter them out earlier is irresponsible when they still don’t have enough of a track record to really judge.

How much attrition there should be is the big question, and one I don’t have an answer for. In academia, where so many of these decisions are made by just a few organizations, it seems like a question that someone should have a well-considered answer to. But so far, it’s unclear to me that anyone does.

It also makes me think, a bit, about how these population pyramids work in industry. There, there is no overall control. Instead, there’s a web of incentives, many of them decades-delayed from the behavior they’re meant to influence, leaving each individual to try to predict as well as they can. If companies only hire senior engineers, no-one gets a chance to start a career, and the population of senior engineers dries up. Eventually, those companies have to settle for junior engineers. (Or, I guess, ex-academics.) It sounds like it should lead to the kind of behavior biologists model in predators and prey, wild swings in population modeled by a differential equation. But maybe there’s something that tamps down those wild swings.
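For readers who haven’t seen the predator-prey picture, here is a minimal sketch of the classic Lotka-Volterra equations, integrated with a hand-written fourth-order Runge-Kutta step. The coefficients and starting populations are arbitrary illustrative choices, and this is the textbook biological model, not a model of engineering hiring.

```python
def lotka_volterra_step(prey, pred, dt, a=1.0, b=0.5, c=0.5, d=1.0):
    """Advance the Lotka-Volterra equations by one RK4 step.

    d(prey)/dt = a*prey - b*prey*pred
    d(pred)/dt = c*prey*pred - d*pred
    The coefficients a, b, c, d are illustrative, not fitted to any data.
    """
    def deriv(x, y):
        return a * x - b * x * y, c * x * y - d * y

    k1x, k1y = deriv(prey, pred)
    k2x, k2y = deriv(prey + 0.5 * dt * k1x, pred + 0.5 * dt * k1y)
    k3x, k3y = deriv(prey + 0.5 * dt * k2x, pred + 0.5 * dt * k2y)
    k4x, k4y = deriv(prey + dt * k3x, pred + dt * k3y)
    return (prey + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6,
            pred + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6)

def simulate(prey0, pred0, dt=0.01, steps=3000):
    """Return the list of (prey, predator) values over the whole run."""
    history = [(prey0, pred0)]
    for _ in range(steps):
        history.append(lotka_volterra_step(*history[-1], dt))
    return history

# Start away from the equilibrium point (prey = d/c, pred = a/b).
history = simulate(prey0=4.0, pred0=1.0)
prey_values = [prey for prey, _ in history]
print(f"prey ranges from {min(prey_values):.2f} to {max(prey_values):.2f}")
```

In this idealized model the swings never die down on their own. Anything that tamps them down would have to come from extra terms the basic equations leave out.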

School Facts and Research Facts

As you grow up, teachers try to teach you how the world works. This is more difficult than it sounds, because teaching you something is a much harder goal than just telling you something. A teacher wants you to remember what you’re told. They want you to act on it, and to generalize it. And they want you to do this not just for today’s material, but to set a foundation for next year, and the next. They’re setting you up for progress through a whole school system, with its own expectations.

Because of that, not everything a teacher tells you is, itself, a fact about the world. Some things you hear from teachers are like the scaffolds on a building. They’re facts that only make sense in the context of school, support that lets you build to a point where you can learn other facts, and throw away the school facts that got you there.

Not every student uses all of that scaffolding, though. The scaffold has to be complete enough that some students can use it to go on, getting degrees in science or mathematics, and eventually becoming researchers where they use facts more deeply linked to the real world. But most students don’t become researchers. So the scaffold sits there, unused. And many people, as their lives move on, mistake the scaffold for the real world.

Here’s an example. How do you calculate something like this?

3+4\div (3-1)\times 5

From school, you might remember order of operations, or PEMDAS. First parentheses, then exponents, multiplication, division, addition, and finally subtraction. If you ran into that calculation in school, you could easily work it out.

But out of school, in the real world? Trick question, you never calculate something like that to begin with.

When I wrote this post, I had to look up how to write \div and \times. In the research world, people are far more likely to run into calculations like this:

3+5\frac{4}{3-1}

Here, it’s easier to keep track of what order you need to do things. In other situations, you might be writing a computer program (or an Excel spreadsheet formula, which is also a computer program). Then you follow that programming language’s rules for order of operations, which may or may not match PEMDAS.
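As a small illustration, here is how one programming language, Python, handles the two expressions above, along with two cases where its rules go beyond what PEMDAS tells you. The variable names are just labels for this sketch, and other languages and spreadsheet programs may parse the last two lines differently.

```python
# The school-style expression: 3 + 4 ÷ (3 - 1) × 5
school_style = 3 + 4 / (3 - 1) * 5

# The research-style expression: 3 + 5 * (4 / (3 - 1))
research_style = 3 + 5 * (4 / (3 - 1))

print(school_style, research_style)

# Rules specific to Python that PEMDAS says nothing about:
negated_square = -2 ** 2      # parsed as -(2 ** 2), not (-2) ** 2
stacked_powers = 2 ** 3 ** 2  # parsed as 2 ** (3 ** 2), not (2 ** 3) ** 2

print(negated_square, stacked_powers)
```

Both versions of the calculation agree here, but the last two lines depend entirely on conventions the language designers chose.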

PEMDAS was taught to you in school for good reason. It got you used to following rules to understand notation, and gave you tools the teachers needed to teach you other things. But it isn’t a fact about the universe. It’s a fact about school.

Once you start looking around for these “school facts”, they show up everywhere.

Are there really “three states of matter”, solid, liquid, and gas? Or four, if you add plasma? Well, sort of. There are real scientific definitions for solids, liquids, gases, and plasmas, and they play a real role in how people model big groups of atoms, “matter” in a quite specific sense. But they can’t be used to describe literally everything. If you start asking what state of matter light or spacetime is, you’ve substituted a simplification that was useful for school (“everything is one of three states of matter”) for the actual facts in the real world.

If you remember a bit further, maybe you remember there are two types of things, matter and energy? You might have even heard that matter and antimatter annihilate into energy. These are also just school facts, though. “Energy” isn’t something things are made of, it’s a property things have. Instead, your teachers were building scaffolding for understanding the difference between massive and massless particles, or between dark matter and dark energy. Each of those uses different concepts of matter and energy, and each in turn is different than the concept of matter in its states of solid, liquid, and gas. But in school, you need a consistent scaffold to learn, not a mess of different definitions for different applications. So unless you keep going past school, you don’t learn that.

Physics in school likes to work with forces, and forces do sometimes make an appearance in the real world, for example for engineers. But if you’re asking a question about fundamental physics, like “is gravity really a force?”, then you’re treating a school fact as if it was a research fact. Fundamental physics doesn’t care about forces in the same way. It uses different mathematical tools, like Lagrangians and Hamiltonians, to calculate the motion of objects in systems, and uses “force” in a pop science way to describe fundamental interactions.

If you get good enough at this, you can spot which things you learned in school were likely just scaffolding “school facts”, and which are firm enough that they may hold further. Any simple division of the world into categories is likely a school fact, one that let you do exercises on your homework but gets much more complicated when the real world gets involved. Contradictory or messy concepts are usually another sign, showing something fuzzy used to get students comfortable rather than something precise enough for professionals to use. Keep an eye out, and even if you don’t yet know the real facts, you’ll know enough to know what you’re missing.

A Paper With a Bluesky Account

People make social media accounts for their pets. Why not a scientific paper?

Anthropologist Ed Hagen made a Bluesky account for his recent preprint, “Menopause averted a midlife energetic crisis with help from older children and parents: A simulation study.” The paper’s topic itself is interesting (menopause is surprisingly rare among mammals, he has a plausible account as to why), but not really the kind of thing I cover here.

Rather, it’s his motivation that’s interesting. Hagen didn’t make the account out of pure self-promotion or vanity. Instead, he’s promoting it as a novel approach to scientific publishing. Unlike Twitter, Bluesky is based on an open, decentralized protocol. Anyone can host an account compatible with Bluesky on their own computer, and anyone with the programming know-how can build a computer program that reads Bluesky posts. That means that nothing actually depends on Bluesky, in principle: the users have ultimate control.
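To give a sense of how little “programming know-how” reading public posts takes, here is a minimal sketch using only Python’s standard library. It assumes Bluesky’s public `app.bsky.feed.getAuthorFeed` endpoint on the `public.api.bsky.app` host and the response layout shown in the comments. The helper names are hypothetical, and the real API may change.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Bluesky's public AppView host and the getAuthorFeed endpoint,
# which serves public posts without requiring a login.
PUBLIC_API = "https://public.api.bsky.app/xrpc/app.bsky.feed.getAuthorFeed"

def build_feed_url(handle, limit=10):
    """Build the request URL for an account's most recent public posts."""
    query = urllib.parse.urlencode({"actor": handle, "limit": limit})
    return f"{PUBLIC_API}?{query}"

def extract_post_texts(feed_response):
    """Pull the text of each post out of a decoded JSON response.

    Assumed layout: {"feed": [{"post": {"record": {"text": ...}}}, ...]}
    """
    return [item["post"]["record"].get("text", "")
            for item in feed_response.get("feed", [])]

def fetch_post_texts(handle, limit=10):
    """Fetch and return post texts for a handle (needs network access)."""
    with urllib.request.urlopen(build_feed_url(handle, limit)) as response:
        return extract_post_texts(json.load(response))
```

Calling `fetch_post_texts` with an account’s handle (without the leading @) would then return that account’s recent post texts, with no login involved.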

Hagen’s idea, then, is that this could be a way to fulfill the role of scientific journals without channeling money and power to for-profit publishers. If each paper is hosted on a scientist’s own site, the papers can link to each other via following each other. Scientists on Bluesky can follow or like the paper, or comment on and discuss it, creating a way to measure interest from the scientific community and aggregate reviews, two things journals are supposed to cover.

I must admit, I’m skeptical. The interface really seems poorly suited for this. Hagen’s paper’s account is called @menopause-preprint.edhagen.net. When he publishes another paper on menopause, what will he call it? How is he planning to keep track of interactions from other scientists with an account for every single paper? Won’t swapping between fifteen Bluesky accounts every morning get tedious? Or will he just do this with papers he wants to promote?

I applaud the general idea. Decentralized hosting seems like a great way to get around some of the problems of academic publishing. But this will definitely take a lot more work, if it’s ever going to be viable on a useful scale.

Still, I’ll keep an eye on it, and see if others give it a try. Stranger things have happened.

On Theories of Everything and Cures for Cancer

Some people are disappointed in physics. Shocking, I know!

Those people, when careful enough, clarify that they’re disappointed in fundamental physics: not the physics of materials or lasers or chemicals or earthquakes, or even the physics of planets and stars, but the physics that asks big fundamental questions, about the underlying laws of the universe and where they come from.

Some of these people are physicists themselves, or were once upon a time. These often have in mind other directions physicists should have gone. They think that, with attention and funding, their own ideas would have gotten us closer to our goals than the ideas that, in practice, got the attention and the funding.

Most of these people, though, aren’t physicists. They’re members of the general public.

It’s disappointment from the general public, I think, that feels the most unfair to physicists. The general public reads history books, and hears about a series of revolutions: Newton and Maxwell, relativity and quantum mechanics, and finally the Standard Model. They read science fiction books, and see physicists finding “theories of everything”, and making teleporters and antigravity engines. And they wonder what made the revolutions stop, and postponed the science fiction future.

Physicists point out, rightly, that this is an oversimplified picture of how the world works. Something happens between those revolutions, the kind of progress not simple enough to summarize for history class. People tinker away at puzzles, and make progress. And they’re still doing that, even for the big fundamental questions. Physicists know more about even faraway flashy topics like quantum gravity than they did ten years ago. And while physicists and ex-physicists can argue about whether that work is on the right path, it’s certainly farther along its own path than it was. We know things we didn’t know before, progress continues to be made. We aren’t at the “revolution” stage yet, or even all that close. But most progress isn’t revolutionary, and no-one can predict how often revolutions should take place. A revolution is never “due”, and thus can never be “overdue”.

Physicists, in turn, often don’t notice how normal this kind of reaction from the public is. They think people are being stirred up by grifters, or negatively polarized by excess hype, that fundamental physics is facing an unfair reaction only shared by political hot-button topics. But while there are grifters, and people turned off by the hype…this is also just how the public thinks about science.

Have you ever heard the phrase “a cure for cancer”?

Fiction is full of scientists working on a cure for cancer, or who discovered a cure for cancer, or were prevented from finding a cure for cancer. It’s practically a trope. It’s literally a trope.

It’s also a real thing people work on, in a sense. Many scientists work on better treatments for a variety of different cancers. They’re making real progress, even dramatic progress. As many whose loved ones have cancer know, it’s much more likely for someone with cancer to survive than it was, say, twenty years ago.

But those cures don’t meet the threshold for science fiction, or for the history books. They don’t move us, like the polio vaccine did, from a world where you know many people with a disease to a world where you know none. They don’t let doctors give you a magical pill, like in a story or a game, that instantly cures your cancer.

For the vast majority of medical researchers, that kind of goal isn’t realistic, and isn’t worth thinking about. The few that do pursue it work towards extreme long-term solutions, like periodically replacing everyone’s skin with a cloned copy.

So while you will run into plenty of media descriptions of scientists working on cures for cancer, you won’t see the kind of thing the public expects is an actual “cure for cancer”. And people are genuinely disappointed about this! “Where’s my cure for cancer?” is a complaint on the same level as “where’s my hovercar?” There are people who think that medical science has made no progress in fifty years, because after all those news articles, we still don’t have a cure for cancer.

I appreciate that there are real problems in what messages are being delivered to the public about physics, both from hypesters in the physics mainstream and grifters outside it. But put those problems aside, and a deeper issue remains. People understand the world as best they can, as a story. And the world is complicated and detailed, full of many people making incremental progress on many things. Compared to a story, the truth is always at a disadvantage.

Where Are All These Views Coming From?

It’s been a weird year.

It’s been a weird year for many reasons, of course. But it’s been a particularly weird year for this blog.

To start, let me show you a more normal year, 2024:

Aside from a small uptick in January due to a certain unexpected announcement, this was a pretty typical year. I got 70-80 thousand views from 30-40 thousand unique visitors, spread fairly evenly throughout the year.

Now, take a look at 2025:

Something started happening this Fall. I went from getting 6000 views and 3000 visitors in a typical month, to roughly quintupling those numbers.

And for the life of me, I can’t figure out why.

WordPress, the site that hosts this blog, gives me tools to track where my viewers are coming from, and what they’re seeing.

It gives me a list of “referrers”, the other websites where people click on links to mine. Normally, this shows me where people are coming from: whether I came up on a popular blog or Reddit post, and people are following a link here. This year, though, looks totally normal. No new site is referring these people to me. Either the site they’re coming from is hidden, or they’re typing in my blog’s address by hand.

Looking at countries tells me a bit more. In a typical year, I get a bit under half of my views from the US, and the rest from a smattering of other English-speaking or European countries. This year, here’s what those stats look like:

So that tells me something. The new views appear to be coming from China. And what are these new viewers reading?

This year, my top post is a post from 2021, Reality as an Algebra of Observables. It wasn’t particularly popular when it came out, and while I liked the idea behind it, I don’t think I wrote it all that well. It’s not something that suddenly became relevant to the news, or to pop culture. It just suddenly started getting more and more and more views, this Fall:

In second place, a post about the 2022 Nobel Prize follows the same pattern. The pattern continues for a bit, but eventually the posts’ views get more uniform. My post France for Non-EU Spouses of EU Citizens, for example, has no weird pattern of increasing views: it’s just popular.

So far, this is weird. It gets weirder.

On a lark, I decided to look at the day-by-day statistics, rather than month-by-month. And before the growth really starts to show, I noticed something very strange.

In August, I had a huge number of views on August 1, a third of the month in one day. I had a new post out that day, but that post isn’t the one that gets the most views. Instead…it’s Reality as an Algebra of Observables.

That huge peak is a bit different from the later growth, though. It only shows in views, not in number of visitors. And it’s from the US, not China.

September, in comparison, looks normal. October looks like August, with a huge peak on October 3. This time, most of the views are still from the US, but a decent number are from China, and the visitors number is also higher.

In November, a few days into the month, a new pattern kicks in:

Now, visitors and views are almost equal, as if each visitor shows up, looks at precisely one post, and leaves. The views are overwhelmingly from China, with 27 thousand out of 32 thousand views. And the most popular post, more popular even than my conveniently named 4gravitons.com homepage that usually tops the ratings…is Reality as an Algebra of Observables.

I don’t know what’s going on here, and I welcome speculation. Is this some extremely strange bot, accessing one unremarkable post of mine from a huge number of Chinese IP addresses? Or are there actual people reading this post? Was it shared on a Chinese social media app that WordPress can’t track? Maybe it’s part of a course?

For a while, I’d thought that if I somehow managed to get a lot more views, I could consider monetizing in some way, like opening a Patreon. History blogger Bret Devereaux gets around 140 thousand views on his top posts, and makes about three-quarters of his income from Patreon. If I could get a post a tenth as popular as his, maybe I could start making a little money from this blog?

The thing is, I can only do that if I have some idea of who’s viewing the blog, and what they want. And I don’t know why they want Reality as an Algebra of Observables.

For Newtonmas, One Seventeenth of a New Collider

Individual physicists don’t ask for a lot for Newtonmas. Big collaborations ask for more.

This year, CERN got its Newtonmas gift early: a one billion dollar pledge from a group of philanthropists and foundations, to be spent on their proposed new particle collider.

That may sound like a lot of money (and of course it is), but it’s only a fraction of the 15 billion euros that the collider is estimated to cost. That makes this less a case of private donors saving the project, and more of a nudge, showing governments they can get results for a bit cheaper than they expected.

I do wonder if the donation has also made CERN bolder about its plans. It was announced shortly after a report from the update process for the European Strategy for Particle Physics, in which the European Strategy Group recommended a backup plan for the collider that is just the same collider with 15% budget cuts. Naturally people started making fun of this immediately.

Credit to @theory_dad on X

There were more serious objections from groups that had proposed more specific backup plans earlier in the process, who are frustrated that their ideas were rejected in favor of a 15% tweak that was not even discussed and seems not to really have been evaluated.

I don’t have any special information about what’s going on behind the scenes, or where this is headed. But I’m amused, and having fun with the parallels this season. I remember writing lists as a kid, trying to take advantage of the once-a-year opportunity to get what seemed almost like a genie’s wish. Whatever my incantations, the unreasonable requests were never fulfilled. Still, I had enough new toys to fill my time, and whet my appetite for the next year.

We’ll see what CERN’s Newtonmas gift brings.

Academia Tracks Priority, Not Provenance

A recent Correspondence piece in Nature Machine Intelligence points at an issue with using LLMs to write journal articles. LLMs are trained on enormous amounts of scholarly output, but the result is quite opaque: it is usually impossible to tell which sources influence a specific LLM-written text. That means that when a scholar uses an LLM, they may get a result that depends on another scholar’s work, without realizing it or documenting it. The ideas’ provenance gets lost, and the piece argues this is damaging, depriving scholars of credit and setting back progress.

It’s a good point. Provenance matters. If we want to prioritize funding for scholars whose ideas have the most impact, we need a way to track where ideas arise.

However, current publishing norms make essentially no effort to do this. Academic citations are not used to track provenance, and they are not typically thought of as tracking provenance. Academic citations track priority.

Priority is a central value in scholarship, with a long history. We give special respect to the first person to come up with an idea, make an observation, or do a calculation, and more specifically, the first person to formally publish it. We do this even if the person’s influence was limited, and even if the idea was rediscovered independently later on. In an academic context, being first matters.

In a paper, one is thus expected to cite the sources that have priority, that came up with an idea first. Someone who fails to do so will get citation request emails, and reviewers may request revisions to the paper to add in those missing citations.

One may also cite papers that were helpful, even if they didn’t come first. Tracking provenance in this way can be nice, a way to give direct credit to those who helped and point people to useful resources. But it isn’t mandatory in the same way. If you leave out a secondary source and your paper doesn’t use anything original to that source (like new notation), you’re much less likely to get citation request emails, or revision requests from reviewers. Provenance is just much lower priority.

In practice, academics track provenance in much less formal ways. Before citations, a paper will typically have an Acknowledgements section, where the authors thank those who made the paper possible. This includes formal thanks to funding agencies, but also informal thanks for “helpful discussions” that don’t meet the threshold of authorship.

If we cared about tracking provenance, those acknowledgements would be crucial information, an account of whose ideas directly influenced the ideas in the paper. But they’re not treated that way. No-one lists the number of times they’ve been thanked for helpful discussions on their CV, or in a grant application, no-one considers these discussions for hiring or promotion. You can’t look them up on an academic profile or easily graph them in a metascience paper. Unlike citations, unlike priority, there is essentially no attempt to measure these tracks of provenance in any organized way.

Instead, provenance is often the realm of historians or history-minded scholars, writing long after the fact. For academics, the fact that Yang and Mills published their theory first is enough, we call it Yang-Mills theory. For those studying the history, the story is murkier: it looks like Pauli came up with the idea first, and did most of the key calculations, but didn’t publish when it looked to him like the theory couldn’t describe the real world. What’s more, there is evidence suggesting that Yang knew about Pauli’s result, that he had read a letter from him on the topic, that the idea’s provenance goes back to Pauli. But Yang published, Pauli didn’t. And in the way academia has worked over the last 75 years, that claim of priority is what actually mattered.

Should we try to track provenance? Maybe. Maybe the growing ubiquity of LLMs should be a wakeup call, a demand to improve our tracking of ideas, both in artificial and human neural networks. Maybe we need to demand interpretability from our research tools, to insist that we can track every conclusion back to its evidence for every method we employ, to set a civilizational technological priority on the accurate valuation of information.

What we shouldn’t do, though, is pretend that we just need to go back to what we were doing before.

Energy Is That Which Is Conserved

In school, kids learn about different types of energy. They learn about solar energy and wind energy, nuclear energy and chemical energy, electrical energy and mechanical energy, and potential energy and kinetic energy. They learn that energy is conserved, that it can never be created or destroyed, but only change form. They learn that energy makes things happen, that you can use energy to do work, that energy is different from matter.

Some, between good teaching and good students, manage to impose order on the jumble of concepts and terms. Others end up envisioning the whole story a bit like Pokémon, with different types of some shared “stuff”.

Energy isn’t “stuff”, though. So what is it? What relates all these different types of things?

Energy is something which is conserved.

The mathematician Emmy Noether showed that, when the laws of physics are symmetrical, they come with a conserved quantity. For example, because the laws of physics are the same from place to place, momentum is conserved. Similarly, because the laws of physics are the same from one time to another, Noether’s theorem states that there must be some quantity related to time, some number we can calculate, that is conserved, even as other things change. We call that number energy.

If energy is that simple, why are there all those types?

Energy is a number we can calculate. It’s a number we can calculate for different things. If you have a detailed description of how something in physics works, you can use that description to calculate that thing’s energy. In school, you memorize formulas like \frac{1}{2}m v^2 and m g h. These are all formulas that, with a bit more knowledge, you could derive yourself. They are the things that, for something that meets the conditions, are conserved. They are things that, according to Noether’s theorem, stay the same.
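To make "a number we can calculate" concrete, here is a minimal sketch for one simple case: a ball thrown straight up, with air resistance ignored. The mass, starting height, starting speed, and the value of g are all made-up illustrative numbers, and the function names are mine. The point is only that \frac{1}{2}m v^2 and m g h each change over time, while their sum does not.

```python
# Illustrative sketch: a ball thrown straight up, ignoring air resistance.
# All numbers below are assumed values chosen for the example.
G = 9.81  # gravitational acceleration near Earth's surface, in m/s^2


def height(t, h0, v0):
    """Height at time t for constant downward acceleration G."""
    return h0 + v0 * t - 0.5 * G * t**2


def velocity(t, v0):
    """Upward velocity at time t for constant downward acceleration G."""
    return v0 - G * t


def energy(mass, t, h0, v0):
    """Kinetic plus potential energy: (1/2) m v^2 + m g h."""
    v = velocity(t, v0)
    h = height(t, h0, v0)
    return 0.5 * mass * v**2 + mass * G * h


mass, h0, v0 = 0.5, 2.0, 10.0  # kg, m, m/s (hypothetical ball)
times = (0.0, 0.5, 1.0, 1.5)   # seconds
energies = [energy(mass, t, h0, v0) for t in times]

for t, e in zip(times, energies):
    print(f"t = {t:.1f} s   v = {velocity(t, v0):6.2f} m/s   "
          f"h = {height(t, h0, v0):5.2f} m   E = {e:.4f} J")
```

The speed and height on each line are different, but the energy column agrees up to floating-point rounding. If you changed the rules over time, say by letting G depend on t, that column would no longer stay fixed, which is the link between energy and time that Noether’s theorem describes.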

Because of this, you shouldn’t think of energy as a substance, or a fuel. Energy is something we can do: we physicists, and we students of physics. We can take a physical system, and see what about it ought to be conserved. Energy is an action, a calculation, a conceptual tool that can be used to make predictions.

Most things are, in the end.

Ideally, Exams Are for the Students

I should preface this by saying I don’t actually know that much about education. I taught a bit in my previous life as a professor, yes, but I probably spent more time being taught how to teach than actually teaching.

Recently, the Atlantic had a piece about testing accommodations for university students, like extra time on exams, or getting to do an exam in a special distraction-free environment. The piece quotes university employees who are having more and more trouble satisfying these accommodations, and includes the statistic that 20 percent of undergraduate students at Brown and Harvard are registered as disabled.

The piece has kicked off a firestorm on social media, mostly focused on that statistic (which conveniently appears just before the piece’s paywall). People are shocked, and cynical. They feel like more and more students are cheating the system, getting accommodations that they don’t actually deserve.

I feel like there is a missing mood in these discussions, that the social media furor is approaching this from the wrong perspective. People are forgetting what exams actually ought to be for.

Exams are for the students.

Exams are measurement tools. An exam for a class says whether a student has learned the material, or whether they haven’t, and need to retake the class or do more work to get there. An entrance exam, or a standardized exam like the SAT, predicts a student’s future success: whether they will be able to benefit from the material at a university, or whether they don’t yet have the background for that particular program of study.

These are all pieces of information that are most important to the students themselves, that help them structure their decisions. If you want to learn the material, should you take the course again? Which universities are you prepared for, and which not?

We have accommodations, and concepts like disability, because we believe that there are kinds of students for whom the exams don’t give this information accurately. We think that a student with more time, or who can take the exam in a distraction-free environment, would have a more accurate idea of whether they need to retake the material, or whether they’re ready for a course of study, than a student who has to take the exam under ordinary conditions. And we think we can identify the students for whom this matters, and the students for whom this doesn’t matter nearly as much.

These aren’t claims about our values, or about what students deserve. They’re empirical claims, about how test results correlate with outcomes the students want. The conversation, then, needs to be built on top of those empirical claims. Are we better at predicting the success of students who receive accommodations, or worse? Can we measure that at all, or are we just guessing? And are we communicating accurately to students what the results mean, so that an exam tells them something useful and statistically robust that helps them plan their lives?

Values come in later, of course. We don’t have infinite resources, as the Atlantic piece emphasizes. We can’t measure everyone with as much precision as we would like. At some level, generalization takes over and accuracy is lost. There is absolutely a debate to be had about which measurements we can afford to make, and which we can’t.

But in order to have that argument at all, we first need to agree on what we’re measuring. And I feel like most of the people talking about this piece haven’t gotten there yet.