I’ve found that when it comes to reading papers, there are two distinct things I look for.
Sometimes, I read a paper looking for an answer. Typically, this is a “how to” kind of answer: I’m trying to do something, and the paper I’m reading is supposed to explain how. More rarely, I’m directly using a result: the paper proved a theorem or computed a formula, and I just take it as written and use it to calculate something else. Either way, I’m seeking out the paper with a specific goal in mind, which typically means I’m reading it long after it came out.
Other times, I read a paper looking for a question. Specifically, I look for the questions the author couldn’t answer. Sometimes these are things they point out, limitations of their result or opportunities for further study. Sometimes, these are things they don’t notice, holes or patterns in their results that make me wonder “what if?” Either can be the seed of a new line of research, a problem I can solve with a new project. If I read a paper in this way, typically it just came out, and this is the first time I’ve read it. When that isn’t the case, it’s because I start out with another reason to read it: often I’m looking for an answer, only to realize the answer I need isn’t there. The missing answer then becomes my new question.
I’m curious about the balance of these two behaviors in different fields. My guess is that some fields read papers more for their answers, while others read them more for their questions. If you’re working in another field, let me know what you do in the comments!
A reader pointed me to Stephen Wolfram’s one-year update of his proposal for a unified theory of physics. I was pretty squeamish about it one year ago, and now I’m even less interested in wading into the topic. But I thought it would be worth saying something, and rather than say something specific, I realized I could say something general. I thought I’d talk a bit about how we judge good and bad research in theoretical physics.
In science, there are two things we want out of a new result: we want it to be true, and we want it to be surprising. The first condition should be obvious, but the second is also important. There’s no reason to do an experiment or calculation if it will just tell us something we already know. We do science in the hope of learning something new, and that means that the best results are the ones we didn’t expect.
(What about replications? We’ll get there.)
If you’re judging an experiment, you can measure both of these things with statistics. Statistics lets you estimate how likely an experiment’s conclusion is to be true: was there a large enough sample? Strong enough evidence? It also lets you judge how surprising the experiment is, by estimating how likely it would be to happen given what was known beforehand. Did existing theories and earlier experiments make the result seem likely, or unlikely? While you might not have considered replications surprising, from this perspective they can be: if a prior experiment seems unreliable, successfully replicating it can itself be a surprising result.
If instead you’re judging a theoretical result, these measures get more subtle. There aren’t always good statistical tools to test them. Nonetheless, you don’t have to rely on vague intuitions either. You can be fairly precise, both about how true a result is and how surprising it is.
We get our results in theoretical physics through mathematical methods. Sometimes, this is an actual mathematical proof: guaranteed to be true, no statistics needed. Sometimes, it resembles a proof, but falls short: vague definitions and unstated assumptions mar the argument, making it less likely to be true. Sometimes, the result uses an approximation. In those cases we do get to use some statistics, estimating how good the approximation may be. Finally, a result can’t be true if it contradicts something we already know. This could be a logical contradiction in the result itself, but if the result is meant to describe reality (note: not always the case), it might contradict the results of a prior experiment.
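As a toy illustration of that last kind of estimate (my own example, not from any particular paper): approximate a number you already know by truncating a series, and track how the error shrinks as you keep more terms.

```python
import math

# Approximate e = exp(1) by truncating its Taylor series after n terms,
# and compare with the exact value to see how good the approximation is.
for n in (2, 4, 8, 16):
    approx = sum(1 / math.factorial(k) for k in range(n))
    print(n, approx, abs(approx - math.e))
```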
What makes a theoretical result surprising? And how precise can we be about that surprise?
Theoretical results can be surprising in the light of earlier theory. Sometimes, this gets made precise by a no-go theorem, a proof that some kind of theoretical result is impossible to obtain. If a result finds a loophole in a no-go theorem, that can be quite surprising. Other times, a result is surprising because it’s something no one else was able to do. To be precise about that kind of surprise, you need to show that the result is something others wanted to do, but couldn’t. Maybe someone else made a conjecture, and only you were able to prove it. Maybe others did approximate calculations, and now you can do them more precisely. Maybe a question was controversial, with different people arguing for different sides, and you have a more conclusive argument. This is one of the better reasons to include a long list of references in a paper: not to pad your friends’ citation counts, but to show that your accomplishment is surprising, that others might have wanted to achieve it but had to settle for something lesser.
In general, this means that showing a theoretical result is good (not merely true, but surprising and new) links you up to the rest of the theoretical community. You can put in all the work you like on a theory of everything, and make it as rigorous as possible, but if all you did was reproduce a sub-case of someone else’s theory then you haven’t accomplished all that much. If you put your work in context, and compare and contrast it with what others have done before, then we can start getting precise about how much we should be surprised, and get an idea of what your result is really worth.
There are theoretical physicists who can do everything they do with a pencil and a piece of paper. I’m not one of them. The calculations I do are long, complicated, or tedious enough that they’re often best done with a computer. For a calculation like that, I can’t just use existing software “out of the box”: I need to program special-purpose tools to do the kind of calculation I need. This means each project has its own kind of learning curve. If I already have the right code, or almost the right code, things go very smoothly: with a few tweaks I can do a lot of interesting calculations. If I don’t have the right code yet, things go much more slowly: I have to build up my technology, figuring out what I need piece by piece until I’m back up to my usual speed.
I don’t always need to use computers to do my calculations. Sometimes my work hinges on something more conceptual: understanding a mathematical proof, or the arguments from another physicist’s paper. While this seems different on the surface, I’ve found that it has the same kinds of learning curves. If I know the right papers and mathematical methods, I can go pretty quickly. If I don’t, I have to “build up my technology”, reading and practicing, a slow build-up to my goal.
The times when I have to “build my technology” are always a bit frustrating. I don’t work as fast as I’d like, and I get tripped up by dumb mistakes. I keep having to go back, almost to the beginning, realizing that some aspect of how I set things up needs to be changed to make the rest work. As I go, though, the work gets more and more satisfying. I find pieces (of the code, of my understanding) that become solid, that I can rely on. I build my technology, and I can do more and more, and feel better about myself in the bargain. Eventually, I get back up to my full abilities, my technology set up, and a wide variety of calculations become possible.
Yesterday, Fermilab’s Muon g-2 experiment announced a new measurement of the magnetic moment of the muon, a number that describes how muons interact with magnetic fields. This might seem like a minor technical detail, but physicists have been very excited about it, because it’s a detail the Standard Model seems to get wrong, making it a potential hint of new, undiscovered particles. Quanta Magazine has a great piece on the announcement, which explains more than I will here, but the upshot is that there are two different calculations on the market that attempt to predict the magnetic moment of the muon. One of them, using older methods, disagrees with the experiment. The other, with a new approach, agrees. The question then becomes: which calculation was wrong? And why?
What does it mean for a prediction to match an experimental result? The simple, wrong, answer is that the numbers must be equal: if you predict “3”, the experiment has to measure “3”. This is wrong because, in practice, every experiment and every prediction has some uncertainty. If you’ve taken a college physics class, you’ve run into this kind of uncertainty in one of its simplest forms, measurement uncertainty. Measure with a ruler, and you can only confidently measure down to the smallest divisions on the ruler. If you measure 3cm, but your ruler has ticks only down to a millimeter, then what you’re measuring might be as large as 3.1cm or as small as 2.9cm. You just don’t know.
This uncertainty doesn’t mean you throw up your hands and give up. Instead, you estimate the effect it can have. You report, not a measurement of 3cm, but of 3cm plus or minus 1mm. If the prediction was 2.9cm, then you’re fine: it falls within your measurement uncertainty.
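In code, the comparison is just an interval check. Here’s a toy sketch of the ruler example above (my numbers, in centimeters):

```python
measurement, uncertainty = 3.0, 0.1  # measured 3cm, with ruler ticks of 1mm
prediction = 2.9

# The prediction "matches" if it falls inside the uncertainty window.
consistent = abs(prediction - measurement) <= uncertainty
print(consistent)  # True
```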
There’s a common thread in all of these uncertainty estimates: you don’t expect to be too far off on average. Your measurements won’t be perfect, but they won’t all be screwed up in the same way either: chances are, they will randomly be a little below or a little above the truth. Your calculations are similar: whether you’re ignoring complicated particle physics diagrams or the spacing in a simulated grid, you can treat the difference as something small and random. That randomness means you can use statistics to talk about your errors: you have statistical uncertainty. When you have statistical uncertainty, you can estimate, not just how far off you might get, but how likely it is you ended up that far off. In particle physics, we have very strict standards for this kind of thing: to call something new a discovery, we demand that it is so unlikely that it would only show up randomly under the old theory roughly one in 3.5 million times, the famous “five sigma” standard. The muon magnetic moment isn’t quite up to our standards for a discovery yet, but the new measurement brought it closer.
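To make “how unlikely” concrete, here’s a small Python sketch (standard Gaussian statistics, not any experiment’s actual code) that converts a number of standard deviations into odds, and measures how far apart two uncertain results sit:

```python
import math

def one_sided_p(sigma):
    """Chance of a random fluctuation at least `sigma` standard
    deviations above expectation, assuming Gaussian statistics."""
    return 0.5 * math.erfc(sigma / math.sqrt(2))

def tension(a, sigma_a, b, sigma_b):
    """How many combined standard deviations apart two results are."""
    return abs(a - b) / math.hypot(sigma_a, sigma_b)

for s in (1, 3, 5):
    print(f"{s} sigma: about 1 in {1 / one_sided_p(s):,.0f}")
# prints roughly: 1 in 6, 1 in 741, 1 in 3.5 million

print(f"{tension(3.0, 0.1, 2.7, 0.08):.1f} sigma apart")  # made-up numbers: about 2.3
```

The tension function is how you’d quantify whether two results “agree”: a couple of sigma apart could easily be chance, five sigma almost certainly isn’t.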
The two dueling predictions for the muon’s magnetic moment both estimate some amount of statistical uncertainty. It’s possible that the two calculations just disagree due to chance, and that better measurements or a tighter simulation grid would make them agree. Given their estimates, though, that’s unlikely. That takes us out of the realm of statistical uncertainty, and into uncertainty about the theory itself. The two calculations use very different approaches. The new calculation tries to compute things from first principles, using the Standard Model directly. The risk is that such a calculation needs to make assumptions, ignoring some effects that are too difficult to calculate, and one of those assumptions may be wrong. The older calculation is based more on experimental results, using different experiments to estimate effects that are hard to calculate but that should be similar between different situations. The risk is that the situations may be less similar than expected, their assumptions breaking down in a way that the first-principles calculation would catch.
None of these risks are easy to estimate. They’re “unknown unknowns”, or rather, “uncertain uncertainties”. And until some of them are resolved, it won’t be clear whether Fermilab’s new measurement is a sign of undiscovered particles, or just a (challenging!) confirmation of the Standard Model.
When a scientist applies for a grant to fund their research, there’s a way it’s supposed to go. The scientist starts out with a clear idea, a detailed plan for an experiment or calculation they’d like to do, and an expectation of what they could learn from it. Then they get the grant, do their experiment or calculation, and make their discovery. The world smiles upon them.
There’s also a famous way it actually goes. Like the other way, the scientist has a clear idea and detailed plan. Then they do their experiment, or calculation, and see what they get, making their discovery. Finally, they write their grant application, proposing to do the experiment they already did. Getting the grant, they then spend the money on their next idea instead, which they will propose only in the next grant application, and so on.
This is pretty shady behavior. But there’s yet another way things can go, one that flips the previous method on its head. And after considering it, you might find the shady method more understandable.
What happens if a scientist is going to run out of funding, but doesn’t yet have a clear idea? Maybe they don’t know enough yet to have a detailed plan for their experiment or their calculation. Maybe they have an idea, but they’re still foggy about what they can learn from it.
Well, they’re still running out of funding. They still have to write that grant. So they start writing. Along the way, they’ll manage to find some of that clarity: they’ll have to write a detailed plan, they’ll have to describe some expected discovery. If all goes well, they tell a plausible story, and they get that funding.
When they actually go do that research, though, there’s no guarantee it sticks to the plan. In fact, it’s almost guaranteed not to: neither the scientist nor the grant committee typically knows what experiment or calculation needs to be done: that’s what makes the proposal novel science in the first place. The result is that once again, the grant proposal wasn’t exactly honest: it didn’t really describe what was actually going to be done.
You can think of these different stories as falling on a sliding scale. On the one end, the scientist may just have the first glimmer of an idea, and their funded research won’t look anything like their application. On the other, the scientist has already done the research, and the funded research again looks nothing like the application. In between there’s a sweet spot, the intended system: late enough that the scientist has a good idea of what they need to do, early enough that they haven’t done it yet.
How big that sweet spot is depends on the pace of the field. If you’re in a field with big, complicated experiments, like randomized controlled trials, you can mostly make this work. Your work takes a long time to plan, and requires sticking to that plan, so you can, at least sometimes, do grants “the right way”. The smaller your experiments are, though, the more the details can change, and the smaller the window gets. For a field like theoretical physics, if you know exactly what calculation to do, or what proof to write, with no worries or uncertainty…well, you’ve basically done the calculation already. The sweet spot for ethical grant-writing shrinks down to almost a single moment.
In practice, some grant committees understand this. There are grants where you are expected to present preliminary evidence from work you’ve already started, and to discuss the risks your vaguer ideas might face. Grants of this kind recognize that science is a process, and that catching people at that perfect moment is next-to-impossible. They try to assess what the scientist is doing as a whole, not just a single idea.
Scientists ought to be honest about what they’re doing. But grant agencies need to be honest too, about how science in a given field actually works. Hopefully, one enables the other, and we reach a more honest world.
You can think of elliptic integrals as integrals over a torus, a curve shaped like the outer crust of a donut.
Integrals like these are showing up more and more in our field, the subject of bigger and bigger conferences. By now, we think we have a pretty good idea of how to handle them, but there are still some outstanding mysteries to solve.
One such mystery came up in a paper in 2017, by Luise Adams and Stefan Weinzierl. They were working with one of the favorite examples of this community, the so-called sunrise diagram (sunrise being a good time to eat donuts). And they noticed something surprising: if they looked at the sunrise diagram in different ways, it was described by different donuts.
What do I mean, different donuts?
The integrals we know best in this field aren’t integrals on a torus, but rather integrals on a sphere. In some sense, all spheres are the same: you can make them bigger or smaller, but they don’t have different shapes, they’re all “sphere-shaped”. In contrast, integrals on a torus are trickier, because toruses can have different shapes. Think about different donuts: some might have a thin ring, others a thicker one, even if the overall donut is the same size. You can’t just scale up one donut and get the other.
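For readers who want the precise statement (standard mathematics, nothing specific to our paper): a torus can be described as the complex plane modulo a lattice, and its “shape” is captured by a single complex number τ, the modular parameter:

$$T_\tau = \mathbb{C}/(\mathbb{Z} + \tau\,\mathbb{Z}), \qquad \operatorname{Im}\,\tau > 0.$$

Two tori $T_\tau$ and $T_{\tau'}$ have the same shape only when their parameters are related by a modular transformation,

$$\tau' = \frac{a\tau + b}{c\tau + d}, \qquad \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{Z}),$$

so unlike spheres, tori form a genuinely one-parameter family of shapes.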
My colleague, Cristian Vergu, was annoyed by this. He’s the kind of person who trusts mathematics like an old friend, one who would never lead him astray. He thought that there must be one answer, one correct donut, one natural way to represent the sunrise diagram mathematically. I was skeptical; I don’t trust mathematics nearly as much as Cristian does. To sort it out, we brought in Hjalte Frellesvig and Matthias Volk, and started trying to write the sunrise diagram every way we possibly could. (Along the way, we threw in another “donut diagram”, the double-box, just to see what would happen.)
Rather than getting a zoo of different donuts, we got a surprise: we kept seeing the same two. And in the end, we stumbled upon the answer Cristian was hoping for: one of these two is, in a meaningful sense, the “correct donut”.
What was wrong with the other donut? It turns out that when the original two donuts were found, one of them involved a move that is a bit risky mathematically: combining square roots.
For readers who don’t know what I mean, or why this is risky, let me give a simple example. Everyone else can skip past it.
Suppose I am solving a problem, and I find a product of two square roots:

$$\sqrt{x}\,\sqrt{x}$$

I could try combining them under the same square root sign, like so:

$$\sqrt{x\cdot x}=\sqrt{x^2}$$

That works, if $x$ is positive. But now suppose $x$ is negative. Plug in negative one to the first expression, and you get

$$\sqrt{-1}\,\sqrt{-1}=i\times i=-1,$$

while in the second,

$$\sqrt{(-1)\cdot(-1)}=\sqrt{1}=1.$$
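If you want to watch this happen on a computer, Python’s cmath module makes the discrepancy concrete (a toy sketch, nothing to do with our actual code):

```python
import cmath

x = -1
roots_first = cmath.sqrt(x) * cmath.sqrt(x)  # i * i = (-1+0j)
combined    = cmath.sqrt(x * x)              # sqrt(1) = (1+0j)
print(roots_first, combined)  # the two supposedly equal expressions disagree
```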
In this case, it wasn’t as obvious that combining roots would change the donut. It might have been perfectly safe. It took some work to show that indeed, this was the root of the problem. If the roots are instead combined more carefully, then one of the donuts goes away, leaving only the one, true donut.
I’m interested in seeing where this goes, how many different donuts we have to understand and how they might be related. But I’ve also been writing about donuts for the last hour or so, so I’m getting hungry. See you next week!
One of the most mysterious powers physicists claim is physical intuition. Let the mathematicians have their rigorous proofs and careful calculations. We just need to ask ourselves, “Does this make sense physically?”
It’s tempting to chalk this up to bluster, or physicist arrogance. Sometimes, though, a physicist manages to figure out something that stumps the mathematicians. Edward Witten’s work on knot theory is a classic example, where he used ideas from physics, not rigorous proof, to win one of mathematics’ highest honors, the Fields Medal.
So what is physical intuition? And what is its relationship to proof?
Oscillators are familiar problems for first-year physics students. Objects that go back and forth, like springs and pendulums, tend to obey similar equations. Link two of them together (couple them), and the equations get more complicated, work for a second-year student instead of a first-year one. Such a student will notice that coupled oscillators “repel” each other: their frequencies get farther apart than they would be if they weren’t coupled.
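Here’s a minimal numerical sketch of that repulsion (made-up numbers, not from the talk). The squared frequencies of the coupled system are the eigenvalues of a symmetric matrix, and turning on the off-diagonal coupling pushes them apart:

```python
import numpy as np

w_squared = [1.0, 1.2]  # squared frequencies of the two uncoupled oscillators
coupling = 0.3          # off-diagonal coupling strength (hypothetical value)

K = np.array([[w_squared[0], -coupling],
              [-coupling, w_squared[1]]])

print("uncoupled:", np.sqrt(w_squared))              # [1.000  1.095]
print("coupled:  ", np.sqrt(np.linalg.eigvalsh(K)))  # [0.885  1.190], farther apart
```

Notice that the lower frequency gets pushed down and the higher one pushed up, which is exactly the behavior the next paragraph leans on.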
Our seminar speaker wanted us to revisit those second-year-student days, in order to understand how different particles behave in Effective Field Theory. Just as the frequencies of the oscillators repel each other, the energies of particles repel each other: the unknown high-energy particles could only push the energies of the lighter particles we can detect lower, not higher.
This is an example of physical intuition. Examine it, and you can learn a few things about how physical intuition works.
First, physical intuition comes from experience. Using physical intuition wasn’t just a matter of imagining the particles and trying to see what “makes sense”. Instead, it required thinking about similar problems from our experience as physicists: problems that don’t just seem similar on the surface, but are mathematically similar.
Second, physical intuition doesn’t replace calculation. Our speaker had done the math; he hadn’t just made a physical argument. Instead, physical intuition serves two roles: to inspire, and to help remember. Physical intuition can inspire new solutions, suggesting ideas that you go on to check with calculation. In addition to that, it can help your mind sort out what you already know. Without the physical story, we might not have remembered that the low-energy particles have their energies pushed down. With the story, though, we had a similar problem to compare, and it made the whole thing more memorable. Human minds aren’t good at holding a giant pile of facts. What they are good at is holding narratives. “Physical intuition” ties what we know into a narrative, building on past problems to understand new ones.
Finally, physical intuition can be risky. If the problem is too different then the intuition can lead you astray. The mathematics of coupled oscillators and Effective Field Theories was similar enough for this argument to work, but if it turned out to be different in an important way then the intuition would have backfired, making it harder to find the answer and harder to keep track once it was found.
Physical intuition may seem mysterious. But deep down, it’s just physicists using our experience, comparing similar problems to help keep track of what we need to know. I’m sure chemists, biologists, and mathematicians all have similar stories to tell.
Me, I chose physics as a career, so I’d better like it. And you, right now you’re reading a physics blog for fun, so you probably like physics too.
Ok, so we agree, physics is awesome. But it isn’t always awesome.
Read a blog like this, or the news, and you’ll hear about the more awesome parts of physics: the black holes and big bangs, quantum mysteries and elegant mathematics. As freshman physics majors learn every year, most of physics isn’t like that. It’s careful calculation and repetitive coding, incremental improvements to a piece of a piece of a piece of something that might eventually answer a Big Question. Even if intellectually you can see the line from what you’re doing to the big flashy stuff, emotionally the two won’t feel connected, and you might struggle to feel motivated.
Physics solves this through acculturation. Physicists don’t just work on their own, they’re part of a shared worldwide culture of physicists. They spend time with other physicists, and not just working time but social time: they eat lunch together, drink coffee together, travel to conferences together. Spending that time together gives physics more emotional weight: as humans, we care a bit about Big Questions, but we care a lot more about our community.
This isn’t unique to physics, of course, or even to academics. Programmers who have lunch together, philanthropists who pat each other on the back for their donations, these people are trying to harness the same forces. By building a culture around something, you can get people more motivated to do it.
There’s a risk here, of course, that the culture takes over, and we lose track of the real reasons to do science. It’s easy to care about something because your friends care about it because their friends care about it, looping around until it loses contact with reality. In science we try to keep ourselves grounded, to respect those who puncture our bubbles with a good argument or a clever experiment. But we don’t always succeed.
The pandemic has made acculturation more difficult. As a scientist working from home, that extra bit of social motivation is much harder to get. It’s perhaps even harder for new students, who haven’t had the chance to hang out and make friends with other researchers. People’s behavior, what they research and how and when, has changed, and I suspect changing social ties are a big part of it.
In the long run, I don’t think we can do without the culture of physics. We can’t be lone geniuses motivated only by our curiosity, that’s just not how people work. We have to meld the two, mix the social with the intellectual…and hope that when we do, we keep the engines of discovery moving.
Physics is universal…or at least, it aspires to be. Drop an apple anywhere on Earth, at any point in history, and it will accelerate at roughly the same rate. When we call something a law of physics, we expect it to hold everywhere in the universe. It shouldn’t depend on anything arbitrary.
Sometimes, though, something arbitrary manages to sneak in. Even if the laws of physics are universal, the questions we want to answer are not: they depend on our situation, on what we want to know.
The simplest example is when we have to use units. The mass of an electron is the same here as it is on Alpha Centauri, the same now as it was when the first galaxies formed. But what is that mass? We could write it as 9.1093837015×10⁻³¹ kilograms, if we wanted to, but kilograms aren’t exactly universal. Their modern definition is at least based on physical constants, but with some pretty arbitrary numbers. It defines the Planck constant as 6.62607015×10⁻³⁴ Joule-seconds. Chase that number back, and you’ll find references to the Earth’s circumference and the time it takes to turn round on its axis. The mass of the electron may be the same on Alpha Centauri, but they’d never write it as 9.1093837015×10⁻³¹ kilograms.
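To see the arbitrariness in action, here’s a quick sketch (standard SI constants, nothing exotic) converting that mass into the particle physicist’s preferred unit, electron-Volts:

```python
# Electron rest energy via E = m c^2, converted from joules to eV.
m_e = 9.1093837015e-31  # kg, the value quoted above
c = 299792458.0         # m/s, exact by definition of the metre
eV = 1.602176634e-19    # joules per eV, exact in the 2019 SI

print(f"{m_e * c**2 / eV / 1e6:.3f} MeV")  # about 0.511 MeV
```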
So what do we do, when something arbitrary sneaks in? We have a few options. I’ll illustrate each with the mass of the electron:
Make an arbitrary choice, and stick with it: There’s nothing wrong with measuring an electron in kilograms, if you’re consistent about it. You could even use ounces. You just have to make sure that everyone else you compare with is using the same units, or be careful to convert.
Make a “natural” choice: Why not set the speed of light and Planck’s constant to one? They come up a lot in particle physics, and all they do is convert between length and time, or time and energy. That way you can use the same units for all of them, and use something convenient, like electron-Volts. They even have electron in the name! Of course they also have “Volt” in the name, and Volts are as arbitrary as any other metric unit. A “natural” choice might make your life easier, but you should always remember it’s still arbitrary.
Make an efficient choice: This isn’t always the same as the “natural” choice. The units you choose have an effect on how difficult your calculation is. Sometimes, the best choice for the mass of an electron is “one electron-mass”, because it lets you calculate something else more easily. This is easier to illustrate with other choices: for example, if you have to pick a reference frame for a collision, picking one in which one of the objects is at rest, or where they move symmetrically, might make your job easier.
Stick to questions that aren’t arbitrary: No matter what units we use, the electron’s mass will be arbitrary. Its ratios to other masses won’t be though. No matter where we measure, dimensionless ratios like the mass of the muon divided by the mass of the electron, or the mass of the electron divided by the value of the Higgs field, will be the same. If we can make sure to ask only this kind of question, we can avoid arbitrariness. Note that we can think of even a mass in “kilograms” as this kind of question: what’s the ratio of the mass of the electron to “this arbitrary thing we’ve chosen”? In practice though, you want to compare things in the same theory, without the historical baggage of metric.
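A quick check of that last point, using standard reference values: compute the muon-to-electron mass ratio in two different unit systems, and the arbitrary part cancels out.

```python
# In kilograms:
m_mu_kg, m_e_kg = 1.883531627e-28, 9.1093837015e-31
# In "natural" units (rest energies in MeV):
m_mu_MeV, m_e_MeV = 105.6583755, 0.51099895

print(m_mu_kg / m_e_kg)    # about 206.77
print(m_mu_MeV / m_e_MeV)  # the same, about 206.77
```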
This problem may seem silly, and if we just cared about units it might be. But at the cutting-edge of physics there are still areas where the arbitrary shows up. Our choices of how to handle it, or how to avoid it, can be crucial to further progress.
I grew up in the US. I’ve roamed over the years, but each year I’ve managed to come back around this time. My folks throw the kind of Thanksgiving you see in movies, a table overflowing with turkey and nine kinds of pie.
This year, obviously, is different. No travel, no big party. Still, I wanted to capture some of the feeling here in my cozy Copenhagen apartment. My wife and I baked mini-pies instead, a little feast just for us two.
In these weird times, it’s good to have the occasional taste of normal, a dose of tradition to feel more at home. That doesn’t just apply to personal life, but to academic life as well.
One tradition among academics is the birthday conference. Often timed around a 60th birthday, birthday conferences are a way to celebrate the achievements of professors who have made major contributions to a field. There are talks by their students and close collaborators, filled with stories of the person being celebrated.
Last week was one such conference, in honor of one of the pioneers of my field, Dirk Kreimer. The conference was Zoom-based, and it was interesting to compare with the other Zoom conferences I’ve seen this year. One thing that impressed me was how they handled the “social side” of the conference. Instead of a Slack space like the other conferences, they used a platform called Gather. Gather gives people avatars on a 2D map, mocked up to look like an old-school RPG. Walk close to a group of people, and it lets you video chat with them. There are chairs and tables for private conversations, whiteboards to write on, and in this case even a birthday card to sign.
I didn’t get a chance to try Gather. My guess is it’s a bit worse than Slack for some kinds of discussion. Start a conversation in a Slack channel and people can tune in later from other time zones, each posting new insights and links to references. It’s a good way to hash out an idea.
But a birthday conference isn’t really about hashing out ideas. It’s about community and familiarity, celebrating people we care about. And for that purpose, Gather seems great. You want that little taste of normalcy, of walking across the room and seeing a familiar face, chatting with the folks you keep seeing year after year.
I’ve mused a bit about what it takes to do science when we can’t meet in person. Part of that is a question of efficiency: what does it take to get research done? But if we focus too much on that, we might forget the role of culture. Scientists are people, we form a community, and part of what we value is comfort and familiarity. Keeping that community alive means not just good research discussions, but traditions as well: ways of referencing things we’ve done before, carried forward into new circumstances. We will keep changing, our practices will keep evolving. But if we want those changes to stick, we should tie them to the past too. We should keep giving people those comforting tastes of normal.