Tag Archives: academia

Visiting the IAS

I’m at the Institute for Advanced Study, or IAS, this week.

There isn’t a conference going on, but if you looked at the visitor list you’d be forgiven for thinking there was. We have talks in my subfield almost every day this week, two professors from my subfield here on sabbatical, and extra visitors on top of that.

The IAS is a bit of an odd place. Partly, that’s due to its physical isolation: tucked away in the woods behind Princeton, a half-hour’s walk from the nearest restaurant, it’s supposed to be a place for contemplation away from the hustle and bustle of the world.

Since the last time I visited they’ve added a futuristic new building, seen here out of my office window. The building is most notable for one wild promise: someday, they will serve dinner there.

Mostly, though, the weirdness of the IAS is due to the kind of institution it is.

Within a given country, most universities are pretty similar. Each may emphasize different teaching styles, and the US has a distinction between public and private, but (neglecting scammy for-profit universities) there are some commonalities of structure, both in how they’re organized and in how they’re funded. Even between countries, different university systems have quite a bit of overlap.

The IAS, though, is not a university. It’s an independent institute. Neighboring Princeton supplies it with PhD students, but otherwise the IAS runs, and funds, itself.

There are a few other places like that around the world. The Perimeter Institute in Canada is also independent, and also borrows students from a neighboring university. CERN pools resources from several countries across Europe and beyond, Nordita from just the Nordic countries. Generalizing further, many countries have some sort of national labs or other nation-wide systems, from US Department of Energy labs like SLAC to Germany’s Max Planck Institutes.

And while universities share a lot in common, non-university institutes can be very different. Some are closely tied to a university, located inside university buildings, with members who hold university affiliations. Others sit at a greater remove, less linked to a university or not linked at all. Some have their own funding, investments or endowments or donations, while others are mostly funded by governments, or groups of governments. I’ve heard that the IAS gets about 10% of its budget from the government, while Perimeter gets its everyday operating expenses entirely from the Canadian government and uses donations for infrastructure and the like.

So ultimately, the IAS is weird because every organization like it is weird. There are a few templates, and systems, but by and large each independent research organization is different. Understanding one doesn’t necessarily help at understanding another.

Fields and Scale

I am a theoretical particle physicist, and every morning I check the arXiv.

arXiv.org is a type of website called a preprint server. It’s where we post papers before they are submitted to (and printed by) a journal. In practice, everything in our field shows up on arXiv, publicly accessible, before it appears anywhere else. There’s no peer review process on arXiv (the journals still handle that), but in our field peer review doesn’t often notice substantive errors. So in practice, we almost never read the journals: we just check arXiv.

And so every day, I check the arXiv. I go to the section for my sub-field, and I click on a link that lists all of the papers that were new that day. I skim the titles, and if I see an interesting paper I’ll read the abstract, and maybe download the full thing. Checking as I’m writing this, there were ten papers posted in my field, and another twenty “cross-lists”: papers posted in other fields but additionally classified in mine.
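(For the curious: that daily check is easy to script. Below is a rough sketch using arXiv’s public query API; the category “hep-th” is just an illustrative stand-in for whichever sub-field you follow, and the feedparser library is one convenient choice of tooling, not anything arXiv requires.)

```python
# Sketch of a daily "what's new" check via arXiv's public API
# (http://export.arxiv.org/api/query). The browsable equivalent is the
# "new" listing page, e.g. https://arxiv.org/list/hep-th/new.
import urllib.parse

import feedparser  # third-party: pip install feedparser

params = urllib.parse.urlencode({
    "search_query": "cat:hep-th",   # illustrative sub-field category
    "sortBy": "submittedDate",
    "sortOrder": "descending",
    "start": 0,
    "max_results": 25,
})
feed = feedparser.parse("http://export.arxiv.org/api/query?" + params)

for entry in feed.entries:
    # Skim the titles; follow entry.link for the abstract page.
    print(entry.published[:10], entry.title.replace("\n", " "))
```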

Other fields use arXiv: mathematicians and computer scientists and even economists use it in roughly the same way physicists do. For biology and medicine, though, there are different, newer sites: bioRxiv and medRxiv.

One thing you may notice is the different capitalization. When physicists write arXiv, the “X” is capitalized. In the logo, that X is drawn as a Greek letter chi, so the name reads “archive”. The biologists and medical researchers capitalize the R instead. The logo still has an X that looks like a chi, but positioned next to the R it looks like the Rx of medical prescriptions.

Something I noticed, but you might not, was the lack of a handy link to see new papers. You can search medRxiv and bioRxiv, and filter by date. But there’s no link that directly takes you to the newest papers. That suggests that biologists aren’t using bioRxiv like we use arXiv, and checking the new papers every day.

I was curious if this had to do with the scale of the field. I have the impression that physics and mathematics are smaller fields than biology, and that much less physics and mathematics research goes on than medical research. Certainly, theoretical particle physics is a small field. So I might have expected arXiv to be smaller than bioRxiv and medRxiv, and I certainly would expect fewer papers in my sub-field than papers in a medium-sized subfield of biology.

On the other hand, arXiv in my field is universal. In biology, bioRxiv and medRxiv are still quite controversial. More and more people are using them, but not every journal accepts papers posted to a preprint server. Many people still don’t use these services. So I might have expected bioRxiv and medRxiv to be smaller.

Checking now, neither answer is quite right. I looked between November 1 and November 2, and asked each site how many papers were uploaded between those dates. arXiv had the most, 604 papers. bioRxiv had roughly half that many, 348. medRxiv had 97.

arXiv represents multiple fields, while bioRxiv is “just” biology. Specializing, on that day arXiv had 235 physics papers, 135 mathematics papers, and 250 computer science papers. So each individual field had fewer papers than biology in this period.

Specializing even further, I can look at a subfield. My subfield, which is fairly small, had 20 papers between those dates. Cell biology, which I would expect to be quite a big subfield, had 33.

Overall, the numbers were weirdly comparable, with medRxiv unexpectedly small compared to both arXiv and bioRxiv. I’m not sure whether there are more biologists than physicists, but I’m pretty sure there should be more cell biologists than theoretical particle physicists. This suggests that many still aren’t using bioRxiv. It makes me wonder: will bioRxiv grow dramatically in the future? Are the people running it ready if it does?

No, PhD Students Are Not Just Cheap Labor

Here’s a back-of-the-envelope calculation:

In 2019, there were 83,050 unionized graduate students in the US. Let’s assume these are mostly PhD students, since other graduate students are not usually university employees. I can’t find an estimate of the total number of PhD students in the US, but in 2019, 55,614 of them graduated. In 2020, the average US doctorate took 7.5 years to complete. That implies that 83,050/(55,614 × 7.5) ≈ 20%, or about one-fifth, of PhD students in the US are part of a union.
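(If you want to check the arithmetic, here it is in code form, with the numbers from above.)

```python
# Back-of-the-envelope: what fraction of US PhD students are unionized?
unionized = 83_050               # unionized graduate students, 2019
phds_awarded_per_year = 55_614   # doctorates awarded, 2019
years_per_phd = 7.5              # average time to degree, 2020

phd_students = phds_awarded_per_year * years_per_phd   # rough steady-state headcount
print(unionized / phd_students)   # ~0.20, i.e. about one-fifth
```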

That makes PhD student unions common, but not the majority. It means they’re not unheard of and strange, but a typical university still isn’t unionized. It’s the sweet spot for controversy. It leads to a lot of dumb tweets.

I saw one such dumb tweet recently, from a professor arguing that PhD students shouldn’t unionize. The argument was that if PhD students were paid more, then professors would prefer to hire postdocs, researchers who already have a doctoral degree.

(I won’t link to the tweet, in part because this person is probably being harassed enough already.)

I don’t know how things work in this professor’s field. But the implication, that professors primarily take on PhD students because they’re cheaper, not only doesn’t match my experience: it also just doesn’t make very much sense.

Imagine a neighborhood where the children form a union. They decide to demand a higher allowance, and to persuade any new children in the neighborhood to follow their lead.

Now imagine a couple in that neighborhood, deciding whether to have a child. Do you think that they might look at the higher allowance the “children’s union” demands, and decide to hire an adult to do their chores instead?

Maybe there’s a price where they’d do that. If neighborhood children demanded thousands of dollars in allowance, maybe the young couple would decide that it’s too expensive to have a child. But a small shift is unlikely to change things very much: people have kids for many reasons, and those reasons don’t usually include cheap labor.

The reasons professors take on PhD students are similar to the reasons parents decide to have children. Some people have children because they want a legacy, something of theirs that survives to the next generation. For professors, PhD students are our legacy, our chance to raise someone on our ideas and see how they build on them. Some people have children because they love the act of child-raising: helping someone grow and learn about the world. The professors who take on students like taking on students: teaching is fun, after all.

That doesn’t mean there won’t be cases “on the margin”, where a professor finds they can’t afford a student they previously could. (And to be fair, the tweet I’m criticizing did use the word “marginal”.) But they would have to be in a very tight funding situation, with very little flexibility.

And even for situations like that, long-term, I’m not sure anything would change.

I did my PhD in the US. I was part of a union, and in part because of that (though mostly because I was in a physics department), I was paid relatively decently for a PhD student. Relatively decently is still not that great, though. This was the US, where universities still maintain the fiction that PhD students only work 20 hours a week and pay them proportionately, and where salaries in a university can change dramatically from student to postdoc to professor.

One thing I learned during my PhD is that despite our low-ish salaries, we cost our professors about as much as postdocs did. The reason why is tuition: PhD students don’t pay their own tuition, but that tuition still exists, and is paid by the professors who hire those students out of their grants. A PhD salary plus a PhD tuition ended up roughly equal to a postdoc salary.

Now, I’m working in a very different system. In a Danish university, wages are very flat. As a postdoc, a nice EU grant put me at almost the same salary as the professors. As a professor, my salary is pretty close to that of one of the better-paying schoolteacher jobs.

At the same time, tuition is much less relevant. Undergraduates don’t pay tuition at all, so PhD tuition isn’t based on theirs. Instead, it’s meant to cover costs of the PhD program as a whole.

I’ve filled out grants here in Denmark, so I know how much PhD students cost, and how much postdocs cost. And since the situation is so different, you might expect a difference here too.

There isn’t one. Hiring a PhD student, salary plus tuition, costs about as much as hiring a postdoc.

Two very different systems, with what seem to be very different rules, end up with the same equation. PhD students and postdocs cost about as much as each other, even when every assumption you’d think would affect the outcome turns out completely different.

This is why I expect that, even if PhD students get paid substantially more, they still won’t end up that out of whack with postdocs. There appears to be an iron law of academic administration keeping these two numbers in line, one that holds across nations and cultures and systems. The proportion of unionized PhD students in the US will keep working its way upwards, and I don’t expect it to have any effect on whether professors take on PhDs.

From Journal to Classroom

As part of the pedagogy course I’ve been taking, I’m doing a few guest lectures in various courses. I’ve got one coming up in a classical mechanics course (“intermediate”-level, so not Newton’s laws, but stuff the general public doesn’t know much about like Hamiltonians). They’ve been speeding through the core content, so I got to cover a “fun” topic, and after thinking back to my grad school days I chose a topic I think they’ll have a lot of fun with: Chaos theory.

Getting the obligatory Warhammer reference out of the way now

Chaos is one of those things everyone has a vague idea about. People have heard stories where a butterfly flaps its wings and causes a hurricane. Maybe they’ve heard of the rough concept, determinism with strong dependence on the initial conditions, so a tiny change (like that butterfly) can have huge consequences. Maybe they’ve seen pictures of fractals, and got the idea these are somehow related.
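(That “strong dependence” is easy to see in a toy example. The sketch below uses the logistic map, a standard textbook example of chaos rather than anything specific to this course: two starting points that differ by one part in a billion disagree completely after a few dozen steps.)

```python
# Sensitive dependence on initial conditions in the logistic map
# x -> r * x * (1 - x), with r = 4, deep in its chaotic regime.
r = 4.0
x, y = 0.2, 0.2 + 1e-9   # two initial conditions, one part in a billion apart

for step in range(1, 51):
    x = r * x * (1 - x)
    y = r * y * (1 - y)
    if step % 10 == 0:
        print(f"step {step:2d}:  x = {x:.6f}   y = {y:.6f}   |x - y| = {abs(x - y):.2e}")
```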

Its role in physics is a bit more detailed. It’s one of those concepts that “intermediate classical mechanics” is good for, one that can be much better understood once you’ve been introduced to some of the nineteenth century’s mathematical tools. It felt like a good way to show this class that the things they’ve learned aren’t just useful for dusty old problems, but for understanding something the public thinks is sexy and mysterious.

As luck would have it, the venerable textbook the students are using includes a (2000s-era) chapter on chaos. I read through it, and it struck me that it’s a very different chapter from most of the others. This hit me particularly when I noticed a section describing a famous early study of chaos, and I realized that all the illustrations were based on the actual original journal article.

I had surprisingly mixed feelings about this.

On the one hand, there’s a big fashion right now for something called research-based teaching. That doesn’t mean “using teaching methods that are justified by research” (though you’re supposed to do that too), but rather, “tying your teaching to current scientific research”. This is a fashion that makes sense, because learning about cutting-edge research in an undergraduate classroom feels pretty cool. It lets students feel more connected with the scientific community, it inspires them to get involved, and it gets them more used to what “real research” looks like.

On the other hand, structuring your textbook based on the original research papers feels kind of lazy. There’s a reason we don’t teach Newtonian mechanics the way Newton would have. Pedagogy is supposed to be something we improve at over time: we come up with better examples and better notation, more focused explanations that teach what we want students to learn. If we just summarize a paper, we’re not really providing “added value”: we should hope, at this point, that we can do better.

Thinking about this, I think the distinction boils down to why you’re teaching the material in the first place.

With a lot of research-based teaching, the goal is to show the students how to interact with current literature. You want to show them journal papers, not because the papers are the best way to teach a concept or skill, but because reading those papers is one of the skills you want to teach.

That makes sense for very current topics, but it seems a bit weird for the example I’ve been looking at, an early study of chaos from the 1960s. It’s great if students can read current papers, but they don’t necessarily need to read older ones. (At least, not yet.)

What then, is the textbook trying to teach? Here things get a bit messy. For a relatively old topic, you’d ideally want to teach not just a vague impression of what was discovered, but concrete skills. Here though, those skills are just a bit beyond the students’ reach: chaos is more approachable than you’d think, but still not 100% something the students can work with. Instead they’re learning to appreciate concepts. This can be quite valuable, but it doesn’t give the kind of structure that a concrete skill does. In particular, it makes it hard to know what to emphasize, beyond just summarizing the original article.

In this case, I’ve come up with my own way forward. There are actually concrete skills I’d like to teach. They’re skills that link up with what the textbook is teaching, skills grounded in the concepts it’s trying to convey, and that makes me think I can convey them. It will give some structure to the lesson, a focus on not merely what I’d like the students to think but what I’d like them to do.

I won’t go into too much detail: I suspect some of the students may be reading this, and I don’t want to spoil the surprise! But I’m looking forward to class, and to getting to try another pedagogical experiment.

Machine Learning, Occam’s Razor, and Fundamental Physics

There’s a saying in physics, attributed to the famous genius John von Neumann: “With four parameters I can fit an elephant, and with five I can make him wiggle his trunk.”

Say you want to model something, like some surprising data from a particle collider. You start with some free parameters: numbers in your model that aren’t decided yet. You then decide those numbers, “fixing” them based on the data you want to model. Your goal is for your model not only to match the data, but to predict something you haven’t yet measured. Then you can go out and check, and see if your model works.

The more free parameters you have in your model, the easier this can go wrong. More free parameters make it easier to fit your data, but that’s because they make it easier to fit any data. Your model ends up not just matching the physics, but matching the mistakes as well: the small errors that crop up in any experiment. A model like that may look like it’s a great fit to the data, but its predictions will almost all be wrong. It wasn’t just fit, it was overfit.
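(A toy version of this is easy to play with. The sketch below, assuming numpy, fits noisy data drawn from a smooth “law” with polynomials of increasing degree: the fit to the training points keeps improving, while the error on points you haven’t “measured” eventually blows up, which is the signature of overfitting.)

```python
import numpy as np

rng = np.random.default_rng(0)

# "Experiment": a smooth underlying law plus small measurement errors.
truth = lambda x: np.sin(2 * np.pi * x)
x_train = np.linspace(0, 1, 10)
y_train = truth(x_train) + 0.1 * rng.standard_normal(x_train.size)
x_test = np.linspace(0, 1, 200)   # stand-in for measurements we haven't made yet

for degree in [1, 3, 5, 7, 9]:
    coeffs = np.polyfit(x_train, y_train, degree)    # "fix" the free parameters
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - truth(x_test)) ** 2)
    print(f"degree {degree}: train error {train_err:.4f}, test error {test_err:.4f}")
```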

We have statistical tools that tell us when to worry about overfitting, when we should be impressed by a model and when it has too many parameters. We don’t actually use these tools correctly, but they still give us a hint of what we really want to know, namely, whether our model will make the right predictions. In a sense, these tools form the mathematical basis for Occam’s Razor, the idea that the best explanation is often the simplest one, and Occam’s Razor is a critical part of how we do science.

So, did you know machine learning was just modeling data?

All of the much-hyped recent advances in artificial intelligence, GPT and Stable Diffusion and all those folks, at heart they’re all doing this kind of thing. They start out with a model (with a lot more than five parameters, arranged in complicated layers…), then use data to fix the free parameters. Unlike most of the models physicists use, they can’t perfectly fix these numbers: there are too many of them, so they have to approximate. They then test their model on new data, and hope it still works.

Increasingly, it does, and impressively well, so well that the average person probably doesn’t realize this is what it’s doing. When you ask one of these AIs to make an image for you, what you’re doing is asking what image the model predicts would show up captioned with your text. It’s the same sort of thing as asking an economist what their model predicts the unemployment rate will be when inflation goes up. The machine learning model is just way, way more complicated.

As a physicist, the first time I heard about this, I had von Neumann’s quote in the back of my head. Yes, these machines are dealing with a lot more data, from a much more complicated reality. They literally are trying to fit elephants, even elephants wiggling their trunks. Still, the sheer number of parameters seemed fishy here. And for a little bit things seemed even more fishy, when I learned about double descent.

Suppose you start increasing the number of parameters in your model. Initially, your model gets better and better. Your predictions have less and less error, your error descends. Eventually, though, the error increases again: you have too many parameters so you’re over-fitting, and your model is capturing accidents in your data, not reality.

In machine learning, weirdly, this is often not the end of the story. Sometimes, your prediction error rises, only to fall once more, in a double descent.

For a while, I found this deeply disturbing. The idea that you can fit your data, start overfitting, and then keep overfitting, and somehow end up safe in the end, was terrifying. The way some of the popular accounts described it, like you were just overfitting more and more and that was fine, was baffling, especially when they seemed to predict that you could keep adding parameters, keep fitting tinier and tinier fleas on the elephant’s trunk, and your predictions would never start going wrong. It would be the death of Occam’s Razor as we know it, more complicated explanations beating simpler ones off to infinity.

Luckily, that’s not what happens. And after talking to a bunch of people, I think I finally understand this enough to say something about it here.

The right way to think about double descent is as overfitting prematurely. You do still expect your error to eventually go up: your model won’t be perfect forever, at some point you will really overfit. It might take a long time, though: machine learning people are trying to model very complicated things, like human behavior, with giant piles of data, so very complicated models may often be entirely appropriate. In the meantime, due to a bad choice of model, you can accidentally overfit early. You will eventually overcome this, pushing past with more parameters into a model that works again, but for a little while you might convince yourself, wrongly, that you have nothing more to learn.

(You can even mitigate this by tweaking your setup, potentially avoiding the problem altogether.)
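(For readers who want to poke at this themselves, here’s a minimal sketch of the kind of toy experiment where double descent shows up, assuming numpy: random features fit by minimum-norm least squares. Typically the test error gets worse as the number of parameters approaches the number of data points, then recovers as you keep adding more; how pronounced the bump is depends on the noise and on the choice of features.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a smooth target plus noise, with only a handful of training points.
n_train, noise = 20, 0.2
target = lambda x: np.sin(2 * np.pi * x)
x_train = rng.uniform(-1, 1, n_train)
y_train = target(x_train) + noise * rng.standard_normal(n_train)
x_test = np.linspace(-1, 1, 500)
y_test = target(x_test)

# A big pool of random ReLU features; using the first p gives a model with p parameters.
max_p = 1000
w = rng.standard_normal(max_p)
b = rng.standard_normal(max_p)
design = lambda x, p: np.maximum(0.0, np.outer(x, w[:p]) + b[:p])

for p in [2, 5, 10, 15, 18, 20, 22, 30, 50, 200, 1000]:
    # lstsq returns the minimum-norm fit once p exceeds the number of data points
    coef, *_ = np.linalg.lstsq(design(x_train, p), y_train, rcond=None)
    test_err = np.mean((design(x_test, p) @ coef - y_test) ** 2)
    print(f"{p:5d} parameters: test error {test_err:.3f}")
```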

So Occam’s Razor still holds, but with a twist. The best model is simple enough, but no simpler. And if you’re not careful enough, you can convince yourself that a too-simple model is as complicated as you can get.

Image from Astral Codex Ten

I was reminded of all this recently by some articles by Sabine Hossenfelder.

Hossenfelder is a critic of mainstream fundamental physics. The articles were her restating a point she’s made many times before, including in (at least) one of her books. She thinks the people who propose new particles and try to search for them are wasting time, and the experiments motivated by those particles are wasting money. She’s motivated by something like Occam’s Razor, the need to stick to the simplest possible model that fits the evidence. In her view, the simplest models are those in which we don’t detect any more new particles any time soon, so those are the models she thinks we should stick with.

I tend to disagree with Hossenfelder. Here, I was oddly conflicted. In some of her examples, it seemed like she had a legitimate point. Others seemed like she missed the mark entirely.

Talk to most astrophysicists, and they’ll tell you dark matter is settled science. Indeed, there is a huge amount of evidence that something exists out there in the universe that we can’t see. It distorts the way galaxies rotate, lenses light with its gravity, and wiggled the early universe in pretty much the way you’d expect matter to.

What isn’t settled is whether that “something” interacts with anything else. It has to interact with gravity, of course, but everything else is in some sense “optional”. Astroparticle physicists use satellites to search for clues that dark matter has some other interactions: perhaps it is unstable, sometimes releasing tiny signals of light. If it did, it might solve other problems as well.

Hossenfelder thinks this is bunk (in part because she thinks those other problems are bunk). I kind of do too, though perhaps for a more general reason: I don’t think nature owes us an easy explanation. Dark matter isn’t obligated to solve any of our other problems, it just has to be dark matter. That seems in some sense like the simplest explanation, the one demanded by Occam’s Razor.

At the same time, I disagree with her substantially more on collider physics. At the Large Hadron Collider so far, all of the data is reasonably compatible with the Standard Model, our roughly half-century old theory of particle physics. Collider physicists search that data for subtle deviations, one of which might point to a general discrepancy, a hint of something beyond the Standard Model.

While my intuitions say that the simplest dark matter is completely dark, they don’t say that the simplest particle physics is the Standard Model. Back when the Standard Model was proposed, people might have said it was exceptionally simple because it had a property called “renormalizability”, but these days we view that as less important. Physicists like Ken Wilson and Steven Weinberg taught us to view theories as a kind of series of corrections, like a Taylor series in calculus. Each correction encodes new, rarer ways that particles can interact. A renormalizable theory is just the first term in this series. The higher terms might be zero, but they might not. We even know that some terms cannot be zero, because gravity is not renormalizable.
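(Schematically, and in standard effective-field-theory conventions, the series looks like the formula below: the Standard Model is the first term, Λ is the energy scale where new physics kicks in, the O_i are the rarer interactions allowed by the symmetries, and the c_i are unknown coefficients measuring the size of each correction. A renormalizable theory is the statement that everything after the first term vanishes.)

\[
\mathcal{L}_{\text{eff}} \;=\; \mathcal{L}_{\text{SM}} \;+\; \sum_{d>4}\sum_i \frac{c_i^{(d)}}{\Lambda^{\,d-4}}\,\mathcal{O}_i^{(d)}
\]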

The two cases on the surface don’t seem that different. Dark matter might have zero interactions besides gravity, but it might have other interactions. The Standard Model might have zero corrections, but it might have nonzero corrections. But for some reason, my intuition treats the two differently: I would find it completely reasonable for dark matter to have no extra interactions, but very strange for the Standard Model to have no corrections.

I think part of where my intuition comes from here is my experience with other theories.

One example is a toy model called sine-Gordon theory. In sine-Gordon theory, this Taylor series of corrections is a very familiar Taylor series: the sine function! If you go correction by correction, you’ll see new interactions and more new interactions. But if you actually add them all up, something surprising happens. Sine-Gordon turns out to be a special theory, one with “no particle production”: unlike in normal particle physics, in sine-Gordon particles can neither be created nor destroyed. You would never know this if you did not add up all of the corrections.
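(In one common convention, the sine-Gordon Lagrangian packages that whole series into a cosine. Expanding it out gives an ordinary mass term plus an infinite tower of interactions, every one of them controlled by the single coupling β:)

\[
\mathcal{L} \;=\; \tfrac{1}{2}\,\partial_\mu\phi\,\partial^\mu\phi \;+\; \frac{m^2}{\beta^2}\bigl(\cos(\beta\phi)-1\bigr)
\;=\; \tfrac{1}{2}\,\partial_\mu\phi\,\partial^\mu\phi \;-\; \tfrac{1}{2}m^2\phi^2 \;+\; \frac{m^2\beta^2}{4!}\,\phi^4 \;-\; \frac{m^2\beta^4}{6!}\,\phi^6 \;+\; \cdots
\]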

String theory itself is another example. In string theory, elementary particles are replaced by strings, but you can think of that stringy behavior as a series of corrections on top of ordinary particles. Once again, you can try adding these things up correction by correction, but once again the “magic” doesn’t happen until the end. Only in the full series does string theory “do its thing”, and fix some of the big problems of quantum gravity.

If the real world really is a theory like this, then I think we have to worry about something like double descent.

Remember, double descent happens when our models can prematurely get worse before getting better. This can happen if the real thing we’re trying to model is very different from the model we’re using, like the example in this explainer that tries to use straight lines to match a curve. If we think a model is simpler because it puts fewer corrections on top of the Standard Model, then we may end up rejecting a reality with infinite corrections, a Taylor series that happens to add up to something quite nice. Occam’s Razor stops helping us if we can’t tell which models are really the simple ones.

The problem is that every notion of “simple” we can appeal to here is aesthetic, a choice based on what makes the math look nicer. Other sciences don’t have this problem. When a biologist or a chemist wants to look for the simplest model, they look for a model with fewer organisms, fewer reactions…in the end, fewer atoms and molecules, fewer of the building-blocks given to those fields by physics. Fundamental physics can’t do this: we build our theories up from mathematics, and mathematics only demands that we be consistent. We can call theories simpler because we can write them in a simple way (but we could write them in a different way too). Or we can call them simpler because they look more like toy models we’ve worked with before (but those toy models are just a tiny sample of all the theories that are possible). We don’t have a standard of simplicity that is actually reliable.

From the Wikipedia page for dark matter halos

There is one other way out of this pickle. A theory that is easier to write down is under no obligation to be true. But it is more likely to be useful. Even if the real world is ultimately described by some giant pile of mathematical parameters, if a simple theory is good enough for the engineers then it’s a better theory to aim for: a useful theory that makes peoples’ lives better.

I kind of get the feeling Hossenfelder would make this objection. I’ve seen her argue on twitter that scientists should always be able to say what their research is good for, and her Guardian article has this suggestive sentence: “However, we do not know that dark matter is indeed made of particles; and even if it is, to explain astrophysical observations one does not need to know details of the particles’ behaviour.”

Ok yes, to explain astrophysical observations one doesn’t need to know the details of dark matter particles’ behavior. But taking a step back, one doesn’t actually need to explain astrophysical observations at all.

Astrophysics and particle physics are not engineering problems. Nobody out there is trying to steer a spacecraft all the way across a galaxy, navigating the distribution of dark matter, or creating new universes and trying to make sure they go just right. Even if we do these things some day, it will be so far in the future that our attempts to understand them now won’t just be quaint: they will likely be actively damaging, like confusing old research in dead languages that the field would be better off ignoring so it can start from scratch.

Because of that, usefulness is also not a meaningful guide. It cannot tell you which theories are more simple, which to favor with Occam’s Razor.

Hossenfelder’s highest-profile recent work falls afoul of one or the other of her principles. Her work on the foundations of quantum mechanics could genuinely be useful, but there’s no reason aside from claims of philosophical beauty to expect it to be true. Her work on modeling dark matter is at least directly motivated by data, but is guaranteed to not be useful.

I’m not pointing this out to call Hossenfelder a hypocrite, as some sort of ad hominem or tu quoque. I’m pointing this out because I don’t think it’s possible to do fundamental physics today without falling afoul of these principles. If you want to hold out hope that your work is useful, you don’t have a great reason besides a love of pretty math: otherwise, anything useful would have been discovered long ago. If you just try to model existing data as best you can, then you’re making a model for events far away or locked in high-energy particle colliders, a model no-one else besides other physicists will ever use.

I don’t know the way through this. I think if you need to take Occam’s Razor seriously, to build on the same foundations that work in every other scientific field…then you should stop doing fundamental physics. You won’t be able to make it work. If you still need to do it, if you can’t give up the sub-field, then you should justify it on building capabilities, on the kind of “practice” Hossenfelder also dismisses in her Guardian piece.

We don’t have a solid foundation, a reliable notion of what is simple and what isn’t. We have guesses and personal opinions. And until some experiment uncovers some blinding flash of new useful meaningful magic…I don’t think we can do any better than that.

Amplitudes 2022 Retrospective

I’m back from Amplitudes 2022 with more time to write, and (besides the several papers I’m working on) that means writing about the conference! Casual readers be warned: there’s no way around this being a technical post, and I don’t have the space to explain everything!

I mostly said all I wanted about the way the conference was set up in last week’s post, but one thing I didn’t say much about was the conference dinner. Most conference dinners are the same aside from the occasional cool location or haggis speech. This one did have a cool location, and a cool performance by a blind pianist, but the thing I really wanted to comment on was the setup. Typically, the conference dinner at Amplitudes is a sit-down affair: people sit at tables in one big room, maybe getting up occasionally to pick up food, and eventually someone gives an after-dinner speech. This time the tables were standing tables, spread across several rooms. This was a bit tiring on a hot day, but it did have the advantage that it naturally mixed people around. Rather than mostly talking to “your table”, you’d wander, ending up at a new table every time you picked up new food or drinks. It was a good way to meet new people, a surprising number of whom, in my case, apparently read this blog. It did make it harder to do an after-dinner speech, so instead Lance gave an after-conference speech, complete with the now-well-established running joke where Greta Thunberg tries to get us to fly less.

(In another semi-running joke, the organizers tried to figure out who had attended the most of the yearly Amplitudes conferences over the years. Weirdly, no-one has attended all twelve.)

In terms of the content, and things that stood out:

Nima is getting close to publishing his newest ‘hedron, the surfacehedron, and correspondingly was able to give a lot more technical detail about it. (For his first and most famous amplituhedron, see here.) He still didn’t have enough time to explain why he has to use category theory to do it, but at least he was concrete enough that it was reasonably clear where the category theory was showing up. (I wasn’t there for his eight-hour lecture at the school the week before, maybe the students who stuck around until 2am learned some category theory there.) Just from listening in on side discussions, I got the impression that some of the ideas here actually may have near-term applications to computing Feynman diagrams: this hasn’t been a feature of previous ‘hedra and it’s an encouraging development.

Alex Edison talked about progress towards this blog’s namesake problem, the question of whether N=8 supergravity diverges at seven loops. Currently they’re working at six loops on the N=4 super Yang-Mills side, not yet in a form it can be “double-copied” to supergravity. The tools they’re using are increasingly sophisticated, including various slick tricks from algebraic geometry. They are looking to the future: if they’re hoping their methods will reach seven loops, the same methods have to make six loops a breeze.

Xi Yin approached a puzzle with methods from String Field Theory, prompting the heretical-for-us title “on-shell bad, off-shell good”. A colleague reminded me of a local tradition for dealing with heretics.

While Nima was talking about a new ’hedron, other talks focused on the original amplituhedron. Paul Heslop found that the amplituhedron is not literally a positive geometry, despite slogans to the contrary, but what it is is nonetheless an interesting generalization of the concept. Livia Ferro has made more progress on her group’s momentum amplituhedron: previously only valid at tree level, they now have a picture that can accommodate loops. I wasn’t sure this would be possible: there are a lot of things that work at tree level and not for loops, so I’m quite encouraged that this one made the leap successfully.

Sebastian Mizera, Andrew McLeod, and Hofie Hannesdottir all had talks that could be roughly summarized as “deep principles made surprisingly useful”. Each took topics that were explored in the 60’s and translated them into concrete techniques that could be applied to modern problems. There were surprisingly few talks on the completely concrete end, on direct applications to collider physics. I think Simone Zoia’s was the only one to actually feature collider data with error bars, which might explain why I singled him out to ask about those error bars later.

Likewise, Matthias Wilhelm’s talk was the only one on functions beyond polylogarithms, the elliptic functions I’ve also worked on recently. I wonder if the under-representation of some of these topics is due to the existence of independent conferences: in a year when in-person conferences are packed in after being postponed across the pandemic, when there are already dedicated conferences for elliptics and practical collider calculations, maybe people are just a bit too tired to go to Amplitudes as well.

Talks on gravitational waves seem to have stabilized at roughly a day’s worth, which seems reasonable. While the subfield’s capabilities continue to be impressive, it’s also interesting how often new conceptual challenges appear. It seems like every time a challenge to their results or methods is resolved, a new one shows up. I don’t know whether the field will ever get to a stage of “business as usual”, or whether it will be novel qualitative questions “all the way up”.

I haven’t said much about the variety of talks bounding EFTs and investigating their structure, though this continues to be an important topic. And I haven’t mentioned Lance Dixon’s talk on antipodal duality, largely because I’m planning a post on it later: Quanta Magazine had a good article on it, but there are some aspects even Quanta struggled to cover, and I think I might have a good way to do it.

Covering the Angles

One way to think of science is of a lot of interesting little problems. Some scientists are driven by questions like “how does this weird cell work?” or “how accurately can I predict the chance these particles collide?” If the puzzles are fun enough and the questions are interesting enough, then that can be enough motivation on its own.

Another perspective thinks of science as pursuit of a few big problems. Physicists want to write down the laws of nature, to know where the universe came from, to reconcile gravity and quantum mechanics. Biologists want to understand how life works and manipulate it, psychologists want the same for the human mind. For some scientists, these big questions are at the heart of why they do science. Someone in my field once joked he can’t get up in the morning without telling himself “spacetime is doomed”.

Even if you care about the big questions, though, you can’t neglect the small ones. That’s because modern science is collaborative. A big change, like a new particle or a whole new theory of physics, requires confirmation. It’s not enough for one person to propose it. The ideas that last in science last because they crop up in many different places, with many different methods. They last because we check all the angles, compulsively, looking for any direction that might be screwed up.

In those checks, any and all science can be useful. We need the big conceptual leaps from people like Einstein and the careful and systematic measurements of Brahe. We need people who look for the wackiest ideas, not just because they might be true, but to rule them out when they’re false, to make us all the more confident we’re on the right path. We need people pushing tried-and-true theories to the next leap of precision, to show that nothing is hiding in the gaps and make it clearer when something is. We need many people pushing many different paths: all are necessary, and any one might be crucial.

Often, one of these paths gets the lion’s share of the glory: the press, the Nobel, the mention in the history books. But the other paths still matter: we wouldn’t be confident in the science if they didn’t exist. Most working scientists will be on those other paths, as a matter of course. But we still need them to get science done.

The Conference Dilemma: Freshness vs. Breadth

Back in 2017, I noticed something that should have struck me as a little odd. My sub-field has a big yearly conference, called Amplitudes, that brings in everyone who works on our kind of research. Amplitudes 2017 was fun, but not “fresh”: most people talked about work they had already published. A smaller conference I went to that year, called QCD Meets Gravity, was much “fresher”: a lot of discussion of work in progress and work “hot off the presses”.

At the time, I chalked the difference up to timing: it was a few months later, and people happened to have projects that matured around then. But I realized recently there’s another reason, one that explains why you would expect bigger conferences to have less fresh content.

See, I’ve recently been on the other “side of the curtain”: I was an organizer for Amplitudes last year. And I noticed one big obstacle to having fresh content: the timeframe.

The bigger a conference is, the longer in advance you need to invite speakers. It’s a bigger task to organize everyone: to make sure travel and hotels and raw availability all work out, that everyone has time to prepare their talks, and that you have a nice full (but not too full) schedule. So when we started asking people, we didn’t know what the “freshest” work was going to be. We had recommendations from our scientific committee (a group of experts in the subfield whose job is to suggest speakers), but in practice the goal is more one of breadth than freshness: we needed to make sure that everybody in our community was represented.

A smaller conference can get around this. It can be organized a bit later, so the organizers have more information about new developments. It covers a smaller area, so the organizers have more information about new hot topics and unpublished results. And it typically invites most of the sub-community anyway, so you’re guaranteed to cover the hot new stuff just by raw completeness.

This doesn’t mean small conferences are “just better” or anything like that. Breadth is genuinely useful: a big conference covering a whole subfield is great for bringing a community together, getting everyone on a shared page and expanding their horizons. There’s a real tradeoff between those goals and getting a conference with the latest progress. It’s not a fixed tradeoff, we can improve both goals at once (I think at Amplitudes we as organizers could have been better at highlighting unpublished work), but we still have to make choices of what to emphasize.

Einstein-Years

Scott Aaronson recently published an interesting exchange on his blog Shtetl Optimized, between him and cognitive psychologist Steven Pinker. The conversation was about AI: Aaronson is optimistic (though not insanely so), while Pinker is pessimistic (again, not insanely so). While fun reading, the whole thing would normally be a bit too off-topic for this blog, except that Aaronson’s argument ended up invoking something I do know a bit about: how we make progress in theoretical physics.

Aaronson was trying to respond to an argument of Pinker’s, that super-intelligence is too vague and broad to be something we could expect an AI to have. Aaronson asks us to imagine an AI that is nothing more or less than a simulation of Einstein’s brain. Such a thing isn’t possible today, and might not even be efficient, but it has the advantage of being something concrete we can all imagine. Aaronson then suggests imagining that AI sped up a thousandfold, so that in one year it covers a thousand years of Einstein’s thought. Such an AI couldn’t solve every problem, of course. But in theoretical physics, surely such an AI could be safely described as super-intelligent: an amazing power that would change the shape of physics as we know it.

I’m not as sure of this as Aaronson is. We don’t have a machine that generates a thousand Einstein-years to test, but we do have one piece of evidence: the 76 Einstein-years the man actually lived.

Einstein is rightly famous as a genius in theoretical physics. His annus mirabilis resulted in five papers that revolutionized the field, and the next decade saw his theory of general relativity transform our understanding of space and time. Later, he explored what general relativity was capable of and framed challenges that deepened our understanding of quantum mechanics.

After that, though…not so much. For Einstein-decades, he tried to work towards a new unified theory of physics, and as far as I’m aware made no useful progress at all. I’ve never seen someone cite work from that period of Einstein’s life.

Aaronson mentions simulating Einstein “at his peak”, and it would be tempting to assume that the unified theory came “after his peak”, when age had weakened his mind. But while that kind of thing can sometimes be an issue for older scientists, I think it’s overstated. I don’t think careers peak early because of “youthful brains”, and with the exception of genuine dementia I don’t think older physicists are that much worse-off cognitively than younger ones. The reason so many prominent older physicists go down unproductive rabbit-holes isn’t because they’re old. It’s because genius isn’t universal.

Einstein made the progress he did because he was the right person to make that progress. He had the right background, the right temperament, and the right interests to pick up others’ mathematics and take it seriously as physics. As he aged, he built on what he found, and that background in turn enabled him to do more great things. But eventually, the path he walked down simply wasn’t useful anymore. His story ended with him driven to a theory that simply wasn’t going to work, because given his experience up to that point, that was the work that interested him most.

I think genius in physics is in general like that. It can feel very broad because a good genius picks up new tricks along the way, and grows their capabilities. But throughout, you can see the links: the tools mastered at one age that turn out to be just right for a new pattern. For the greatest geniuses in my field, you can see the “signatures” in their work, hints at why they were just the right genius for one problem or another. Give one a thousand years, and I suspect the well would eventually run dry: the state of knowledge would no longer be suitable for even their breadth.

…of course, none of that really matters for Aaronson’s point.

A century of Einstein-years wouldn’t have found the Standard Model or String Theory, but a century of physicist-years absolutely did. If instead of a simulation of Einstein, your AI was a simulation of a population of scientists, generating new geniuses as the years go by, then the argument works again. Sure, such an AI would be much more expensive, much more difficult to build, but the first one might have been as well. The point of the argument is simply to show such a thing is possible.

The core of Aaronson’s point rests on two key traits of technology. Technology is replicable: once we know how to build something, we can build more of it. Technology is scalable: if we know how to build something, we can try to build a bigger one with more resources. Evolution can tap into both of these, but not reliably: just because it’s possible to build a mind a thousand times better at some task doesn’t mean it will.

That is why the possibility of AI leads to the possibility of super-intelligence. If we can make a computer that can do something, we can make it do that something faster. That something doesn’t have to be “general”, you can have programs that excel at one task or another. For each such task, with more resources you can scale things up: so anything a machine can do now, a later machine can probably do better. Your starting-point doesn’t necessarily even have to be efficient, or a good algorithm: bad algorithms will take longer to scale, but could eventually get there too.

The only question at that point is “how fast?” I don’t have the impression that’s settled. The achievements that got Pinker and Aaronson talking, GPT-3 and DALL-E and so forth, impressed people by their speed, by how soon they got to capabilities we didn’t expect them to have. That doesn’t mean that something we might really call super-intelligence is close: that has to do with the details, with what your target is and how fast you can actually scale. And it certainly doesn’t mean that another approach might not be faster! (As a total outsider, I can’t help but wonder if current ML is in some sense trying to fit a cubic with straight lines.)

It does mean, though, that super-intelligence isn’t inconceivable, or incoherent. It’s just the recognition that technology is a master of brute force, and brute force eventually triumphs. If you want to think about what happens in that “eventually”, that’s a very important thing to keep in mind.

Proxies for Proxies

Why pay scientists?

Maybe you care about science itself. You think that exploring the world should be one of our central goals as human beings, that it “makes our country worth defending”.

Maybe you care about technology. You support science because, down the line, you think it will give us new capabilities that improve people’s lives. Maybe you expect this to happen directly, or maybe indirectly as “spinoff” inventions like the internet.

Maybe you just think science is cool. You want the stories that science tells: they entertain you, they give you a place in the world, they help distract from the mundane day to day grind.

Maybe you just think that the world ought to have scientists in it. You can think of it as a kind of bargain, maintaining expertise so that society can tackle difficult problems. Or you can be more cynical, paying early-career scientists on the assumption that most will leave academia and cheapen labor costs for tech companies.

Maybe you want to pay the scientists to teach, to be professors at universities. You notice that they don’t seem to be happy if you don’t let them research, so you throw a little research funding at them, as a treat.

Maybe you just want to grow your empire: your department, your university, the job numbers in your district.

In most jobs, you’re supposed to do what people pay you to do. As a scientist, the people who pay you have all of these motivations and more. You can’t simply choose to do what people pay you to do.

So you come up with a proxy. You sum up all of these ideas, into a vague picture of what all those people want. You have some idea of scientific quality: not just a matter of doing science correctly and carefully, but doing interesting science. It’s not something you ever articulate. It’s likely even contradictory; after all, the goals it approximates often are. Nonetheless, it’s your guide, and not just your guide: it’s the guide of those who hire you, those who choose if you get promoted or whether you get more funding. All of these people have some vague idea in their head of what makes good science, their own proxy for the desires of the vast mass of voters and decision-makers and funders.

But of course, the standard is still vague. Should good science be deep? Which topics are deeper than others? Should it be practical? Practical for whom? Should it be surprising? What do you expect to happen, and what would surprise you? Should it get the community excited? Which community?

As a practicing scientist, you have to build your own proxy for these proxies. The same work that could get you hired in one place might meet blank stares at another, and you can’t build your life around those unpredictable quirks. So you make your own vague idea of what you’re supposed to do, an alchemy of what excites you and what makes an impact and what your friends are doing. You build a stand-in in your head, on the expectation that no-one else will have quite the same stand-in, then go out and convince the other stand-ins to give money to your version. You stand on a shifting pile of unwritten rules, subtler even than some artists, because at the end of the day there’s never a real client to be seen. Just another proxy.