There isn’t a conference going on, but if you looked at the visitor list you’d be forgiven for thinking there was. We have talks in my subfield almost every day this week, two professors from my subfield here on sabbatical, and extra visitors on top of that.
The IAS is a bit of an odd place. Partly, that’s due to its physical isolation: tucked away in the woods behind Princeton, a half-hour’s walk from the nearest restaurant, it’s supposed to be a place for contemplation away from the hustle and bustle of the world.
Mostly, though, the weirdness of the IAS is due to the kind of institution it is.
Within a given country, most universities are pretty similar. Each may emphasize different teaching styles, and the US has a distinction between public and private, but (neglecting scammy for-profit universities) there are some commonalities of structure: both how they’re organized, and how they’re funded. Even between countries, different university systems have quite a bit of overlap.
The IAS, though, is not a university. It’s an independent institute. Neighboring Princeton supplies it with PhD students, but otherwise the IAS runs, and funds, itself.
There are a few other places like that around the world. The Perimeter Institute in Canada is also independent, and also borrows students from a neighboring university. CERN pools resources from several countries across Europe and beyond, Nordita from just the Nordic countries. Generalizing further, many countries have some sort of national labs or other nation-wide systems, from US Department of Energy labs like SLAC to Germany’s Max Planck Institutes.
And while universities share a lot in common, non-university institutes can be very different. Some are closely tied to a university, located inside university buildings with members with university affiliations. Others sit at a greater remove, less linked to a university or not linked at all. Some have their own funding, investments or endowments or donations, while others are mostly funded by governments, or groups of governments. I’ve heard that the IAS gets about 10% of its budget from the government, while Perimeter gets its everyday operating expenses entirely from the Canadian government and uses donations for infrastructure and the like.
So ultimately, the IAS is weird because every organization like it is weird. There are a few templates, and systems, but by and large each independent research organization is different. Understanding one doesn’t necessarily help with understanding another.
And so every day, I check the arXiv. I go to the section on my sub-field, and I click on a link that lists all of the papers that were new that day. I skim the titles, and if I see an interesting paper I’ll read the abstract, and maybe download the full thing. Checking as I’m writing this, there were ten papers posted in my field, and another twenty “cross-lists” were posted in other fields but additionally classified in mine.
Other fields use arXiv: mathematicians and computer scientists and even economists use it in roughly the same way physicists do. For biology and medicine, though, there are different, newer sites: bioRxiv and medRxiv.
One thing you may notice is the different capitalization. When physicists write arXiv, the “X” is capitalized. In the logo, it looks like a Greek letter chi, thus saying “archive”. The biologists and medical researchers capitalize the R instead. The logo still has an X that looks like a chi, but positioned with the R it looks like the Rx of medical prescriptions.
Something I noticed, but you might not, was the lack of a handy link to see new papers. You can search medRxiv and bioRxiv, and filter by date. But there’s no link that directly takes you to the newest papers. That suggests that biologists aren’t using bioRxiv the way we use arXiv, checking the new papers every day.
I was curious if this had to do with the scale of the field. I have the impression that physics and mathematics are smaller fields than biology, and that much less physics and mathematics research goes on than medical research. Certainly, theoretical particle physics is a small field. So I might have expected arXiv to be smaller than bioRxiv and medRxiv, and I certainly would expect fewer papers in my sub-field than papers in a medium-sized subfield of biology.
On the other hand, arXiv in my field is universal. In biology, bioRxiv and medRxiv are still quite controversial. More and more people are using them, but not every journal accepts papers posted to a preprint server. Many people still don’t use these services. So I might have expected bioRxiv and medRxiv to be smaller.
Checking now, neither answer is quite right. I looked between November 1 and November 2, and asked each site how many papers were uploaded between those dates. arXiv had the most, 604 papers. bioRxiv had roughly half that many, 348. medRxiv had 97.
arXiv represents multiple fields, bioRxiv is “just” biology. Specializing, on that day arXiv had 235 physics papers, 135 mathematics papers, and 250 computer science papers. So each individual field has fewer papers than biology in this period.
Specializing even further, I can look at a subfield. My subfield, which is fairly small, had 20 papers between those dates. Cell biology, which I would expect to be quite a big subfield, had 33.
Overall, the numbers were weirdly comparable, with medRxiv unexpectedly small compared to both arXiv and bioRxiv. I’m not sure whether there are more biologists than physicists, but I’m pretty sure there should be more cell biologists than theoretical particle physicists. This suggests that many still aren’t using bioRxiv. It makes me wonder: will bioRxiv grow dramatically in future? Are the people running it ready if it does?
That makes PhD student unions common, but not the majority. It means they’re not unheard of and strange, but a typical university still isn’t unionized. It’s the sweet spot for controversy. It leads to a lot of dumb tweets.
(I won’t link to the tweet, in part because this person is probably being harassed enough already.)
I don’t know how things work in this professor’s field. But the implication, that professors primarily take on PhD students because they’re cheaper, not only doesn’t match my experience: it also just doesn’t make very much sense.
Imagine a neighborhood where the children form a union. They decide to demand a higher allowance, and to persuade any new children in the neighborhood to follow their lead.
Now imagine a couple in that neighborhood, deciding whether to have a child. Do you think that they might look at the fees the “children’s union” charges, and decide to hire an adult to do their chores instead?
Maybe there’s a price where they’d do that. If neighborhood children demanded thousands of dollars in allowance, maybe the young couple would decide that it’s too expensive to have a child. But a small shift is unlikely to change things very much: people have kids for many reasons, and those reasons don’t usually include cheap labor.
The reasons professors take on PhD students are similar to the reasons parents decide to have children. Some people have children because they want a legacy, something of theirs that survives to the next generation. For professors, PhD students are our legacy, our chance to raise someone on our ideas and see how they build on them. Some people have children because they love the act of child-raising: helping someone grow and learn about the world. The professors who take on students like taking on students: teaching is fun, after all.
That doesn’t mean there won’t be cases “on the margin”, where a professor finds they can’t afford a student they previously could. (And to be fair to the tweet I’m criticizing, they did even use the word “marginal”.) But they would have to be in a very tight funding situation, with very little flexibility.
And even for situations like that, long-term, I’m not sure anything would change.
I did my PhD in the US. I was part of a union, and in part because of that (though mostly because I was in a physics department), I was paid relatively decently for a PhD student. Relatively decently is still not that great, though. This was the US, where universities still maintain the fiction that PhD students only work 20 hours a week and pay proportionate to that, and where salaries in a university can change dramatically from student to postdoc to professor.
One thing I learned during my PhD is that despite our low-ish salaries, we cost our professors about as much as postdocs did. The reason why is tuition: PhD students don’t pay their own tuition, but that tuition still exists, and is paid by the professors who hire those students out of their grants. A PhD salary plus a PhD tuition ended up roughly equal to a postdoc salary.
Now, I’m working in a very different system. In a Danish university, wages are very flat. As a postdoc, a nice EU grant put me at almost the same salary as the professors. As a professor, my salary is pretty close to that of one of the better-paying schoolteacher jobs.
At the same time, tuition is much less relevant. Undergraduates don’t pay tuition at all, so PhD tuition isn’t based on theirs. Instead, it’s meant to cover costs of the PhD program as a whole.
I’ve filled out grants here in Denmark, so I know how much PhD students cost, and how much postdocs cost. And since the situation is so different, you might expect a difference here too.
There isn’t one. Hiring a PhD student, salary plus tuition, costs about as much as hiring a postdoc.
Two very different systems, with what seem to be very different rules, end up with the same equation. PhD students and postdocs cost about as much as each other, even if every assumption that you think would affect the outcome turns out completely different.
This is why I expect that, even if PhD students get paid substantially more, they still won’t end up that out of whack with postdocs. There appears to be an iron law of academic administration keeping these two numbers in line, one that holds across nations and cultures and systems. The proportion of unionized PhD students in the US will keep working its way upwards, and I don’t expect it to have any effect on whether professors take on PhDs.
As part of the pedagogy course I’ve been taking, I’m doing a few guest lectures in various courses. I’ve got one coming up in a classical mechanics course (“intermediate”-level, so not Newton’s laws, but stuff the general public doesn’t know much about like Hamiltonians). They’ve been speeding through the core content, so I got to cover a “fun” topic, and after thinking back to my grad school days I chose a topic I think they’ll have a lot of fun with: Chaos theory.
Chaos is one of those things everyone has a vague idea about. People have heard stories where a butterfly flaps its wings and causes a hurricane. Maybe they’ve heard of the rough concept, determinism with strong dependence on the initial conditions, so a tiny change (like that butterfly) can have huge consequences. Maybe they’ve seen pictures of fractals, and got the idea these are somehow related.
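The “strong dependence on the initial conditions” part is easy to see numerically. Here is a minimal sketch using the logistic map, a standard textbook toy system that I’m assuming for illustration (it isn’t the system discussed in the course, and the starting value and the size of the nudge are arbitrary choices). Two starting points that differ by one part in ten billion are run through exactly the same deterministic rule, and the gap between them is tracked.

```python
def logistic_map(x, r=4.0):
    """One step of the logistic map x -> r*x*(1-x); the map is chaotic at r = 4."""
    return r * x * (1.0 - x)


def trajectory_gap(x0, delta, steps):
    """Iterate two nearby starting points and record |difference| after each step."""
    a, b = x0, x0 + delta
    gaps = []
    for _ in range(steps):
        a, b = logistic_map(a), logistic_map(b)
        gaps.append(abs(a - b))
    return gaps


# Illustrative values only: the "butterfly" is a nudge of 1e-10.
gaps = trajectory_gap(x0=0.2, delta=1e-10, steps=60)
for step in (5, 20, 40, 60):
    print(f"step {step:2d}: gap = {gaps[step - 1]:.3e}")
```

The rule never changes and has no randomness in it, yet the gap grows from invisible to as large as the values themselves within a few dozen steps.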
Its role in physics is a bit more detailed. It’s one of those concepts that “intermediate classical mechanics” is good for, one that can be much better understood once you’ve been introduced to some of the nineteenth century’s mathematical tools. It felt like a good way to show this class that the things they’ve learned aren’t just useful for dusty old problems, but for understanding something the public thinks is sexy and mysterious.
On the one hand, there’s a big fashion right now for something called research-based teaching. That doesn’t mean “using teaching methods that are justified by research” (though you’re supposed to do that too), but rather, “tying your teaching to current scientific research”. This is a fashion that makes sense, because learning about cutting-edge research in an undergraduate classroom feels pretty cool. It lets students feel more connected with the scientific community, it inspires them to get involved, and it gets them more used to what “real research” looks like.
On the other hand, structuring your textbook based on the original research papers feels kind of lazy. There’s a reason we don’t teach Newtonian mechanics the way Newton would have. Pedagogy is supposed to be something we improve at over time: we come up with better examples and better notation, more focused explanations that teach what we want students to learn. If we just summarize a paper, we’re not really providing “added value”: we should hope, at this point, that we can do better.
Thinking about this, I think the distinction boils down to why you’re teaching the material in the first place.
With a lot of research-based teaching, the goal is to show the students how to interact with current literature. You want to show them journal papers, not because the papers are the best way to teach a concept or skill, but because reading those papers is one of the skills you want to teach.
That makes sense for very current topics, but it seems a bit weird for the example I’ve been looking at, an early study of chaos from the 60’s. It’s great if students can read current papers, but they don’t necessarily need to read older ones. (At least, not yet.)
What then, is the textbook trying to teach? Here things get a bit messy. For a relatively old topic, you’d ideally want to teach not just a vague impression of what was discovered, but concrete skills. Here though, those skills are just a bit beyond the students’ reach: chaos is more approachable than you’d think, but still not 100% something the students can work with. Instead they’re learning to appreciate concepts. This can be quite valuable, but it doesn’t give the kind of structure that a concrete skill does. In particular, it makes it hard to know what to emphasize, beyond just summarizing the original article.
In this case, I’ve come up with my own way forward. There are actually concrete skills I’d like to teach. They’re skills that link up with what the textbook is teaching, skills grounded in the concepts it’s trying to convey, and that makes me think I can convey them. It will give some structure to the lesson, a focus on not merely what I’d like the students to think but what I’d like them to do.
I won’t go into too much detail: I suspect some of the students may be reading this, and I don’t want to spoil the surprise! But I’m looking forward to class, and to getting to try another pedagogical experiment.
There’s a saying in physics, attributed to the famous genius John von Neumann: “With four parameters I can fit an elephant, and with five I can make him wiggle his trunk.”
Say you want to model something, like some surprising data from a particle collider. You start with some free parameters: numbers in your model that aren’t decided yet. You then decide those numbers, “fixing” them based on the data you want to model. Your goal is for your model not only to match the data, but to predict something you haven’t yet measured. Then you can go out and check, and see if your model works.
The more free parameters you have in your model, the easier this can go wrong. More free parameters make it easier to fit your data, but that’s because they make it easier to fit any data. Your model ends up not just matching the physics, but matching the mistakes as well: the small errors that crop up in any experiment. A model like that may look like it’s a great fit to the data, but its predictions will almost all be wrong. It wasn’t just fit, it was overfit.
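To make that concrete, here is a small sketch under made-up assumptions: the “real physics” is a straight line, the ten measurements carry a hypothetical error of ±0.1 that alternates in sign (a deliberately unkind stand-in for experimental noise), and the two models are a two-parameter line and a ten-parameter polynomial forced through every point. None of these choices come from a real experiment.

```python
def fit_line(xs, ys):
    """Least-squares straight line: two free parameters (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def fit_interpolant(xs, ys):
    """Lagrange polynomial through every point: one free parameter per data point."""
    def model(x):
        total = 0.0
        for j, (xj, yj) in enumerate(zip(xs, ys)):
            weight = 1.0
            for m, xm in enumerate(xs):
                if m != j:
                    weight *= (x - xm) / (xj - xm)
            total += yj * weight
        return total
    return model


def mse(model, xs, ys):
    """Mean squared error of a model on a set of points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def truth(x):
    return 2.0 * x + 1.0  # hypothetical "real physics"


# Ten measurements, each off by a made-up +/-0.1 error.
train_x = [float(i) for i in range(10)]
train_y = [truth(x) + 0.1 * (-1) ** i for i, x in enumerate(train_x)]
# Unmeasured points halfway between the measurements, compared to the truth.
test_x = [x + 0.5 for x in train_x[:-1]]
test_y = [truth(x) for x in test_x]

line = fit_line(train_x, train_y)
interp = fit_interpolant(train_x, train_y)
for name, model in (("line (2 parameters)", line), ("polynomial (10 parameters)", interp)):
    print(f"{name}: fit error = {mse(model, train_x, train_y):.4f}, "
          f"prediction error = {mse(model, test_x, test_y):.4f}")
```

The ten-parameter model matches the measurements perfectly, errors included, and pays for it at the points it was never shown. The line misses every measurement by a little and predicts far better.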
So, did you know machine learning was just modeling data?
All of the much-hyped recent advances in artificial intelligence, GPT and Stable Diffusion and all those folks, at heart they’re all doing this kind of thing. They start out with a model (with a lot more than five parameters, arranged in complicated layers…), then use data to fix the free parameters. Unlike most of the models physicists use, they can’t perfectly fix these numbers: there are too many of them, so they have to approximate. They then test their model on new data, and hope it still works.
Increasingly, it does, and impressively well, so well that the average person probably doesn’t realize this is what it’s doing. When you ask one of these AIs to make an image for you, what you’re doing is asking what image the model predicts would show up captioned with your text. It’s the same sort of thing as asking an economist what their model predicts the unemployment rate will be when inflation goes up. The machine learning model is just way, way more complicated.
As a physicist, the first time I heard about this, I had von Neumann’s quote in the back of my head. Yes, these machines are dealing with a lot more data, from a much more complicated reality. They literally are trying to fit elephants, even elephants wiggling their trunks. Still, the sheer number of parameters seemed fishy here. And for a little bit things seemed even more fishy, when I learned about double descent.
Suppose you start increasing the number of parameters in your model. Initially, your model gets better and better. Your predictions have less and less error, your error descends. Eventually, though, the error increases again: you have too many parameters so you’re over-fitting, and your model is capturing accidents in your data, not reality.
In machine learning, weirdly, this is often not the end of the story. Sometimes, your prediction error rises, only to fall once more, in a double descent.
For a while, I found this deeply disturbing. The idea that you can fit your data, start overfitting, and then keep overfitting, and somehow end up safe in the end, was terrifying. The way some of the popular accounts described it, like you were just overfitting more and more and that was fine, was baffling, especially when they seemed to predict that you could keep adding parameters, keep fitting tinier and tinier fleas on the elephant’s trunk, and your predictions would never start going wrong. It would be the death of Occam’s Razor as we know it, more complicated explanations beating simpler ones off to infinity.
Luckily, that’s not what happens. And after talking to a bunch of people, I think I finally understand this enough to say something about it here.
The right way to think about double descent is as overfitting prematurely. You do still expect your error to eventually go up: your model won’t be perfect forever, at some point you will really overfit. It might take a long time, though: machine learning people are trying to model very complicated things, like human behavior, with giant piles of data, so very complicated models may often be entirely appropriate. In the meantime, due to a bad choice of model, you can accidentally overfit early. You will eventually overcome this, pushing past with more parameters into a model that works again, but for a little while you might convince yourself, wrongly, that you have nothing more to learn.
So Occam’s Razor still holds, but with a twist. The best model is simple enough, but no simpler. And if you’re not careful enough, you can convince yourself that a too-simple model is as complicated as you can get.
I was reminded of all this recently by some articles by Sabine Hossenfelder.
Hossenfelder is a critic of mainstream fundamental physics. The articles were her restating a point she’s made many times before, including in (at least) one of her books. She thinks the people who propose new particles and try to search for them are wasting time, and the experiments motivated by those particles are wasting money. She’s motivated by something like Occam’s Razor, the need to stick to the simplest possible model that fits the evidence. In her view, the simplest models are those in which we don’t detect any more new particles any time soon, so those are the models she thinks we should stick with.
I tend to disagree with Hossenfelder. Here, I was oddly conflicted. In some of her examples, it seemed like she had a legitimate point. Others seemed like she missed the mark entirely.
Talk to most astrophysicists, and they’ll tell you dark matter is settled science. Indeed, there is a huge amount of evidence that something exists out there in the universe that we can’t see. It distorts the way galaxies rotate, lenses light with its gravity, and wiggled the early universe in pretty much the way you’d expect matter to.
What isn’t settled is whether that “something” interacts with anything else. It has to interact with gravity, of course, but everything else is in some sense “optional”. Astroparticle physicists use satellites to search for clues that dark matter has some other interactions: perhaps it is unstable, sometimes releasing tiny signals of light. If it did, it might solve other problems as well.
Hossenfelder thinks this is bunk (in part because she thinks those other problems are bunk). I kind of do too, though perhaps for a more general reason: I don’t think nature owes us an easy explanation. Dark matter isn’t obligated to solve any of our other problems, it just has to be dark matter. That seems in some sense like the simplest explanation, the one demanded by Occam’s Razor.
At the same time, I disagree with her substantially more on collider physics. At the Large Hadron Collider so far, all of the data is reasonably compatible with the Standard Model, our roughly half-century old theory of particle physics. Collider physicists search that data for subtle deviations, one of which might point to a general discrepancy, a hint of something beyond the Standard Model.
While my intuitions say that the simplest dark matter is completely dark, they don’t say that the simplest particle physics is the Standard Model. Back when the Standard Model was proposed, people might have said it was exceptionally simple because it had a property called “renormalizability”, but these days we view that as less important. Physicists like Ken Wilson and Steven Weinberg taught us to view theories as a kind of series of corrections, like a Taylor series in calculus. Each correction encodes new, rarer ways that particles can interact. A renormalizable theory is just the first term in this series. The higher terms might be zero, but they might not. We even know that some terms cannot be zero, because gravity is not renormalizable.
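For readers who like to see the series written out, a common way to express it is sketched below. The notation is standard effective-field-theory shorthand rather than anything from this post: the scale $\Lambda$, the coefficients $c_i^{(d)}$, and the operators $\mathcal{O}_i^{(d)}$ are all symbols introduced here for illustration.

```latex
\mathcal{L}_{\text{eff}}
  = \mathcal{L}_{\text{renormalizable}}
  + \sum_{d > 4} \sum_i \frac{c_i^{(d)}}{\Lambda^{d-4}} \, \mathcal{O}_i^{(d)}
```

Here each $\mathcal{O}_i^{(d)}$ is an interaction of mass dimension $d$, and $\Lambda$ is the energy scale at which new physics would show up. At energies $E$ well below $\Lambda$, a dimension-$d$ term is suppressed by roughly $(E/\Lambda)^{d-4}$, which is the sense in which the higher corrections describe rarer interactions. The Standard Model with no corrections is the case where every $c_i^{(d)}$ vanishes.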
The two cases on the surface don’t seem that different. Dark matter might have zero interactions besides gravity, but it might have other interactions. The Standard Model might have zero corrections, but it might have nonzero corrections. But for some reason, my intuition treats the two differently: I would find it completely reasonable for dark matter to have no extra interactions, but very strange for the Standard Model to have no corrections.
I think part of where my intuition comes from here is my experience with other theories.
One example is a toy model called sine-Gordon theory. In sine-Gordon theory, this Taylor series of corrections is a very familiar Taylor series: the sine function! If you go correction by correction, you’ll see new interactions and more new interactions. But if you actually add them all up, something surprising happens. Sine-Gordon turns out to be a special theory, one with “no particle production”: unlike in normal particle physics, in sine-Gordon particles can neither be created nor destroyed. You would never know this if you did not add up all of the corrections.
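Written out, in one common convention (normalizations vary between references, and the mass $m$ and coupling $\beta$ are symbols introduced here), the sine-Gordon Lagrangian and the expansion of its potential look like this:

```latex
\mathcal{L} = \frac{1}{2} \partial_\mu \phi \, \partial^\mu \phi
  - \frac{m^2}{\beta^2} \bigl( 1 - \cos \beta\phi \bigr),
\qquad
\frac{m^2}{\beta^2} \bigl( 1 - \cos \beta\phi \bigr)
  = \frac{m^2}{2} \phi^2
  - \frac{m^2 \beta^2}{4!} \phi^4
  + \frac{m^2 \beta^4}{6!} \phi^6
  - \cdots
```

The sine itself shows up in the equation of motion, $\partial_\mu \partial^\mu \phi + \frac{m^2}{\beta} \sin \beta\phi = 0$. Truncate the expansion after the $\phi^4$ term and you have an ordinary interacting theory. The $\phi^6$, $\phi^8$, and higher terms look like ever smaller corrections, and nothing in any finite truncation hints at the special behavior of the full sum.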
String theory itself is another example. In string theory, elementary particles are replaced by strings, but you can think of that stringy behavior as a series of corrections on top of ordinary particles. Once again, you can try adding these things up correction by correction, but once again the “magic” doesn’t happen until the end. Only in the full series does string theory “do its thing”, and fix some of the big problems of quantum gravity.
If the real world really is a theory like this, then I think we have to worry about something like double descent.
Remember, double descent happens when our models can prematurely get worse before getting better. This can happen if the real thing we’re trying to model is very different from the model we’re using, like the example in this explainer that tries to use straight lines to match a curve. If we think a model is simpler because it puts fewer corrections on top of the Standard Model, then we may end up rejecting a reality with infinite corrections, a Taylor series that happens to add up to something quite nice. Occam’s Razor stops helping us if we can’t tell which models are really the simple ones.
The problem here is that every notion of “simple” we can appeal to here is aesthetic, a choice based on what makes the math look nicer. Other sciences don’t have this problem. When a biologist or a chemist wants to look for the simplest model, they look for a model with fewer organisms, fewer reactions…in the end, fewer atoms and molecules, fewer of the building-blocks given to those fields by physics. Fundamental physics can’t do this: we build our theories up from mathematics, and mathematics only demands that we be consistent. We can call theories simpler because we can write them in a simple way (but we could write them in a different way too). Or we can call them simpler because they look more like toy models we’ve worked with before (but those toy models are just a tiny sample of all the theories that are possible). We don’t have a standard of simplicity that is actually reliable.
There is one other way out of this pickle. A theory that is easier to write down is under no obligation to be true. But it is more likely to be useful. Even if the real world is ultimately described by some giant pile of mathematical parameters, if a simple theory is good enough for the engineers then it’s a better theory to aim for: a useful theory that makes people’s lives better.
I kind of get the feeling Hossenfelder would make this objection. I’ve seen her argue on twitter that scientists should always be able to say what their research is good for, and her Guardian article has this suggestive sentence: “However, we do not know that dark matter is indeed made of particles; and even if it is, to explain astrophysical observations one does not need to know details of the particles’ behaviour.”
Ok yes, to explain astrophysical observations one doesn’t need to know the details of dark matter particles’ behavior. But taking a step back, one doesn’t actually need to explain astrophysical observations at all.
Astrophysics and particle physics are not engineering problems. Nobody out there is trying to steer a spacecraft all the way across a galaxy, navigating the distribution of dark matter, or creating new universes and trying to make sure they go just right. Even if we might do these things some day, it will be so far in the future that our attempts to understand them won’t just be quaint: they will likely be actively damaging, confusing old research in dead languages that the field will be better off ignoring to start from scratch.
Because of that, usefulness is also not a meaningful guide. It cannot tell you which theories are more simple, which to favor with Occam’s Razor.
Hossenfelder’s highest-profile recent work falls afoul of one or the other of her principles. Her work on the foundations of quantum mechanics could genuinely be useful, but there’s no reason aside from claims of philosophical beauty to expect it to be true. Her work on modeling dark matter is at least directly motivated by data, but is guaranteed to not be useful.
I’m not pointing this out to call Hossenfelder a hypocrite, as some sort of ad hominem or tu quoque. I’m pointing this out because I don’t think it’s possible to do fundamental physics today without falling afoul of these principles. If you want to hold out hope that your work is useful, you don’t have a great reason besides a love of pretty math: otherwise, anything useful would have been discovered long ago. If you just try to model existing data as best you can, then you’re making a model for events far away or locked in high-energy particle colliders, a model no-one else besides other physicists will ever use.
I don’t know the way through this. I think if you need to take Occam’s Razor seriously, to build on the same foundations that work in every other scientific field…then you should stop doing fundamental physics. You won’t be able to make it work. If you still need to do it, if you can’t give up the sub-field, then you should justify it on building capabilities, on the kind of “practice” Hossenfelder also dismisses in her Guardian piece.
We don’t have a solid foundation, a reliable notion of what is simple and what isn’t. We have guesses and personal opinions. And until some experiment uncovers some blinding flash of new useful meaningful magic…I don’t think we can do any better than that.
You can think of a quantum particle like a coin frozen in mid-air. Once measured, the coin falls, and you read it as heads or tails, but before then the coin is neither, with equal chance to be one or the other. In this metaphor, quantum entanglement slices the coin in half. Slice a coin in half on a table, and its halves will either both show heads, or both tails. Slice our “frozen coin” in mid-air, and it keeps this property: the halves, both still “frozen”, can later be measured as both heads, or both tails. Even if you separate them, the outcomes never become independent: you will never find one half-coin to land on tails, and the other on heads.
Einstein thought that this couldn’t be the whole story. He was bothered by the way that measuring a “frozen” coin seems to change its behavior faster than light, screwing up his theory of special relativity. Entanglement, with its ability to separate halves of a coin as far as you liked, just made the problem worse. He thought that there must be a deeper theory, one with “hidden variables” that determined whether the halves would be heads or tails before they were separated.
Bell’s inequalities were just theory, though, until this year’s Nobelists arrived to test them. Clauser was first: in the 70’s, he proposed a variant of Bell’s inequalities, then tested them by measuring members of a pair of entangled photons in two different places. He found complete agreement with quantum mechanics.
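To give a feel for what such a test compares, here is a minimal sketch of one standard variant, usually called the CHSH inequality. I’m assuming it for illustration; the text above doesn’t say exactly which form the experiments used. Any theory with local hidden variables requires a particular combination S of correlations to satisfy |S| ≤ 2. The sketch plugs in the quantum prediction for a pair of photons entangled so that their polarizations always agree, E = cos 2(a − b), with a conventional choice of polarizer angles.

```python
import math


def correlation(angle_a, angle_b):
    """Quantum prediction for the correlation between two polarization
    measurements (angles in radians) on photons whose polarizations
    always agree when measured along the same axis."""
    return math.cos(2.0 * (angle_a - angle_b))


def chsh(a, a_prime, b, b_prime, corr=correlation):
    """CHSH combination S = E(a,b) - E(a,b') + E(a',b) + E(a',b')."""
    return corr(a, b) - corr(a, b_prime) + corr(a_prime, b) + corr(a_prime, b_prime)


deg = math.pi / 180.0
# Conventional angle choice that maximizes the quantum value of S.
s_quantum = chsh(0 * deg, 45 * deg, 22.5 * deg, 67.5 * deg)
print(f"quantum S = {s_quantum:.4f}; local hidden variables require |S| <= 2")
```

With these angles the quantum prediction works out to 2√2, comfortably above 2, so a careful enough measurement of S can tell the two pictures apart.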
Still, there was a loophole left for Einstein’s idea. If the settings on the two measurement devices could influence the pair of photons when they were first entangled, that would allow hidden variables to influence the outcome in a way that avoided Bell and Clauser’s calculations. It was Aspect, in the 80’s, who closed this loophole: by doing experiments fast enough to change the measurement settings after the photons were entangled, he could show that the settings could not possibly influence the forming of the entangled pair.
Aspect’s experiments, in many minds, were the end of the story. They were the ones emphasized in the textbooks when I studied quantum mechanics in school.
The remaining loopholes are trickier. Some hope for a way to correlate the behavior of particles and measurement devices that doesn’t run afoul of Aspect’s experiment. This idea, called superdeterminism, has recently had a few passionate advocates, but speaking personally I’m still confused as to how it’s supposed to work. Others want to jettison special relativity altogether. This would not only involve measurements influencing each other faster than light, but also would break a kind of symmetry present in the experiments, because it would declare one measurement or the other to have happened “first”, something special relativity forbids. The majority, uncomfortable with either approach, thinks that quantum mechanics is complete, with no deterministic theory that can replace it. They differ only on how to describe, or interpret, the theory, a debate more the domain of careful philosophy than of physics.
After all of these philosophical debates over the nature of reality, you may ask: what can quantum entanglement do for you?
I’ve done a lot of work with what we like to call “bootstrap” methods. Instead of doing a particle physics calculation in all its gory detail, we start with a plausible guess and impose requirements based on what we know. Eventually, we have the right answer pulled up “by its own bootstraps”: the only answer the calculation could have, without actually doing the calculation.
This method works very well, but so far it’s only been applied to certain kinds of calculations, involving mathematical functions called polylogarithms. More complicated calculations involve a mathematical object called an elliptic curve, and until very recently it wasn’t clear how to bootstrap them. To get people thinking about it, my colleagues Hjalte Frellesvig and Andrew McLeod asked the Carlsberg Foundation (yes, that Carlsberg) to fund a mini-conference. The idea was to get elliptic people and bootstrap people together (along with Hjalte’s tribe, intersection theory people) to hash things out. “Jumpstart people” are not a thing in physics, so despite the title they were not invited.
Having the conference so soon after the yearly Elliptics meeting had some strange consequences. There was only one actual duplicate talk, but the first day of talks all felt like they would have been welcome additions to the earlier conference. Some might be functioning as “overflow”: Elliptics this year focused on discussion and so didn’t have many slots for talks, while this conference despite its discussion-focused goal had a more packed schedule. In other cases, people might have been persuaded by the more relaxed atmosphere and lack of recording or posted slides to give more speculative talks. Oliver Schlotterer’s talk was likely in this category, a discussion of the genus-two functions one step beyond elliptics that I think people at the previous conference would have found very exciting, but which involved work in progress that I could understand him being cautious about presenting.
The other days focused more on the bootstrap side, with progress on some surprising but not-quite-yet elliptic avenues. It was great to hear that Mark Spradlin is making new progress on his Ziggurat story, to hear James Drummond suggest a picture for cluster algebras that could generalize to other theories, and to get some idea of the mysterious ongoing story that animates my colleague Cristian Vergu.
There was one thing the organizers couldn’t have anticipated that ended up throwing the conference into a new light. The goal of the conference was to get people started bootstrapping elliptic functions, but in the meantime people have gotten started on their own. Roger Morales Espasa presented his work on this with several of my other colleagues. They can already reproduce a known result, the ten-particle elliptic double-box, and are well on track to deriving something genuinely new, the twelve-particle version. It’s exciting, but it definitely makes the rest of us look around and take stock. Hopefully for the better!
I had a paper two weeks ago with a Master’s student, Alex Chaparro Pozo. I haven’t gotten a chance to talk about it yet, so I thought I should say a few words this week. It’s another entry in what I’ve been calling my cabinet of curiosities, interesting mathematical “objects” I’m sharing with the world.
I calculate scattering amplitudes, formulas that give the probability that particles scatter off each other in particular ways. While in principle I could do this with any particle physics theory, I have a favorite: a “toy model” called N=4 super Yang-Mills. N=4 super Yang-Mills doesn’t describe reality, but it lets us figure out cool new calculation tricks, and these often end up useful in reality as well.
Many scattering amplitudes in N=4 super Yang-Mills involve a type of mathematical function called polylogarithms. These functions are especially easy to work with, but they aren’t the whole story. Once we start considering more complicated situations (what if two particles collide, and eight particles come out?) we need more complicated functions, called elliptic polylogarithms.
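To give a feel for why ordinary polylogarithms count as "easy", here is a minimal sketch of the simplest kind, the classical polylogarithm, written as its power series: the sum of z^k / k^n over k = 1, 2, 3, and so on. The function name and the number of terms are my own choices for illustration, and the series only converges for |z| < 1. The polylogarithms that appear in amplitudes are generally multi-variable generalizations of this, and the elliptic ones are a further step beyond.

```python
import math


def polylog(n, z, terms=200):
    """Classical polylogarithm Li_n(z) from its power series.

    Illustrative only: valid for |z| < 1, where the series converges.
    """
    if abs(z) >= 1:
        raise ValueError("this series sketch only handles |z| < 1")
    return sum(z**k / k**n for k in range(1, terms + 1))


# Li_1 is just an ordinary logarithm in disguise: Li_1(z) = -ln(1 - z).
li1 = polylog(1, 0.5)
print(li1, -math.log(1 - 0.5))

# Li_2 (the "dilogarithm") has a known closed form at z = 1/2:
# Li_2(1/2) = pi^2 / 12 - (ln 2)^2 / 2.
li2 = polylog(2, 0.5)
print(li2, math.pi**2 / 12 - math.log(2) ** 2 / 2)
```

Each pair of printed numbers should agree to many decimal places. Checks like these, where a function can be compared against known special values and identities, are part of what makes this family of functions pleasant to work with.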
The original calculation was pretty complicated. Two particles colliding, eight particles coming out, meant that in total we had to keep track of ten different particles. That gets messy fast. I’m pretty good at dealing with six particles, not ten. Luckily, it turned out there was a way to pretend there were six particles only: by “twisting” up the calculation, we found a toy model within the toy model: a six-particle version of the calculation. Much like the original was in a theory that doesn’t describe the real world, these six particles don’t describe six particles in that theory: they’re a kind of toy calculation within the toy model, doubly un-real.
With this nested toy model, I was confident we could do the calculation. I wasn’t confident I’d have time for it, though. This ended up making it perfect for a Master’s thesis, which is how Alex got into the game.
Alex worked his way through the calculation, programming and transforming, going from one type of mathematical function to another (at least once because I’d forgotten to tell him the right functions to use, oops!). There were more details and subtleties than expected, but in the end everything worked out.
Alex left the field (not, as far as I know, because of this). And for a while, because of that especially thorough scooping, I didn’t publish.
What changed my mind, in part, was seeing the field develop in the meantime. It turns out toy models, and even nested toy models, are quite useful. We still have a lot of uncertainty about what to do, how to use the new calculation methods and what they imply. And usually, the best way to get through that kind of uncertainty is with simple, well-behaved toy models.
So I thought, in the end, that this might be useful. Even if it’s a toy version of something that already exists, I expect it to be an educational toy, one we can learn a lot from. So I’ve put it out into the world, as part of this year’s cabinet of curiosities.
Elliptics has been growing in recent years, hurtling into prominence as a subfield of amplitudes (which is already a subfield of theoretical physics). This has led to growing lists of participants and a more and more packed schedule.
This year walked all of that back a bit. There were three talks a day: two one-hour talks by senior researchers and one half-hour talk by a junior researcher. The rest, as well as the whole last day, were geared to discussion. It’s an attempt to go back to the subfield’s roots. In the beginning, the Elliptics conferences drew together a small group to sort out a plan for the future, digging through the often-confusing mathematics to try to find a baseline for future progress. The field has advanced since then, but some of our questions are still almost as basic. What relations exist between different calculations? How much do we value fast numerics, versus analytical understanding? What methods do we want to preserve, and which aren’t serving us well? To answer these questions, it helps to get a few people together in one place, not to silently listen to lectures, but to question and discuss and hash things out. I may have heard a smaller range of topics at this year’s Elliptics, but due to the sheer depth we managed to probe on those fewer topics I feel like I’ve learned much more.
Since someone always asks, I should say that the talks were not recorded, but they are posting slides online, so if you’re interested in the topic you can look there. A few people discussed new developments, some just published and some yet to be published. I discussed the work I talked about last week, and got a lot of good feedback and ideas about how to move forward.