The Niels Bohr Institute is hosting a conference this week on New Ideas in Cosmology. I’m no cosmologist, but it’s a pretty cool field, so as a local I’ve been sitting in on some of the talks. So far they’ve had a selection of really interesting speakers with quite a variety of interests, including a talk by Roger Penrose with his trademark hand-stippled drawings.
One thing that has impressed me has been the “interdisciplinary” feel of the conference. By all rights this should be one “discipline”, cosmology. But in practice, each speaker came at the subject from a different direction. They all had a shared core of knowledge, common models of the universe they all compare to. But the knowledge they brought to the subject varied: some had deep knowledge of the mathematics of gravity, others worked with string theory, or particle physics, or numerical simulations. Each talk, aware of the varied audience, was a bit “colloquium-style“, introducing a framework before diving in to the latest research. Each speaker knew enough to talk to the others, but not so much that they couldn’t learn from them. It’s been unexpectedly refreshing, a real interdisciplinary conference done right.
I’m at a conference this week of a very particular type: a birthday conference. When folks in my field turn 60, their students and friends organize a special conference for them, celebrating their research legacy. With COVID restrictions just loosening, my advisor Michael Douglas is getting a last-minute conference. And as one of the last couple students he graduated at Stony Brook, I naturally showed up.
The conference, Mikefest, is at the Institut des Hautes Études Scientifiques, just outside of Paris. Mike was a big supporter of the IHES, putting in a lot of fundraising work for them. Another big supporter, James Simons, was Mike’s employer for a little while after his time at Stony Brook. The conference center we’re meeting in is named for him.
I wasn’t involved in organizing the conference, so it was interesting seeing differences between this and other birthday conferences. Other conferences focus on the birthday prof’s “family tree”: their advisor, their students, and some of their postdocs. We’ve had several talks from Mike’s postdocs, and one from his advisor, but only one from a student. Including him and me, three of Mike’s students are here: another two have had their work mentioned but aren’t speaking or attending.
Most of the speakers have collaborated with Mike, but only for a few papers each. All of them emphasized a broader debt though, for discussions and inspiration outside of direct collaboration. The message, again and again, is that Mike’s work has been broad enough to touch a wide range of people. He’s worked on branes and the landscape of different string theory universes, pure mathematics and computation, neuroscience and recently even machine learning. The talks generally begin with a few anecdotes about Mike, before pivoting into research talks on the speakers’ recent work. The recent-ness of the work is perhaps another difference from some birthday conferences: as one speaker said, this wasn’t just a celebration of Mike’s past, but a “welcome back” after his return from the finance world.
One thing I don’t know is how much this conference might have been limited by coming together on short notice. For other birthday conferences impacted by COVID (and I’m thinking of one in particular), it might be nice to have enough time to have most of the birthday prof’s friends and “academic family” there in person. As-is, though, Mike seems to be having fun regardless.
Before this year’s prize was announced, I remember a few “water cooler chats” about who might win. No guess came close, though. The Nobel committee seems to have settled into a strategy of prizes on a loosely linked “basket” of topics, with half the prize going to a prominent theorist and the other half going to two experimental, observational, or (in this case) computational physicists. It’s still unclear why they’re doing this, but regardless it makes it hard to predict what they’ll do next!
When I read the announcement, my first reaction was, “surely it’s not that Parisi?” Giorgio Parisi is known in my field for the Altarelli-Parisi equations (more properly known as the DGLAP equations, the longer acronym because, as is often the case in physics, the Soviets got there first). These equations are in some sense why the scattering amplitudes I study are ever useful at all. I calculate collisions of individual fundamental particles, like quarks and gluons, but a real particle collider like the LHC collides protons. Protons are messy, interacting combinations of quarks and gluons. When they collide you need not merely the equations describing colliding quarks and gluons, but those that describe their messy dynamics inside the proton, and in particular how those dynamics look different for experiments with different energies. The equation that describes that is the DGLAP equation.
As it turns out, Parisi is known for a lot more than the DGLAP equation. He is best known for his work on “spin glasses”, models of materials where quantum spins try to line up with each other, never quite settling down. He also worked on a variety of other complex systems, including flocks of birds!
I don’t know as much about Manabe and Hasselmann’s work. I’ve only seen a few talks on the details of climate modeling. I’ve seen plenty of talks on other types of computer modeling, though, from people who model stars, galaxies, or black holes. And from those, I can appreciate what Manabe and Hasselmann did. Based on those talks, I recognize the importance of those first one-dimensional models, a single column of air, especially back in the 60’s when computer power was limited. Even more, I recognize how impressive it is for someone to stay on the forefront of that kind of field, upgrading models for forty years to stay relevant into the 2000’s, as Manabe did. Those talks also taught me about the challenge of coupling different scales: how small effects in churning fluids can add up and affect the simulation, and how hard it is to model different scales at once. To use these effects to discover which models are reliable, as Hasselmann did, is a major accomplishment.
Scientific programming was in the news lately, when doubts were raised about a coronavirus simulation by researchers at Imperial College London. While the doubts appear to have been put to rest, doing so involved digging through some seriously messy code. The whole situation seems to have gotten a lot of people worried. If these people are that bad at coding, why should we trust their science?
I don’t know much about coronavirus simulations, my knowledge there begins and ends with a talk I saw last month. But I know a thing or two about bad scientific code, because I write it. My code is atrocious. And I’ve seen published code that’s worse.
Why do scientists write bad code?
In part, it’s a matter of training. Some scientists have formal coding training, but most don’t. I took two CS courses in college and that was it. Despite that lack of training, we’re expected and encouraged to code. Before I took those courses, I spent a summer working in a particle physics lab, where I was expected to pick up the C++-based interface pretty much on the fly. I don’t think there’s another community out there that has as much reason to code as scientists do, and as little training for it.
Would it be useful for scientists to have more of the tools of a trained coder? Sometimes, yeah. Version control is a big one, I’ve collaborated on papers that used Git and papers that didn’t, and there’s a big difference. There are coding habits that would speed up our work and lead to fewer dead ends, and they’re worth picking up when we have the time.
But there’s a reason we don’t prioritize “proper coding”. It’s because the things we’re trying to do, from a coding perspective, are really easy.
What, code-wise, is a coronavirus simulation? A vector of “people”, really just simple labels, all randomly infecting each other and recovering, with a few parameters describing how likely they are to do so and how long it takes. What do I do, code-wise? Mostly, giant piles of linear algebra.
These are not some sort of cutting-edge programming tasks. These are things people have been able to do since the dawn of computers. These are things that, when you screw them up, become quite obvious quite quickly.
Compared to that, the everyday tasks of software developers, like making a reliable interface for users, or efficient graphics, are much more difficult. They’re tasks that really require good coding practices, that just can’t function without them.
For us, the important part is not the coding itself, but what we’re doing with it. Whatever bugs are in a coronavirus simulation, they will have much less impact than, for example, the way in which the simulation includes superspreaders. Bugs in my code give me obviously wrong answers, bad scientific assumptions are much harder for me to root out.
There’s an exception that proves the rule here, and it’s that, when the coding task is actually difficult, scientists step up and write better code. Scientists who want to run efficiently on supercomputers, who are afraid of numerical error or need to simulate on many scales at once, these people learn how to code properly. The code behind the LHC still might be jury-rigged by industry standards, but it’s light-years better than typical scientific code.
I get the furor around the Imperial group’s code. I get that, when a government makes a critical decision, you hope that their every input is as professional as possible. But without getting too political for this blog, let me just say that whatever your politics are, if any of it is based on science, it comes from code like this. Psychology studies, economic modeling, polling…they’re using code, and it’s jury-rigged to hell. Scientists just have more important things to worry about.
On one hand, the practical benefits of a 53-qubit computer are pretty minimal. Scott discusses some applications: you can generate random numbers, distributed in a way that will let others verify that they are truly random, the kind of thing it’s occasionally handy to do in cryptography. Still, by itself this won’t change the world, and compared to the quantum computing hype I can understand if people find this underwhelming.
Ok, I’m actually just re-phrasing what I said before. The Extended Church-Turing Thesis proposes that a classical computer (more specifically, a probabilistic Turing machine) can efficiently simulate any reasonable computation. Falsifying it means finding something that a classical computer cannot compute efficiently but another sort of computer (say, a quantum computer) can. If the calculation Google did truly can’t be done efficiently on a classical computer (this is not proven, though experts seem to expect it to be true) then yes, that’s what Google claims to have done.
So we get back to the real question: should we be impressed by quantum supremacy?
Well, should we have been impressed by the Higgs?
The detection of the Higgs boson in 2012 hasn’t led to any new Higgs-based technology. No-one expected it to. It did teach us something about the world: that the Higgs boson exists, and that it has a particular mass. I think most people accept that that’s important: that it’s worth knowing how the world works on a fundamental level.
Google may have detected the first-known violation of the Extended Church-Turing Thesis. This could eventually lead to some revolutionary technology. For now, though, it hasn’t. Instead, it teaches us something about the world.
It may not seem like it, at first. Unlike the Higgs boson, “Extended Church-Turing is false” isn’t a law of physics. Instead, it’s a fact about our capabilities. It’s a statement about the kinds of computers we can and cannot build, about the kinds of algorithms we can and cannot implement, the calculations we can and cannot do.
Facts about our capabilities are still facts about the world. They’re still worth knowing, for the same reasons that facts about the world are still worth knowing. They still give us a clearer picture of how the world works, which tells us in turn what we can and cannot do. According to the leaked paper, Google has taught us a new fact about the world, a deep fact about our capabilities. If that’s true we should be impressed, even without new technology.
There’s a picture we learn in high school. It’s not the whole story, certainly: philosophers of science have much more sophisticated notions. But for practicing scientists, it’s a picture that often sits in the back of our minds, informing what we do. Because of that, it’s worth examining in detail.
In the high school picture, scientific theories make predictions. Importantly, postdictions don’t count: if you “predict” something that already happened, it’s too easy to cheat and adjust your prediction. Also, your predictions must be different from those of other theories. If all you can do is explain the same results with different words you aren’t doing science, you’re doing “something else” (“metaphysics”, “religion”, “mathematics”…whatever the person you’re talking to wants to make fun of, but definitely not science).
Seems reasonable, right? Let’s try a thought experiment.
In the late 1950’s, the physics of protons and neutrons was still quite mysterious. They seemed to be part of a bewildering zoo of particles that no-one could properly explain. In the 60’s and 70’s the field started converging on the right explanation, from Gell-Mann’s eightfold way to the parton model to the full theory of quantum chromodynamics (QCD for short). Today we understand the theory well enough to package things into computer code: amplitudes programs like BlackHat for collisions of individual quarks, jet algorithms that describe how those quarks become signals in colliders, lattice QCD implemented on supercomputers for pretty much everything else.
Now imagine that you had a time machine, prodigious programming skills, and a grudge against 60’s era-physicists.
Suppose you wrote a computer program that combined the best of QCD in the modern world. BlackHat and more from the amplitudes side, the best jet algorithms and lattice QCD code, and more: a program that could reproduce any calculation in QCD that anyone can do today. Further, suppose you don’t care about silly things like making your code readable. Since I began the list above with BlackHat, we’ll call the combined box of different codes BlackBox.
Now suppose you went back in time, and told the bewildered scientists of the 50’s that nuclear physics was governed by a very complicated set of laws: the ones implemented in BlackBox.
Your “BlackBox theory” passes the high school test. Not only would it match all previous observations, it could make predictions for any experiment the scientists of the 50’s could devise. Up until the present day, your theory would match observations as well as…well as well as QCD does today.
(Let’s ignore for the moment that they didn’t have computers that could run this code in the 50’s. This is a thought experiment, we can fudge things a bit.)
Now suppose that one of those enterprising 60’s scientists, Gell-Mann or Feynman or the like, noticed a pattern. Maybe they got it from an experiment scattering electrons off of protons, maybe they saw it in BlackBox’s code. They notice that different parts of “BlackBox theory” run on related rules. Based on those rules, they suggest a deeper reality: protons are made of quarks!
But is this “quark theory” scientific?
“Quark theory” doesn’t make any new predictions. Anything you could predict with quarks, you could predict with BlackBox. According to the high school picture of science, for these 60’s scientists quarks wouldn’t be scientific: they would be “something else”, metaphysics or religion or mathematics.
And in practice? I doubt that many scientists would care.
“Quark theory” makes the same predictions as BlackBox theory, but I think most of us understand that it’s a better theory. It actually explains what’s going on. It takes different parts of BlackBox and unifies them into a simpler whole. And even without new predictions, that would be enough for the scientists in our thought experiment to accept it as science.
Why am I thinking about this? For two reasons:
First, I want to think about what happens when we get to a final theory, a “Theory of Everything”. It’s probably ridiculously arrogant to think we’re anywhere close to that yet, but nonetheless the question is on physicists’ minds more than it has been for most of history.
Right now, the Standard Model has many free parameters, numbers we can’t predict and must fix based on experiments. Suppose there are two options for a final theory: one that has a free parameter, and one that doesn’t. Once that one free parameter is fixed, both theories will match every test you could ever devise (they’re theories of everything, after all).
If we come up with both theories before testing that final parameter, then all is well. The theory with no free parameters will predict the result of that final experiment, the other theory won’t, so the theory without the extra parameter wins the high school test.
What if we do the experiment first, though?
If we do, then we’re in a strange situation. Our “prediction” of the one free parameter is now a “postdiction”. We’ve matched numbers, sure, but by the high school picture we aren’t doing science. Our theory, the same theory that was scientific if history went the other way, is now relegated to metaphysics/religion/mathematics.
I don’t know about you, but I’m uncomfortable with the idea that what is or is not science depends on historical chance. I don’t like the idea that we could be stuck with a theory that doesn’t explain everything, simply because our experimentalists were able to work a bit faster.
My second reason focuses on the here and now. You might think we have nothing like BlackBox on offer, no time travelers taunting us with poorly commented code. But we’ve always had the option of our own Black Box theory: experiment itself.
The Standard Model fixes some of its parameters from experimental results. You do a few experiments, and you can predict the results of all the others. But why stop there? Why not fix all of our parameters with experiments? Why not fix everything with experiments?
That’s the Black Box Theory of Everything. Each individual experiment you could possibly do gets its own parameter, describing the result of that experiment. You do the experiment, fix that parameter, then move on to the next experiment. Your theory will never be falsified, you will never be proven wrong. Sure, you never predict anything either, but that’s just an extreme case of what we have now, where the Standard Model can’t predict the mass of the Higgs.
What’s wrong with the Black Box Theory? (I trust we can all agree that it’s wrong.)
It’s not just that it can’t make predictions. You could make it a Black Box All But One Theory instead, that predicts one experiment and takes every other experiment as input. You could even make a Black Box Except the Standard Model Theory, that predicts everything we can predict now and just leaves out everything we’re still confused by.
The Black Box Theory is wrong because the high school picture of what counts as science is wrong. The high school picture is a useful guide, it’s a good rule of thumb, but it’s not the ultimate definition of science. And especially now, when we’re starting to ask questions about final theories and ultimate parameters, we can’t cling to the high school picture. We have to be willing to actually think, to listen to the philosophers and consider our own motivations, to figure out what, in the end, we actually mean by science.
When I join a new department or institute, the first thing I ask is “do we have a cluster?”
Most of what I do, I do on a computer. Gone are the days when theorists would always do all their work on notepads and chalkboards (though many still do!). Instead, we use specialized computer programs like Mathematica and Maple. Using a program helps keep us from forgetting pesky minus signs, and it allows working with equations far too long to fit on a sheet of paper.
Supercomputers are great, but they’re also expensive. The people who use supercomputers are the ones who model large, complicated systems, like the weather, or supernovae. For most theorists, you still want power, but you don’t need quite that much. That’s where computer clusters come in.
A computer cluster is pretty much what it sounds like: several computers wired together. Different clusters contain different numbers of computers. For example, my department has a ten-node cluster. Sure, that doesn’t stack up to a supercomputer, but it’s still ten times as fast as an ordinary computer, right?
The power of ten computers!
Well, not exactly. As several of my friends have been surprised to learn, the computers on our cluster are actually slower than most of our laptops.
The power of ten old computers!
Still, ten older computers is still faster than one new one, yes?
Even then, it depends how you use it.
Run a normal task on a cluster, and it’s just going to run on one of the computers, which, as I’ve said, are slower than a modern laptop. You need to get smarter.
There are two big advantages of clusters: time, and parallelization.
Sometimes, you want to do a calculation that will take a long time. Your computer is going to be busy for a day or two, and that’s inconvenient when you want to do…well, pretty much anything else. A cluster is a space to run those long calculations. You put the calculation on one of the nodes, you go back to doing your work, and you check back in a day or two to see if it’s finished.
Clusters are at their most powerful when you can parallelize. If you need to do ten versions of the same calculation, each slightly different, then rather than doing them one at a time a cluster lets you do them all at once. At that point, it really is making you ten times faster.
If you ever program, I’d encourage you to look into the resources you have available. A cluster is a very handy thing to have access to, no matter what you’re doing!