Tag Archives: supercomputers

Bottlenecks, Known and Unknown

Scientists want to know everything, and we’ve been trying to get there since the dawn of science. So why aren’t we there yet? Why are there things we still don’t know?

Sometimes, the reason is obvious: we can’t do the experiments yet. Victorian London had neither the technology nor the wealth to build a machine like Fermilab, so they couldn’t discover the top quark. Even if Newton had the idea for General Relativity, the telescopes of the era wouldn’t have let astronomers see its effect on the motion of Mercury. As we grow (in technology, in resources, in knowledge, in raw number of human beings), we can test more things and learn more about the world.

But I’m a theoretical physicist, not an experimental physicist. I still want to understand the world, but what I contribute aren’t new experiments, but new ideas and new calculations. This brings back the question in a new form: why are there calculations we haven’t done yet? Why are there ideas we haven’t had yet?

Sometimes, we can track the reason down to bottlenecks. A bottleneck is a step in a calculation that, for some reason, is harder than the rest. As you try to push a calculation to new heights, the bottleneck is the first thing that slows you down, like the way liquid bubbles through the neck of a literal bottle. If you can clear the bottleneck, you can speed up your calculation and accomplish more.

In the clearest cases, we can see how these bottlenecks could be solved with more technology. As computers get faster and more powerful, calculations become possible that weren’t possible before, in the same way new experiments become possible with new equipment. This is essentially what has happened recently with machine learning, where relatively old ideas are finally feasible to apply on a massive scale.

In physics, a subtlety is that we rarely have access to the most powerful computers available. Some types of physics are done on genuine supercomputers, but for more speculative or lower-priority research we have to use small computer clusters, or even our laptops. Something can be a bottleneck not because it can’t be done on any computer, but because it can’t be done on the computers we can afford.

Most of the time, bottlenecks aren’t quite so obvious. That’s because in theoretical physics, often, we don’t know what we want to calculate. If we want to know why something happens, and not merely that it happens, then we need a calculation that we can interpret, that “makes sense” and that thus, hopefully, we can generalize. We might have some ideas for how that calculation could work: some property a mathematical theory might have that we already know how to understand. Some of those ideas are easy to check, so we check, and make progress. Others are harder, and we have to decide: is the calculation worth it, if we don’t know if it will give us the explanation we need?

Those decisions provide new bottlenecks, often hidden ones. As we get better at calculation, the threshold for an “easy” check gets easier and easier to meet. We put aside fewer possibilities, so we notice more things, which inspire yet more ideas. We make more progress, not because the old calculations were impossible, but because they weren’t easy enough, and now they are. Progress fuels progress, a virtuous cycle that gets us closer and closer to understanding everything we want to understand (which is everything).

When Your Research Is a Cool Toy

Merry Newtonmas, everyone!

In the US, PhD students start without an advisor. As they finish their courses, different research groups make their pitch, trying to get them to join. Some promise interesting puzzles and engaging mysteries, others talk about the importance of their work, how it can help society or understand the universe.

Thinking back to my PhD, there is one pitch I remember to this day. The pitch was from the computational astrophysics group, and the message was a simple one: “we blow up stars”.

Obviously, these guys didn’t literally blow up stars: they simulated supernovas. They weren’t trying to make some weird metaphysical argument, they didn’t believe their simulation was somehow the real thing. The point they were making, instead, was emotional: blowing up stars feels cool.

Scientists can be motivated by curiosity, fame, or altruism, and these are familiar things. But an equally important motivation is a sense of play. If your job is to build tiny cars for rats, some of your motivation has to be the sheer joy of building tiny cars for rats. If you simulate supernovas, then part of your motivation can be the same as my nephew hurling stuffed animals down the stairs: that joyful moment when you yell “kaboom!”

Probably, your motivation shouldn’t just be to play with a cool toy. You need some of those “serious” scientific motivations as well. But for those of you blessed with a job where you get to say “kaboom”, you have that extra powerful reason to get up in the morning. And for those of you just starting a scientific career, may you have some cool toys under your Newtonmas tree!

This Week at Quanta Magazine

I’ve got an article in Quanta Magazine this week, about a program called FORM.

Quanta has come up a number of times on this blog, they’re a science news outlet set up by the Simons Foundation. Their goal is to enhance the public understanding of science and mathematics. They cover topics other outlets might find too challenging, and they cover the topics others cover with more depth. Most people I know who’ve worked with them have been impressed by their thoroughness: they take fact-checking to a level I haven’t seen with other science journalists. If you’re doing a certain kind of mathematical work, then you hope that Quanta decides to cover it.

A while back, as I was chatting with one of their journalists, I had a startling realization: if I want Quanta to cover something, I can send them a tip, and if they’re interested they’ll write about it. That realization resulted in the article I talked about here. Chatting with the journalist interviewing me for that article, though, I learned something if anything even more startling: if I want Quanta to cover something, and I want to write about it, I can pitch the article to Quanta, and if they’re interested they’ll pay me to write about it.

Around the same time, I happened to talk to a few people in my field, who had a problem they thought Quanta should cover. A software, called FORM, was used in all the most serious collider physics calculations. Despite that, the software wasn’t being supported: its future was unclear. You can read the article to learn more.

One thing I didn’t mention in that article: I hadn’t used FORM before I started writing it. I don’t do those “most serious collider physics calculations”, so I’d never bothered to learn FORM. I mostly use Mathematica, a common choice among physicists who want something easy to learn, even if it’s not the strongest option for many things.

(By the way, it was surprisingly hard to find quotes about FORM that didn’t compare it specifically to Mathematica. In the end I think I included one, but believe me, there could have been a lot more.)

Now, I wonder if I should have been using FORM all along. Many times I’ve pushed to the limits of what Mathematica could comfortable handle, the limits of what my computer’s memory could hold, equations long enough that just expanding them out took complicated work-arounds. If I had learned FORM, maybe I would have breezed through those calculations, and pushed even further.

I’d love it if this article gets FORM more attention, and more support. But also, I’d love it if it gives a window on the nuts and bolts of hard-core particle physics: the things people have to do to turn those T-shirt equations into predictions for actual colliders. It’s a world in between physics and computer science and mathematics, a big part of the infrastructure of how we know what we know that, precisely because it’s infrastructure, often ends up falling through the cracks.

Edit: For researchers interested in learning more about FORM, the workshop I mentioned at the end of the article is now online, with registrations open.

At Amplitudes 2022 in Prague

It’s that time of year again! I’m at the big yearly conference of my subfield, Amplitudes, this year in Prague.

The conference poster included a picture of Prague’s famous clock, which is admittedly cool. But I think this computer-generated anachronism from Matt Schwartz’s machine learning talk is much more fun.

Amplitudes has grown, and keeps growing. The last time we met in person, there were 175 of us. This year, many people are skipping: some avoiding travel due to COVID, others just exhausted from a summer filled with long-postponed conferences. Nonetheless, we have more people here than then: 222 registered participants!

The large number of people means a large number of talks. Almost all were quite short, 25+5 minutes. Some speakers took advantage of the short length to deliver very accessible talks. Others seemed to think of the time limit as an excuse to cut short the introduction and dive right into technical details. We had just a few 40+5 minute talks, each a review from an adjacent field.

It’s been fun seeing people in person again. I think half of my conversations started with “It’s been a long time!” It’s easy for motivation to wane when you don’t have regular contact with the wider field, getting enthusiastic about shared goals and brainstorming big questions.

I’ll probably give a longer retrospective later: the packed schedule means I don’t have much time to write! But I can say that I’ve largely enjoyed this, the organizers were organized and the presenters presented and things felt a bit more like they ought to in the world.

Einstein-Years

Scott Aaronson recently published an interesting exchange on his blog Shtetl Optimized, between him and cognitive psychologist Steven Pinker. The conversation was about AI: Aaronson is optimistic (though not insanely so) Pinker is pessimistic (again, not insanely though). While fun reading, the whole thing would normally be a bit too off-topic for this blog, except that Aaronson’s argument ended up invoking something I do know a bit about: how we make progress in theoretical physics.

Aaronson was trying to respond to an argument of Pinker’s, that super-intelligence is too vague and broad to be something we could expect an AI to have. Aaronson asks us to imagine an AI that is nothing more or less than a simulation of Einstein’s brain. Such a thing isn’t possible today, and might not even be efficient, but it has the advantage of being something concrete we can all imagine. Aarsonson then suggests imagining that AI sped up a thousandfold, so that in one year it covers a thousand years of Einstein’s thought. Such an AI couldn’t solve every problem, of course. But in theoretical physics, surely such an AI could be safely described as super-intelligent: an amazing power that would change the shape of physics as we know it.

I’m not as sure of this as Aaronson is. We don’t have a machine that generates a thousand Einstein-years to test, but we do have one piece of evidence: the 76 Einstein-years the man actually lived.

Einstein is rightly famous as a genius in theoretical physics. His annus mirabilis resulted in five papers that revolutionized the field, and the next decade saw his theory of general relativity transform our understanding of space and time. Later, he explored what general relativity was capable of and framed challenges that deepened our understanding of quantum mechanics.

After that, though…not so much. For Einstein-decades, he tried to work towards a new unified theory of physics, and as far as I’m aware made no useful progress at all. I’ve never seen someone cite work from that period of Einstein’s life.

Aarsonson mentions simulating Einstein “at his peak”, and it would be tempting to assume that the unified theory came “after his peak”, when age had weakened his mind. But while that kind of thing can sometimes be an issue for older scientists, I think it’s overstated. I don’t think careers peak early because of “youthful brains”, and with the exception of genuine dementia I don’t think older physicists are that much worse-off cognitively than younger ones. The reason so many prominent older physicists go down unproductive rabbit-holes isn’t because they’re old. It’s because genius isn’t universal.

Einstein made the progress he did because he was the right person to make that progress. He had the right background, the right temperament, and the right interests to take others’ mathematics and take them seriously as physics. As he aged, he built on what he found, and that background in turn enabled him to do more great things. But eventually, the path he walked down simply wasn’t useful anymore. His story ended, driven to a theory that simply wasn’t going to work, because given his experience up to that point that was the work that interested him most.

I think genius in physics is in general like that. It can feel very broad because a good genius picks up new tricks along the way, and grows their capabilities. But throughout, you can see the links: the tools mastered at one age that turn out to be just right for a new pattern. For the greatest geniuses in my field, you can see the “signatures” in their work, hints at why they were just the right genius for one problem or another. Give one a thousand years, and I suspect the well would eventually run dry: the state of knowledge would no longer be suitable for even their breadth.

…of course, none of that really matters for Aaronson’s point.

A century of Einstein-years wouldn’t have found the Standard Model or String Theory, but a century of physicist-years absolutely did. If instead of a simulation of Einstein, your AI was a simulation of a population of scientists, generating new geniuses as the years go by, then the argument works again. Sure, such an AI would be much more expensive, much more difficult to build, but the first one might have been as well. The point of the argument is simply to show such a thing is possible.

The core of Aaronson’s point rests on two key traits of technology. Technology is replicable: once we know how to build something, we can build more of it. Technology is scalable: if we know how to build something, we can try to build a bigger one with more resources. Evolution can tap into both of these, but not reliably: just because it’s possible to build a mind a thousand times better at some task doesn’t mean it will.

That is why the possibility of AI leads to the possibility of super-intelligence. If we can make a computer that can do something, we can make it do that something faster. That something doesn’t have to be “general”, you can have programs that excel at one task or another. For each such task, with more resources you can scale things up: so anything a machine can do now, a later machine can probably do better. Your starting-point doesn’t necessarily even have to be efficient, or a good algorithm: bad algorithms will take longer to scale, but could eventually get there too.

The only question at that point is “how fast?” I don’t have the impression that’s settled. The achievements that got Pinker and Aarsonson talking, GPT-3 and DALL-E and so forth, impressed people by their speed, by how soon they got to capabilities we didn’t expect them to have. That doesn’t mean that something we might really call super-intelligence is close: that has to do with the details, with what your target is and how fast you can actually scale. And it certainly doesn’t mean that another approach might not be faster! (As a total outsider, I can’t help but wonder if current ML is in some sense trying to fit a cubic with straight lines.)

It does mean, though, that super-intelligence isn’t inconceivable, or incoherent. It’s just the recognition that technology is a master of brute force, and brute force eventually triumphs. If you want to think about what happens in that “eventually”, that’s a very important thing to keep in mind.

At New Ideas in Cosmology

The Niels Bohr Institute is hosting a conference this week on New Ideas in Cosmology. I’m no cosmologist, but it’s a pretty cool field, so as a local I’ve been sitting in on some of the talks. So far they’ve had a selection of really interesting speakers with quite a variety of interests, including a talk by Roger Penrose with his trademark hand-stippled drawings.

Including this old classic

One thing that has impressed me has been the “interdisciplinary” feel of the conference. By all rights this should be one “discipline”, cosmology. But in practice, each speaker came at the subject from a different direction. They all had a shared core of knowledge, common models of the universe they all compare to. But the knowledge they brought to the subject varied: some had deep knowledge of the mathematics of gravity, others worked with string theory, or particle physics, or numerical simulations. Each talk, aware of the varied audience, was a bit “colloquium-style“, introducing a framework before diving in to the latest research. Each speaker knew enough to talk to the others, but not so much that they couldn’t learn from them. It’s been unexpectedly refreshing, a real interdisciplinary conference done right.

At Mikefest

I’m at a conference this week of a very particular type: a birthday conference. When folks in my field turn 60, their students and friends organize a special conference for them, celebrating their research legacy. With COVID restrictions just loosening, my advisor Michael Douglas is getting a last-minute conference. And as one of the last couple students he graduated at Stony Brook, I naturally showed up.

The conference, Mikefest, is at the Institut des Hautes Études Scientifiques, just outside of Paris. Mike was a big supporter of the IHES, putting in a lot of fundraising work for them. Another big supporter, James Simons, was Mike’s employer for a little while after his time at Stony Brook. The conference center we’re meeting in is named for him.

You might have to zoom in to see that, though.

I wasn’t involved in organizing the conference, so it was interesting seeing differences between this and other birthday conferences. Other conferences focus on the birthday prof’s “family tree”: their advisor, their students, and some of their postdocs. We’ve had several talks from Mike’s postdocs, and one from his advisor, but only one from a student. Including him and me, three of Mike’s students are here: another two have had their work mentioned but aren’t speaking or attending.

Most of the speakers have collaborated with Mike, but only for a few papers each. All of them emphasized a broader debt though, for discussions and inspiration outside of direct collaboration. The message, again and again, is that Mike’s work has been broad enough to touch a wide range of people. He’s worked on branes and the landscape of different string theory universes, pure mathematics and computation, neuroscience and recently even machine learning. The talks generally begin with a few anecdotes about Mike, before pivoting into research talks on the speakers’ recent work. The recent-ness of the work is perhaps another difference from some birthday conferences: as one speaker said, this wasn’t just a celebration of Mike’s past, but a “welcome back” after his return from the finance world.

One thing I don’t know is how much this conference might have been limited by coming together on short notice. For other birthday conferences impacted by COVID (and I’m thinking of one in particular), it might be nice to have enough time to have most of the birthday prof’s friends and “academic family” there in person. As-is, though, Mike seems to be having fun regardless.

Happy Birthday Mike!

Congratulations to Syukuro Manabe, Klaus Hasselmann, and Giorgio Parisi!

The 2021 Nobel Prize in Physics was announced this week, awarded to Syukuro Manabe and Klaus Hasselmann for climate modeling and Giorgio Parisi for understanding a variety of complex physical systems.

Before this year’s prize was announced, I remember a few “water cooler chats” about who might win. No guess came close, though. The Nobel committee seems to have settled in to a strategy of prizes on a loosely linked “basket” of topics, with half the prize going to a prominent theorist and the other half going to two experimental, observational, or (in this case) computational physicists. It’s still unclear why they’re doing this, but regardless it makes it hard to predict what they’ll do next!

When I read the announcement, my first reaction was, “surely it’s not that Parisi?” Giorgio Parisi is known in my field for the Altarelli-Parisi equations (more properly known as the DGLAP equations, the longer acronym because, as is often the case in physics, the Soviets got there first). These equations are in some sense why the scattering amplitudes I study are ever useful at all. I calculate collisions of individual fundamental particles, like quarks and gluons, but a real particle collider like the LHC collides protons. Protons are messy, interacting combinations of quarks and gluons. When they collide you need not merely the equations describing colliding quarks and gluons, but those that describe their messy dynamics inside the proton, and in particular how those dynamics look different for experiments with different energies. The equation that describes that is the DGLAP equation.

As it turns out, Parisi is known for a lot more than the DGLAP equation. He is best known for his work on “spin glasses”, models of materials where quantum spins try to line up with each other, never quite settling down. He also worked on a variety of other complex systems, including flocks of birds!

I don’t know as much about Manabe and Hasselmann’s work. I’ve only seen a few talks on the details of climate modeling. I’ve seen plenty of talks on other types of computer modeling, though, from people who model stars, galaxies, or black holes. And from those, I can appreciate what Manabe and Hasselmann did. Based on those talks, I recognize the importance of those first one-dimensional models, a single column of air, especially back in the 60’s when computer power was limited. Even more, I recognize how impressive it is for someone to stay on the forefront of that kind of field, upgrading models for forty years to stay relevant into the 2000’s, as Manabe did. Those talks also taught me about the challenge of coupling different scales: how small effects in churning fluids can add up and affect the simulation, and how hard it is to model different scales at once. To use these effects to discover which models are reliable, as Hasselmann did, is a major accomplishment.

In Defense of Shitty Code

Scientific programming was in the news lately, when doubts were raised about a coronavirus simulation by researchers at Imperial College London. While the doubts appear to have been put to rest, doing so involved digging through some seriously messy code. The whole situation seems to have gotten a lot of people worried. If these people are that bad at coding, why should we trust their science?

I don’t know much about coronavirus simulations, my knowledge there begins and ends with a talk I saw last month. But I know a thing or two about bad scientific code, because I write it. My code is atrocious. And I’ve seen published code that’s worse.

Why do scientists write bad code?

In part, it’s a matter of training. Some scientists have formal coding training, but most don’t. I took two CS courses in college and that was it. Despite that lack of training, we’re expected and encouraged to code. Before I took those courses, I spent a summer working in a particle physics lab, where I was expected to pick up the C++-based interface pretty much on the fly. I don’t think there’s another community out there that has as much reason to code as scientists do, and as little training for it.

Would it be useful for scientists to have more of the tools of a trained coder? Sometimes, yeah. Version control is a big one, I’ve collaborated on papers that used Git and papers that didn’t, and there’s a big difference. There are coding habits that would speed up our work and lead to fewer dead ends, and they’re worth picking up when we have the time.

But there’s a reason we don’t prioritize “proper coding”. It’s because the things we’re trying to do, from a coding perspective, are really easy.

What, code-wise, is a coronavirus simulation? A vector of “people”, really just simple labels, all randomly infecting each other and recovering, with a few parameters describing how likely they are to do so and how long it takes. What do I do, code-wise? Mostly, giant piles of linear algebra.

These are not some sort of cutting-edge programming tasks. These are things people have been able to do since the dawn of computers. These are things that, when you screw them up, become quite obvious quite quickly.

Compared to that, the everyday tasks of software developers, like making a reliable interface for users, or efficient graphics, are much more difficult. They’re tasks that really require good coding practices, that just can’t function without them.

For us, the important part is not the coding itself, but what we’re doing with it. Whatever bugs are in a coronavirus simulation, they will have much less impact than, for example, the way in which the simulation includes superspreaders. Bugs in my code give me obviously wrong answers, bad scientific assumptions are much harder for me to root out.

There’s an exception that proves the rule here, and it’s that, when the coding task is actually difficult, scientists step up and write better code. Scientists who want to run efficiently on supercomputers, who are afraid of numerical error or need to simulate on many scales at once, these people learn how to code properly. The code behind the LHC still might be jury-rigged by industry standards, but it’s light-years better than typical scientific code.

I get the furor around the Imperial group’s code. I get that, when a government makes a critical decision, you hope that their every input is as professional as possible. But without getting too political for this blog, let me just say that whatever your politics are, if any of it is based on science, it comes from code like this. Psychology studies, economic modeling, polling…they’re using code, and it’s jury-rigged to hell. Scientists just have more important things to worry about.

Facts About Our Capabilities Are Facts About the World

A paper leaked from Google last week claimed that their researchers had achieved “quantum supremacy”, the milestone at which a quantum computer performs a calculation faster than any existing classical computer. Scott Aaronson has a great explainer about this. The upshot is that Google’s computer is much too small to crack all our encryptions (only 53 qubits, the equivalent of bits for quantum computers), but it still appears to be a genuine quantum computer doing a genuine quantum computation that is genuinely not feasible otherwise.

How impressed should we be about this?

On one hand, the practical benefits of a 53-qubit computer are pretty minimal. Scott discusses some applications: you can generate random numbers, distributed in a way that will let others verify that they are truly random, the kind of thing it’s occasionally handy to do in cryptography. Still, by itself this won’t change the world, and compared to the quantum computing hype I can understand if people find this underwhelming.

On the other hand, as Scott says, this falsifies the Extended Church-Turing Thesis! And that sounds pretty impressive, right?

Ok, I’m actually just re-phrasing what I said before. The Extended Church-Turing Thesis proposes that a classical computer (more specifically, a probabilistic Turing machine) can efficiently simulate any reasonable computation. Falsifying it means finding something that a classical computer cannot compute efficiently but another sort of computer (say, a quantum computer) can. If the calculation Google did truly can’t be done efficiently on a classical computer (this is not proven, though experts seem to expect it to be true) then yes, that’s what Google claims to have done.

So we get back to the real question: should we be impressed by quantum supremacy?

Well, should we have been impressed by the Higgs?

The detection of the Higgs boson in 2012 hasn’t led to any new Higgs-based technology. No-one expected it to. It did teach us something about the world: that the Higgs boson exists, and that it has a particular mass. I think most people accept that that’s important: that it’s worth knowing how the world works on a fundamental level.

Google may have detected the first-known violation of the Extended Church-Turing Thesis. This could eventually lead to some revolutionary technology. For now, though, it hasn’t. Instead, it teaches us something about the world.

It may not seem like it, at first. Unlike the Higgs boson, “Extended Church-Turing is false” isn’t a law of physics. Instead, it’s a fact about our capabilities. It’s a statement about the kinds of computers we can and cannot build, about the kinds of algorithms we can and cannot implement, the calculations we can and cannot do.

Facts about our capabilities are still facts about the world. They’re still worth knowing, for the same reasons that facts about the world are still worth knowing. They still give us a clearer picture of how the world works, which tells us in turn what we can and cannot do. According to the leaked paper, Google has taught us a new fact about the world, a deep fact about our capabilities. If that’s true we should be impressed, even without new technology.