Author Archives: 4gravitons

Bonus Info on the LHC and Beyond

Three of my science journalism pieces went up last week!

(This is a total coincidence. One piece was a general explainer “held in reserve” for a nice slot in the schedule, one was a piece I drafted in February, while the third I worked on in May. In journalism, things take as long as they take.)

The shortest piece, at Quanta Magazine, was an explainer about the two types of particles in physics: bosons, and fermions.

I don’t have a ton of bonus info here, because of how tidy the topic is, so just two quick observations.

First, I have the vague impression that Bose, bosons’ namesake, is “claimed” by both modern-day Bangladesh and India. I had friends in grad school who were proud of their fellow physicist from Bangladesh, but while he did his most famous work in Dhaka, he was born and died in Calcutta. Since both were under British India for most of his life, these things likely get complicated.

Second, at the end of the piece I mention a “world on a wire” where fermions and bosons are the same. One example of such a “wire” is a string, like in string theory. One thing all young string theorists learn is “bosonization”: the idea that, in a 1+1-dimensional world like a string, you can re-write any theory with fermions as a theory with bosons, as well as vice versa. This has important implications for how string theory is set up.

Next, in Ars Technica, I had a piece about how LHC physicists are using machine learning to untangle the implications of quantum interference.

As a journalist, it’s really easy to fall into a trap where you give the main person you interview too much credit: after all, you’re approaching the story from their perspective. I tried to be cautious about this, only to be stymied when literally everyone else I interviewed praised Aishik Ghosh to the skies and credited him with being the core motivating force behind the project. So I shrugged my shoulders and followed suit. My understanding is that he has been appropriately rewarded and will soon be a professor at Georgia Tech.

I didn’t list the inventors of the NSBI method that Ghosh and co. used, but names like Kyle Cranmer and Johann Brehmer tend to get bandied about. It’s a method that was originally explored for a more general goal, trying to characterize what the Standard Model might be missing, while the work I talk about in the piece takes it in a new direction, closer to the typical things the ATLAS collaboration looks for.

I also did not say nearly as much as I was tempted to about how the ATLAS collaboration publishes papers, which was honestly one of the most intriguing parts of the story for me. There is a huge amount of review that goes on inside ATLAS before one of their papers reaches the outside world, way more than there ever is in a journal’s peer review process. This is especially true for “physics papers”, where ATLAS is announcing a new conclusion about the physical world, as ATLAS’s reputation stands on those conclusions being reliable. That means starting with an “internal note” that’s hundreds of pages long (and sometimes over a thousand), an editorial board that manages the editing process, disseminating the paper to the entire collaboration for comment, and getting specific experts and institute groups within the collaboration to read through the paper in detail. The process is a bit less onerous for “technical papers”, which describe a new method, not a new conclusion about the world. Still, it’s cumbersome enough that for those papers, often scientists don’t publish them “within ATLAS” at all, instead releasing them independently. The results I reported on are special because they involved a physics paper and a technical paper, both within the ATLAS collaboration process. Instead of just working with partial or simplified data, they wanted to demonstrate the method on a “full analysis”, with all the computation and human coordination that requires. Normally, ATLAS wouldn’t go through the whole process of publishing a physics paper without basing it on new data, but this was different: the method had the potential to be so powerful that the more precise results would be worth stating as physics results alone.

(Also, for the people in the comments worried about training a model on old data: that’s not what they did. In physics, they don’t try to train a neural network model to predict the results of colliders, such a model wouldn’t tell us anything useful. They run colliders to tell us whether what they see matches the analytic, Standard, model. The neural network is trained to predict not what the experiment will say, but what the Standard Model will say, as we can usually only figure that out through time-consuming simulations. So it’s trained on (new) simulations, not on experimental data.)

Finally, on Friday I had a piece in Physics Today about the European Strategy for Particle Physics (or ESPP), and in particular, plans for the next big collider.

Before I even started working on this piece, I saw a thread by Patrick Koppenburg on some of the 263 documents submitted for the ESPP update. While my piece ended up mostly focused on the big circular collider plan that most of the field is converging on (the future circular collider, or FCC), Koppenburg’s thread was more wide-ranging, meant to illustrate the breadth of ideas under discussion. Some of that discussion is about the LHC’s current plans, like its “high-luminosity” upgrade that will see it gather data at much higher rates up until 2040. Some of it is assessing broader concerns, which it may surprise some of you to learn includes sustainability: yes, there are more or less sustainable ways to build giant colliders.

The most fun part of the discussion, though, concerns all of the other collider proposals.

Some report progress on new technologies. Muon colliders are the most famous of these, but there are other proposals that would specifically help with a linear collider. I never did end up understanding what Cooled Copper Colliders are all about, beyond that they let you get more energy in a smaller machine without super-cooling. If you know about them, chime in in the comments! Meanwhile, plasma wakefield acceleration could accelerate electrons on a wave of plasma. This has the disadvantage that you want to collide electrons and positrons, and if you try to stick a positron in plasma it will happily annihilate with the first electron it meets. So what do you do? You go half-and-half, with the HALHF project: speed up the electron with a plasma wakefield, accelerate the positron normally, and have them meet in the middle.

Others are backup plans, or “budget options”, where CERN could get a bit better measurements on some parameters if they can’t stir up the funding to measure the things they really want. They could put electrons and positrons into the LHC tunnel instead of building a new one, for a weaker machine that could still study the Higgs boson to some extent. They could use a similar experiment to produce Z bosons instead, which could serve as a bridge to a different collider project. Or, they could collider the LHC’s proton beam with an electron beam, for an experiment that mixes advantages and disadvantages of some of the other approaches.

While working on the piece, one resource I found invaluable was this colloquium talk by Tristan du Pree, where he goes through the 263 submissions and digs up a lot of interesting numbers and commentary. Read the slides for quotes from the different national inputs and “solo inputs” with comments from particular senior scientists. I used that talk to get a broad impression of what the community was feeling, and it was interesting how well it was reflected in the people I interviewed. The physicist based in Switzerland felt the most urgency for the FCC plan, while the Dutch sources were more cautious, with other Europeans firmly in the middle.

Going over the FCC report itself, one thing I decided to leave out of the discussion was the cost-benefit analysis. There’s the potential for a cute sound-bite there, “see, the collider is net positive!”, but I’m pretty skeptical of the kind of analysis they’re doing there, even if it is standard practice for government projects. Between the biggest benefits listed being industrial benefits to suppliers and early-career researcher training (is a collider unusually good for either of those things, compared to other ways we spend money?) and the fact that about 10% of the benefit is the science itself (where could one possibly get a number like that?), it feels like whatever reasoning is behind this is probably the kind of thing that makes rigor-minded economists wince. I wasn’t able to track down the full calculation though, so I really don’t know, maybe this makes more sense than it looks.

I think a stronger argument than anything along those lines is a much more basic point, about expertise. Right now, we have a community of people trying to do something that is not merely difficult, but fundamental. This isn’t like sending people to space, where many of the engineering concerns will go away when we can send robots instead. This is fundamental engineering progress in how to manipulate the forces of nature (extremely powerful magnets, high voltages) and process huge streams of data. Pushing those technologies to the limit seems like it’s going to be relevant, almost no matter what we end up doing. That’s still not putting the science first and foremost, but it feels a bit closer to an honest appraisal of what good projects like this do for the world.

Why Solving the Muon Puzzle Doesn’t Solve the Puzzle

You may have heard that the muon g-2 problem has been solved.

Muons are electrons’ heavier cousins. As spinning charged particles, they are magnetic, the strength of that magnetism characterized by a number denoted “g”. If you were to guess this number from classical physics alone, you’d conclude it should be 2, but quantum mechanics tweaks it. The leftover part, “g-2”, can be measured, and predicted, with extraordinary precision, which ought to make it an ideal test: if our current understanding of the particle physics, called the Standard Model, is subtly wrong, the difference might be noticeable there.

And for a while, it looked like such a difference was indeed noticeable. Extremely precise experiments over the last thirty years have consistently found a number slightly different from the extremely precise calculations, different enough that it seemed quite unlikely to be due to chance.

Now, the headlines are singing a different tune.

What changed?

That headline might make you think the change was an experimental result, a new measurement that changed the story. It wasn’t, though. There is a new, more precise measurement, but it agrees with the old measurements.

So the change has to be in the calculations, right? They did a new calculation, corrected a mistake or just pushed up their precision, and found that the Standard Model matches the experiment after all?

…sort of, but again, not really. The group of theoretical physicists associated with the experiment did release new, more accurate calculations. But it wasn’t the new calculations, by themselves, that made a difference. Instead, it was a shift in what kind of calculations they used…or even more specifically, what kind of calculations they trusted.

Parts of the calculation of g-2 can be done with Feynman diagrams, those photogenic squiggles you see on physicists’ blackboards. That part is very precise, and not especially controversial. However, Feynman diagrams only work well when forces between particles are comparatively weak. They’re great for electromagnetism, even better for the weak nuclear force. But for the strong nuclear force, the one that holds protons and neutrons together, you often need a different method.

For g-2, that used to be done via a “data-driven” method. Physicists measured different things, particles affected by the strong nuclear force in different ways, and used that to infer how the strong force would affect g-2. By getting a consistent picture from different experiments, they were reasonably confident that they had the right numbers.

Back in 2020, though, a challenger came to the scene, with another method. Called lattice QCD, this method involves building gigantic computer simulations of the effect of the strong force. People have been doing lattice QCD since the 1970’s, and the simulations have been getting better and better, until in 2020, a group managed to calculate the piece of the g-2 calculation that had until then been done by the data-driven method.

The lattice group found a very different result than what had been found previously. Instead of a wild disagreement with experiment, their calculation agreed. According to them, everything was fine, the muon g-2 was behaving exactly as the Standard Model predicted.

For some of us, that’s where the mystery ended. Clearly, something must be wrong with the data-driven method, not with the Standard Model. No more muon puzzle.

But the data-driven method wasn’t just a guess, it was being used for a reason. A significant group of physicists found the arguments behind it convincing. Now, there was a new puzzle: figuring out why the data-driven method and lattice QCD disagree.

Five years later, has that mystery been solved? Is that, finally, what the headlines are about?

Again, not really, no.

The theorists associated with the experiment have decided to trust lattice QCD, not the data-driven method. But they don’t know what went wrong, exactly.

Instead, they’ve highlighted cracks in the data-driven method. The way the data-driven method works, it brings together different experiments to try to get a shared picture. But that shared picture has started to fall apart. A new measurement by a different experiment doesn’t fit into the system: the data-driven method now “has tensions”, as physicists say. It’s no longer possible to combine all experiments into a shared picture they way they used to. Meanwhile, lattice QCD has gotten even better, reaching even higher precision. From the perspective of the theorists associated with the muon g-2 experiment, switching methods is now clearly the right call.

But does that mean they solved the puzzle?

If you were confident that lattice QCD is the right approach, then the puzzle was already solved in 2020. All that changed was the official collaboration finally acknowledging that.

And if you were confident that the data-driven method was the right approach, then the puzzle is even worse. Now, there are tensions within the method itself…but still no explanation of what went wrong! If you had good reasons to think the method should work, you still have those good reasons. Now you’re just…more puzzled.

I am reminded of another mystery, a few years back, when an old experiment announced a dramatically different measurement for the mass of the W boson. Then, I argued the big mystery was not how the W boson’s mass had changed (it hadn’t), but how they came to be so confident in a result so different from what others, also confidently, had found. In physics, our confidence is encoded in numbers, estimated and measured and tested and computed. If we’re not estimating that confidence correctly…then that’s the real mystery, the real puzzle. One much more important to solve.


Also, I had two more pieces out this week! In Quanta I have a short explainer about bosons and fermions, while at Ars Technica I have a piece about machine learning at the LHC. I may have a “bonus info” post on the latter at some point, I have to think about whether I have enough material for it.

Amplitudes 2025 This Week

Summer is conference season for academics, and this week held my old sub-field’s big yearly conference, called Amplitudes. This year, it was in Seoul at Seoul National University, the first time the conference has been in Asia.

(I wasn’t there, I don’t go to these anymore. But I’ve been skimming slides in my free time, to give you folks the updates you crave. Be forewarned that conference posts like these get technical fast, I’ll be back to my usual accessible self next week.)

There isn’t a huge amplitudes community in Korea, but it’s bigger than it was back when I got started in the field. Of the organizers, Kanghoon Lee of the Asia Pacific Center for Theoretical Physics and Sangmin Lee of Seoul National University have what I think of as “core amplitudes interests”, like recursion relations and the double-copy. The other Korean organizers are from adjacent areas, work that overlaps with amplitudes but doesn’t show up at the conference each year. There was also a sizeable group of organizers from Taiwan, where there has been a significant amplitudes presence for some time now. I do wonder if Korea was chosen as a compromise between a conference hosted in Taiwan or in mainland China, where there is also quite a substantial amplitudes community.

One thing that impresses me every year is how big, and how sophisticated, the gravitational-wave community in amplitudes has grown. Federico Buccioni’s talk began with a plot that illustrates this well (though that wasn’t his goal):

At the conference Amplitudes, dedicated to the topic of scattering amplitudes, there were almost as many talks with the phrase “black hole” in the title as there were with “scattering” or “amplitudes”! This is for a topic that did not even exist in the subfield when I got my PhD eleven years ago.

With that said, gravitational wave astronomy wasn’t quite as dominant at the conference as Buccioni’s bar chart suggests. There were a few talks each day on the topic: I counted seven in total, excluding any short talks on the subject in the gong show. Spinning black holes were a significant focus, central to Jung-Wook Kim’s, Andres Luna’s and Mao Zeng’s talks (the latter two showing some interesting links between the amplitudes story and classic ideas in classical mechanics) and relevant in several others, with Riccardo Gonzo, Miguel Correia, Ira Rothstein, and Enrico Herrmann’s talks showing not just a wide range of approaches, but an increasing depth of research in this area.

Herrmann’s talk in particular dealt with detector event shapes, a framework that lets physicists think more directly about what a specific particle detector or observer can see. He applied the idea not just to gravitational waves but to quantum gravity and collider physics as well. The latter is historically where this idea has been applied the most thoroughly, as highlighted in Hua Xing Zhu’s talk, where he used them to pick out particular phenomena of interest in QCD.

QCD is, of course, always of interest in the amplitudes field. Buccioni’s talk dealt with the theory’s behavior at high-energies, with a nice example of the “maximal transcendentality principle” where some quantities in QCD are identical to quantities in N=4 super Yang-Mills in the “most transcendental” pieces (loosely, those with the highest powers of pi). Andrea Guerreri’s talk also dealt with high-energy behavior in QCD, trying to address an experimental puzzle where QCD results appeared to violate a fundamental bound all sensible theories were expected to obey. By using S-matrix bootstrap techniques, they clarify the nature of the bound, finding that QCD still obeys it once correctly understood, and conjecture a weird theory that should be possible to frame right on the edge of the bound. The S-matrix bootstrap was also used by Alexandre Homrich, who talked about getting the framework to work for multi-particle scattering.

Heribertus Bayu Hartanto is another recent addition to Korea’s amplitudes community. He talked about a concrete calculation, two-loop five-particle scattering including top quarks, a tricky case that includes elliptic curves.

When amplitudes lead to integrals involving elliptic curves, many standard methods fail. Jake Bourjaily’s talk raised a question he has brought up again and again: what does it mean to do an integral for a new type of function? One possible answer is that it depends on what kind of numerics you can do, and since more general numerical methods can be cumbersome one often needs to understand the new type of function in more detail. In light of that, Stephen Jones’ talk was interesting in taking a common problem often cited with generic approaches (that they have trouble with the complex numbers introduced by Minkowski space) and finding a more natural way in a particular generic approach (sector decomposition) to take them into account. Giulio Salvatori talked about a much less conventional numerical method, linked to the latest trend in Nima-ology, surfaceology. One of the big selling points of the surface integral framework promoted by people like Salvatori and Nima Arkani-Hamed is that it’s supposed to give a clear integral to do for each scattering amplitude, one which should be amenable to a numerical treatment recently developed by Michael Borinsky. Salvatori can currently apply the method only to a toy model (up to ten loops!), but he has some ideas for how to generalize it, which will require handling divergences and numerators.

Other approaches to the “problem of integration” included Anna-Laura Sattelberger’s talk that presented a method to find differential equations for the kind of integrals that show up in amplitudes using the mathematical software Macaulay2, including presenting a package. Matthias Wilhelm talked about the work I did with him, using machine learning to find better methods for solving integrals with integration-by-parts, an area where two other groups have now also published. Pierpaolo Mastrolia talked about integration-by-parts’ up-and-coming contender, intersection theory, a method which appears to be delving into more mathematical tools in an effort to catch up with its competitor.

Sometimes, one is more specifically interested in the singularities of integrals than their numerics more generally. Felix Tellander talked about a geometric method to pin these down which largely went over my head, but he did have a very nice short description of the approach: “Describe the singularities of the integrand. Find a map representing integration. Map the singularities of the integrand onto the singularities of the integral.”

While QCD and gravity are the applications of choice, amplitudes methods germinate in N=4 super Yang-Mills. Ruth Britto’s talk opened the conference with an overview of progress along those lines before going into her own recent work with one-loop integrals and interesting implications of ideas from cluster algebras. Cluster algebras made appearances in several other talks, including Anastasia Volovich’s talk which discussed how ideas from that corner called flag cluster algebras may give insights into QCD amplitudes, though some symbol letters still seem to be hard to track down. Matteo Parisi covered another idea, cluster promotion maps, which he thinks may help pin down algebraic symbol letters.

The link between cluster algebras and symbol letters is an ongoing mystery where the field is seeing progress. Another symbol letter mystery is antipodal duality, where flipping an amplitude like a palindrome somehow gives another valid amplitude. Lance Dixon has made progress in understanding where this duality comes from, finding a toy model where it can be understood and proved.

Others pushed the boundaries of methods specific to N=4 super Yang-Mills, looking for novel structures. Song He’s talk pushes an older approach by Bourjaily and collaborators up to twelve loops, finding new patterns and connections to other theories and observables. Qinglin Yang bootstraps Wilson loops with a Lagrangian insertion, adding a side to the polygon used in previous efforts and finding that, much like when you add particles to amplitudes in a bootstrap, the method gets stricter and more powerful. Jaroslav Trnka talked about work he has been doing with “negative geometries”, an odd method descended from the amplituhedron that looks at amplitudes from a totally different perspective, probing a bit of their non-perturbative data. He’s finding more parts of that setup that can be accessed and re-summed, finding interestingly that multiple-zeta-values show up in quantities where we know they ultimately cancel out. Livia Ferro also talked about a descendant of the amplituhedron, this time for cosmology, getting differential equations for cosmological observables in a particular theory from a combinatorial approach.

Outside of everybody’s favorite theories, some speakers talked about more general approaches to understanding the differences between theories. Andreas Helset covered work on the geometry of the space of quantum fields in a theory, applying the method to a general framework for characterizing deviations from the standard model called the SMEFT. Jasper Roosmale Nepveu also talked about a general space of theories, thinking about how positivity (a trait linked to fundamental constraints like causality and unitarity) gets tangled up with loop effects, and the implications this has for renormalization.

Soft theorems, universal behavior of amplitudes when a particle has low energy, continue to be a trendy topic, with Silvia Nagy showing how the story continues to higher orders and Sangmin Choi investigating loop effects. Callum Jones talks about one of the more powerful results from the soft limit, Weinberg’s theorem showing the uniqueness of gravity. Weinberg’s proof was set up in Minkowski space, but we may ultimately live in curved, de Sitter space. Jones showed how the ideas Weinberg explored generalize in de Sitter, using some tools from the soft-theorem-inspired field of dS/CFT. Julio Parra-Martinez, meanwhile, tied soft theorems to another trendy topic, higher symmetries, a more general notion of the usual types of symmetries that physicists have explored in the past. Lucia Cordova reported work that was not particularly connected to soft theorems but was connected to these higher symmetries, showing how they interact with crossing symmetry and the S-matrix bootstrap.

Finally, a surprisingly large number of talks linked to Kevin Costello and Natalie Paquette’s work with self-dual gauge theories, where they found exact solutions from a fairly mathy angle. Paquette gave an update on her work on the topic, while Alfredo Guevara talked about applications to black holes, comparing the power of expanding around a self-dual gauge theory to that of working with supersymmetry. Atul Sharma looked at scattering in self-dual backgrounds in work that merges older twistor space ideas with the new approach, while Roland Bittelson talked about calculating around an instanton background.


Also, I had another piece up this week at FirstPrinciples, based on an interview with the (outgoing) president of the Sloan Foundation. I won’t have a “bonus info” post for this one, as most of what I learned went into the piece. But if you don’t know what the Sloan Foundation does, take a look! I hadn’t known they funded Jupyter notebooks and Hidden Figures, or that they introduced Kahneman and Tversky.

Bonus info for Reversible Computing and Megastructures

After some delay, a bonus info post!

At FirstPrinciples.org, I had a piece covering work by engineering professor Colin McInnes on stability of Dyson spheres and ringworlds. This was a fun one to cover, mostly because of how it straddles the borderline between science fiction and practical physics and engineering. McInnes’s claim to fame is work on solar sails, which seem like a paradigmatic example of that kind of thing: a common sci-fi theme that’s surprisingly viable. His work on stability was interesting to me because it’s the kind of work that a century and a half ago would have been paradigmatic physics. Now, though, very few physicists work on orbital mechanics, and a lot of the core questions have passed on to engineering. It’s fascinating to see how these classic old problems can still have undiscovered solutions, and how the people best equipped to find them now are tinkerers practicing their tools instead of cutting-edge mathematicians.

At Quanta Magazine, I had a piece about reversible computing. Readers may remember I had another piece on that topic at the end of March, a profile on the startup Vaire Computing at FirstPrinciples.org. That piece talked about FirstPrinciples, but didn’t say much about reversible computing. I figured I’d combine the “bonus info” for both posts here.

Neither piece went into much detail about the engineering involved, as it didn’t really make sense in either venue. One thing that amused me a bit is that the core technology that drove Vaire into action is something that actually should be very familiar to a physics or engineering student: a resonator. Theirs is obviously quite a bit more sophisticated than the base model, but at its heart it’s doing the same thing: storing charge and controlling frequency. It turns out that those are both essential to making reversible computers work: you need to store charge so it isn’t lost to ground when you empty a transistor, and you need to control the frequency so you can have waves with gentle transitions instead of the more sharp corners of the waves used in normal computers, thus wasting less heat in rapid changes of voltage. Vaire recently announced they’re getting 50% charge recovery from their test chips, and they’re working on raising that number.

Originally, the Quanta piece was focused more on reversible programming than energy use, as the energy angle seemed a bit more physics-focused than their computer science desk usually goes. The emphasis ended up changing as I worked on the draft, but it meant that an interesting parallel story got lost on the cutting-room floor. There’s a community of people who study reversible computing not from the engineering side, but from the computer science side, studying reversible logic and reversible programming languages. It’s a pursuit that goes back to the 1980’s, where at Caltech around when Feynman was teaching his course on the physics of computing a group of students were figuring out how to set up a reversible programming language. Called Janus, they sent their creation to Landauer, and the letter ended up with Michael Frank after Landauer died. There’s a lovely quote from it regarding their motivation: “We did it out of curiosity over whether such an odd animal as this was possible, and because we were interested in knowing where we put information when we programmed. Janus forced us to pay attention to where our bits went since none could be thrown away.”

Being forced to pay attention to information, in turn, is what has animated the computer science side of the reversible computing community. There are applications to debugging, where you can run code backwards when it gets stuck, to encryption and compression, where you want to be able to recover the information you hid away, and to security, where you want to keep track of information to make sure a hacker can’t figure out things they shouldn’t. Also, for a lot of these people, it’s just a fun puzzle. Early on my attention was caught by a paper by Hannah Earley describing a programming language called Alethe, a word you might recognize from the Greek word for truth, which literally means something like “not-forgetting”.

(Compression is particularly relevant for the “garbage data” you need to output in a reversible computation. If you want to add two numbers reversibly, naively you need to keep both input numbers and their output, but you can be more clever than that and just keep one of the inputs since you can subtract to find the other. There are a lot of substantially more clever tricks in this vein people have figured out over the years.)

I didn’t say anything about the other engineering approaches to reversible computing, that try to do something outside of traditional computer chips. There’s DNA computing, which tries to compute with a bunch of DNA in solution. There’s the old concept of ballistic reversible computing, where you imagine a computer that runs like a bunch of colliding billiard balls, conserving energy. Coordinating such a computer can be a nightmare, and early theoretical ideas were shown to be disrupted by something as tiny as a few stray photons from a distant star. But people like Frank figured out ways around the coordination problem, and groups have experimented with superconductors as places to toss those billiard balls around. The early billiard-inspired designs also had a big impact on quantum computing, where you need reversible gates and the only irreversible operation is the measurement. The name “Toffoli” comes up a lot in quantum computing discussions, I hadn’t known before this that Toffoli gates were originally for reversible computing in general, not specifically quantum computing.

Finally, I only gestured at the sci-fi angle. For reversible computing’s die-hards, it isn’t just a way to make efficient computers now. It’s the ultimate future of the technology, the kind of energy-efficiency civilization will need when we’re covering stars with shells of “computronium” full of busy joyous artificial minds.

And now that I think about it, they should chat with McInnes. He can tell them the kinds of stars they should build around.

Branching Out, and Some Ground Rules

In January, my time at the Niels Bohr Institute ended. Instead of supporting myself by doing science, as I’d done the last thirteen or so years, I started making a living by writing, doing science journalism.

That work picked up. My readers here have seen a few of the pieces already, but there are lots more in the pipeline, getting refined by editors or waiting to be published. It’s given me a bit of income, and a lot of visibility.

That visibility, in turn, has given me new options. It turns out that magazines aren’t the only companies interested in science writing, and journalism isn’t the only way to write for a living. Companies that invest in science want a different kind of writing, one that builds their reputation both with the public and with the scientific community. And as I’ve discovered, if you have enough of a track record, some of those companies will reach out to you.

So I’m branching out, from science journalism to science communications consulting, advising companies how to communicate science. I’ve started working with an exciting client, with big plans for the future. If you follow me on LinkedIn, you’ll have seen a bit about who they are and what I’ll be doing for them.

Here on the blog, I’d like to maintain a bit more separation. Blogging is closer to journalism, and in journalism, one ought to be careful about conflicts of interest. The advice I’ve gotten is that it’s good to establish some ground rules, separating my communications work from my journalistic work, since I intend to keep doing both.

So without further ado, my conflict of interest rules:

  • I will not write in a journalistic capacity about my consulting clients, or their direct competitors.
  • I will not write in a journalistic capacity about the technology my clients are investing in, except in extremely general terms. (For example, most businesses right now are investing in AI. I’ll still write about AI in general, but not about any particular AI technologies my clients are pursuing.)
  • I will more generally maintain a distinction between areas I cover journalistically and areas where I consult. Right now, this means I avoid writing in a journalistic capacity about:
    • Health/biomedical topics
    • Neuroscience
    • Advanced sensors for medical applications

I plan to update these rules over time as I get a better feeling for what kinds of conflict of interest risks I face and what my clients are comfortable with. I now have a Page for this linked in the top menu, clients and editors can check there to see my current conflict of interest rules.

In Scientific American, With a Piece on Vacuum Decay

I had a piece in Scientific American last week. It’s paywalled, but if you’re a subscriber there you can see it, or you can buy the print magazine.

(I also had two pieces out in other outlets this week. I’ll be saying more about them…in a couple weeks.)

The Scientific American piece is about an apocalyptic particle physics scenario called vacuum decay. It’s a topic I covered last year in Quanta Magazine, an unlikely event where the Higgs field which gives fundamental particles their mass changes value, suddenly making all other particles much more massive and changing physics as we know it. It’s a change that physicists think would start as a small bubble and spread at (almost) the speed of light, covering the universe.

What I wrote for Quanta was a short news piece covering a small adjustment to the calculation, one that made the chance of vacuum decay slightly more likely. (But still mind-bogglingly small, to be clear.)

Scientific American asked for a longer piece, and that gave me space to dig deeper. I was able to say more about how vacuum decay works, with a few metaphors that I think should make it a lot easier to understand. I also got to learn about some new developments, in particular, an interesting story about how tiny primordial black holes could make vacuum decay dramatically more likely.

One thing that was a bit too complicated to talk about were the puzzles involved in trying to calculate these chances. In the article, I mention a calculation of the chance of vacuum decay by a team including Matthew Schwartz. That calculation wasn’t the first to estimate the chance of vacuum decay, and it’s not the most recent update either. Instead, I picked it because Schwartz’s team approached the question in what struck me as a more reliable way, trying to cut through confusion by asking the most basic question you can in a quantum theory: given that now you observe X, what’s the chance that later you observe Y? Figuring out how to turn vacuum decay into that kind of question correctly is tricky (for example, you need to include the possibility that vacuum decay happens, then reverses, then happens again).

The calculations of black holes speeding things up didn’t work things out in quite as much detail. I like to think I’ve made a small contribution by motivating them to look at Schwartz’s work, which might spawn a more rigorous calculation in future. When I talked to Schwartz, he wasn’t even sure whether the picture of a bubble forming in one place and spreading at light speed is correct: he’d calculated the chance of the initial decay, but hadn’t found a similarly rigorous way to think about the aftermath. So even more than the uncertainty I talk about in the piece, the questions about new physics and probability, there is even some doubt about whether the whole picture really works the way we’ve been imagining it.

That makes for a murky topic! But it’s also a flashy one, a compelling story for science fiction and the public imagination, and yeah, another motivation to get high-precision measurements of the Higgs and top quark from future colliders! (If maybe not quite the way this guy said it.)

Publishing Isn’t Free, but SciPost Makes It Cheaper

I’ve mentioned SciPost a few times on this blog. They’re an open journal in every sense you could think of: diamond open-access scientific publishing on an open-source platform, run with open finances. They even publish their referee reports. They’re aiming to cover not just a few subjects, but a broad swath of academia, publishing scientists’ work in the most inexpensive and principled way possible and challenging the dominance of for-profit journals.

And they’re struggling.

SciPost doesn’t charge university libraries for access, they let anyone read their articles for free. And they don’t charge authors Article Processing Charges (or APCs), they let anyone publish for free. All they do is keep track of which institutions those authors are affiliated with, calculate what fraction of their total costs comes from them, and post it in a nice searchable list on their website.

And amazingly, for the last nine years, they’ve been making that work.

SciPost encourages institutions to pay their share, mostly by encouraging authors to bug their bosses until they do. SciPost will also quite happily accept more than an institution’s share, and a few generous institutions do just that, which is what has kept them afloat so far. But since nothing compels anyone to pay, most organizations simply don’t.

From an economist’s perspective, this is that most basic of problems, the free-rider problem. People want scientific publication to be free, but it isn’t. Someone has to pay, and if you don’t force someone to do it, then the few who pay will be exploited by the many who don’t.

There’s more worth saying, though.

First, it’s worth pointing out that SciPost isn’t paying the same cost everyone else pays to publish. SciPost has a stripped-down system, without any physical journals or much in-house copyediting, based entirely on their own open-source software. As a result, they pay about 500 euros per article. Compare this to the fees negotiated by particle physics’ SCOAP3 agreement, which average to closer to 1000 euros, and realize that those fees are on the low end: for-profit journals tend to make their APCs higher in order to, well, make a profit.

(By the way, while it’s tempting to think of for-profit journals as greedy, I think it’s better to think of them as not cost-effective. Profit is an expense, like the interest on a loan: a payment to investors in exchange for capital used to set up the business. The thing is, online journals don’t seem to need that kind of capital, especially when they’re based on code written by academics in their spare time. So they can operate more cheaply as nonprofits.)

So when an author publishes in SciPost instead of a journal with APCs, they’re saving someone money, typically their institution or their grant. This would happen even if their institution paid their share of SciPost’s costs. (But then they would pay something rather than nothing, hence free-rider problem.)

If an author instead would have published in a closed-access journal, the kind where you have to pay to read the articles and university libraries pay through the nose to get access? Then you don’t save any money at all, your library still has to pay for the journal. You only save money if everybody at the institution stops using the journal. This one is instead a collective action problem.

Collective action problems are hard, and don’t often have obvious solutions. Free-rider problems do suggest an obvious solution: why not just charge?

In SciPost’s case, there are philosophical commitments involved. Their desire to attribute costs transparently and equally means dividing a journal’s cost among all its authors’ institutions, a cost only fully determined at the end of the year, which doesn’t make for an easy invoice.

More to the point, though, charging to publish is directly against what the Open Access movement is about.

That takes some unpacking, because of course, someone does have to pay. It probably seems weird to argue that institutions shouldn’t have to pay charges to publish papers…instead, they should pay to publish papers.

SciPost itself doesn’t go into detail about this, but despite how weird it sounds when put like I just did, there is a difference. Charging a fee to publish means that anyone who publishes needs to pay a fee. If you’re working in a developing country on a shoestring budget, too bad, you have to pay the fee. If you’re an amateur mathematician who works in a truck stop and just puzzled through something amazing, too bad, you have to pay the fee.

Instead of charging a fee, SciPost asks for support. I have to think that part of the reason is that they want some free riders. There are some people who would absolutely not be able to participate in science without free riding, and we want their input nonetheless. That means to support them, others need to give more. It means organizations need to think about SciPost not as just another fee, but as a way they can support the scientific process as a whole.

That’s how other things work, like the arXiv. They get support from big universities and organizations and philanthropists, not from literally everyone. It seems a bit weird to do that for a single scientific journal among many, though, which I suspect is part of why institutions are reluctant to do it. But for a journal that can save money like SciPost, maybe it’s worth it.

Post on the Weak Gravity Conjecture for FirstPrinciples.org

I have another piece this week on the FirstPrinciples.org Hub. If you’d like to know who they are, I say a bit about my impressions of them in my post on the last piece I had there. They’re still finding their niche, so there may be shifts in the kind of content they cover over time, but for now they’ve given me an opportunity to cover a few topics that are off the beaten path.

This time, the piece is what we in the journalism biz call an “explainer”. Instead of interviewing people about cutting-edge science, I wrote a piece to explain an older idea. It’s an idea that’s pretty cool, in a way I think a lot of people can actually understand: a black hole puzzle that might explain why gravity is the weakest force. It’s an idea that’s had an enormous influence, both in the string theory world where it originated and on people speculating more broadly about the rules of quantum gravity. If you want to learn more, read the piece!

Since I didn’t interview anyone for this piece, I don’t have the same sort of “bonus content” I sometimes give. Instead of interviewing, I brushed up on the topic, and the best resource I found was this review article written by Dan Harlow, Ben Heidenreich, Matthew Reece, and Tom Rudelius. It gave me a much better idea of the subtleties: how many different ways there are to interpret the original conjecture, and how different attempts to build on it reflect on different facets and highlight different implications. If you are a physicist curious what the whole thing is about, I recommend reading that review: while I try to give a flavor of some of the subtleties, a piece for a broad audience can only do so much.

There Is No Shortcut to Saying What You Mean

Blogger Andrew Oh-Willeke of Dispatches from Turtle Island pointed me to an editorial in Science about the phrase scientific consensus.

The editorial argues that by referring to conclusions like the existence of climate change or vaccine safety as “the scientific consensus”, communicators have inadvertently fanned the flames of distrust. By emphasizing agreement between scientists, the phrase “scientific consensus” leaves open the question of how that consensus was reached. More conspiracy-minded people imagine shady backroom deals and corrupt payouts, while the more realistic blame incentives and groupthink. If you disagree with “the scientific consensus”, you may thus decide the best way forward is to silence those pesky scientists.

(The link to current events is left as an exercise to the reader, to comment on elsewhere. As usual, please no explicit discussion of politics on this blog!)

Instead of “scientific consensus”, the editorial suggests another term, convergence of evidence. The idea is that by centering the evidence instead of the scientists, the phrase would make it clear that these conclusions are justified by something more than social pressures, and will remain even if the scientists promoting them are silenced.

Oh-Willeke pointed me to another blog post responding to the editorial, which has a nice discussion of how the terms were used historically, showing their popularity over time. “Convergence of evidence” was more popular in the 1950’s, with a small surge in the late 90’s and early 2000’s. “Scientific consensus” rose in the 1980’s and 90’s, lining up with a time when social scientists were skeptical about science’s objectivity and wanted to explore the social reasons why scientists come to agreement. It then fell around the year 2000, before rising again, this time used instead by professional groups of scientists to emphasize their agreement on issues like climate change.

(The blog post then goes on to try to motivate the word “consilience” instead, on the rather thin basis that “convergence of evidence” isn’t interdisciplinary enough, which seems like a pretty silly objection. “Convergence” implies coming in from multiple directions, it’s already interdisciplinary!)

I appreciate “convergence of evidence”, it seems like a useful phrase. But I think the editorial is working from the wrong perspective, in trying to argue for which terms “we should use” in the first place.

Sometimes, as a scientist or an organization or a journalist, you want to emphasize evidence. Is it “a preponderance of evidence”, most but not all? Is it “overwhelming evidence”, evidence so powerful it is unlikely to ever be defeated? Or is it a “convergence of evidence”, evidence that came in slowly from multiple paths, each independent route making a coincidence that much less likely?

But sometimes, you want to emphasize the judgement of the scientists themselves.

Sometimes when scientists agree, they’re working not from evidence but from personal experience: feelings of which kinds of research pan out and which don’t, or shared philosophies that sit deep in how they conceive their discipline. Describing physicists’ reasons for expecting supersymmetry before the LHC turned on as a convergence of evidence would be inaccurate. Describing it as having been a (not unanimous) consensus gets much closer to the truth.

Sometimes, scientists do have evidence, but as a journalist, you can’t evaluate its strength. You note some controversy, you can follow some of the arguments, but ultimately you have to be honest about how you got the information. And sometimes, that will be because it’s what most of the responsible scientists you talked to agreed on: scientific consensus.

As science communicators, we care about telling the truth (as much as we ever can, at any rate). As a result, we cannot adopt blanket rules of thumb. We cannot say, “we as a community are using this term now”. The only responsible thing we can do is to think about each individual word. We need to decide what we actually mean, to read widely and learn from experience, to find which words express our case in a way that is both convincing and accurate. There’s no shortcut to that, no formula where you just “use the right words” and everything turns out fine. You have to do the work, and hope it’s enough.

Experiments Should Be Surprising, but Not Too Surprising

People are talking about colliders again.

This year, the European particle physics community is updating its shared plan for the future, the European Strategy for Particle Physics. A raft of proposals at the end of March stirred up a tail of public debate, focused on asking what sort of new particle collider should be built, and discussing potential reasons why.

That discussion, in turn, has got me thinking about experiments, and how they’re justified.

The purpose of experiments, and of science in general, is to learn something new. The more sure we are of something, the less reason there is to test it. Scientists don’t check whether the Sun rises every day. Like everyone else, they assume it will rise, and use that knowledge to learn other things.

You want your experiment to surprise you. But to design an experiment to surprise you, you run into a contradiction.

Suppose that every morning, you check whether the Sun rises. If it doesn’t, you will really be surprised! You’ll have made the discovery of the century! That’s a really exciting payoff, grant agencies should be lining up to pay for…

Well, is that actually likely to happen, though?

The same reasons it would be surprising if the Sun stopped rising are reasons why we shouldn’t expect the Sun to stop rising. A sunrise-checking observatory has incredibly high potential scientific reward…but an absurdly low chance of giving that reward.

Ok, so you can re-frame your experiment. You’re not hoping the Sun won’t rise, you’re observing the sunrise. You expect it to rise, almost guaranteed, so your experiment has an almost guaranteed payoff.

But what a small payoff! You saw exactly what you expected, there’s no science in that!

By either criterion, the “does the Sun rise” observatory is a stupid experiment. Real experiments operate in between the two extremes. They also mix motivations. Together, that leads to some interesting tensions.

What was the purpose of the Large Hadron Collider?

There were a few things physicists were pretty sure of, when they planned the LHC. Previous colliders had measured W bosons and Z bosons, and their properties made it clear that something was missing. If you could collide protons with enough energy, physicists were pretty sure you’d see the missing piece. Physicists had a reasonably plausible story for that missing piece, in the form of the Higgs boson. So physicists could be pretty sure they’d see something, and reasonably sure it would be the Higgs boson.

If physicists expected the Higgs boson, what was the point of the experiment?

First, physicists expected to see the Higgs boson, but they didn’t expect it to have the mass that it did. In fact, they didn’t know anything about the particle’s mass, besides that it should be low enough that the collider could produce it, and high enough that it hadn’t been detected before. The specific number? That was a surprise, and an almost-inevitable one. A rare creature, an almost-guaranteed scientific payoff.

I say almost, because there was a second point. The Higgs boson didn’t have to be there. In fact, it didn’t have to exist at all. There was a much bigger potential payoff, of noticing something very strange, something much more complicated than the straightforward theory most physicists had expected.

(Many people also argued for another almost-guaranteed payoff, and that got a lot more press. People talked about finding the origin of dark matter by discovering supersymmetric particles, which they argued was almost guaranteed due to a principle called naturalness. This is very important for understanding the history…but it’s an argument that many people feel has failed, and that isn’t showing up much anymore. So for this post, I’ll leave it to the side.)

This mix, of a guaranteed small surprise and the potential for a very large surprise, was a big part of what made the LHC make sense. The mix has changed a bit for people considering a new collider, and it’s making for a rougher conversation.

Like the LHC, most of the new collider proposals have a guaranteed payoff. The LHC could measure the mass of the Higgs, these new colliders will measure its “couplings”: how strongly it influences other particles and forces.

Unlike the LHC, though, this guarantee is not a guaranteed surprise. Before building the LHC, we did not know the mass of the Higgs, and we could not predict it. On the other hand, now we absolutely can predict the couplings of the Higgs. We have quite precise numbers, our expectation for what they should be based on a theory that so far has proven quite successful.

We aren’t certain, of course, just like physicists weren’t certain before. The Higgs boson might have many surprising properties, things that contradict our current best theory and usher in something new. These surprises could genuinely tell us something about some of the big questions, from the nature of dark matter to the universe’s balance of matter and antimatter to the stability of the laws of physics.

But of course, they also might not. We no longer have that rare creature, a guaranteed mild surprise, to hedge in case the big surprises fail. We have guaranteed observations, and experimenters will happily tell you about them…but no guaranteed surprises.

That’s a strange position to be in. And I’m not sure physicists have figured out what to do about it.