Monthly Archives: July 2023

It’s Only a Model

Last week, I said that the current best estimate for the age of the universe, 13.8 billion years, is based on a mathematical model. In order to get that number, astronomers had to assume the universe evolved in a particular way, according to a model where the universe is composed of ordinary matter, dark matter, and dark energy. In other words, the age of the universe is a model-dependent statement.

Reading that, you might ask whether we can do better. What about a model-independent measurement of the age of the universe?

As intuitive as it might seem, we can’t actually do that. In fact, if we’re really strict about it, we can’t get a model-independent measurement of anything at all. Everything is based on a model.

Imagine stepping on your bathroom scale and getting a reading in kilograms. The number it gives you seems as objective as anything. But to get that number, you have to trust that a number of models are true. You have to model gravity: to assume that the scale’s measurement of your weight (a force) gives you the right mass, because the Earth’s surface gravity is approximately constant. You have to model the circuits and sensors in the scale, and be confident that you understand how they’re supposed to work. You have to model people: to assume that the company that made the scale tested it accurately, and that the people who sold it to you didn’t lie about where it came from. And finally, you have to model error: you know that the scale can’t possibly give you your exact weight, so you need a rough idea of just how far off it can reasonably be.
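To make the gravity part concrete, here is the conversion the scale implicitly performs, as a worked example (the specific numbers are just illustrative):

$$ m = \frac{F}{g}, \qquad g \approx 9.81\ \mathrm{m/s^2}, $$

so a measured force of roughly 687 newtons gets reported as about 70 kilograms. And since g actually varies by about half a percent between the equator and the poles, even the “constant surface gravity” assumption is a small approximation baked into the number on the display.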

Everything we know is like this. Every measurement in science builds on past science, on our understanding of our measuring equipment and our trust in others. Everything in our daily lives comes through a network of assumptions about the world around us. Everything we perceive is filtered through instincts, our understanding of our own senses and knowledge of when they do and don’t trick us.

Ok, but when I say that the age of the universe is model-dependent, I don’t really mean it like that, right?

Everything we know is model-dependent, but only some model-dependence is worth worrying about. Your knowledge of your bathroom scale comes from centuries-old physics of gravity, widely-applied principles of electronics, and a trust in the function of basic products that serves you well in every other aspect of your life. The models that knowledge depends on aren’t really in question, especially not when you just want to measure your weight.

Some measurements we make in physics are like this too. When the experimental collaborations at the LHC measured the Higgs mass, they were doing something far from routine. But the models they based that measurement on, models of particle physics and particle detector electronics and their own computer code, are still so well-tested that it mostly doesn’t make sense to think of this as a model-dependent measurement. If we’re questioning the Higgs mass, it’s only because we’re questioning something much bigger.

The age of the universe, though, is trickier. Our most precise measurements are based on a specific model: we estimate what the universe is made of and how fast it’s expanding, plug it into our model of how the universe changes over time, and get an estimate for the age. You might suggest that we should just look out into the universe and find the oldest star, but that’s model-dependent too. Stars don’t have rings like trees. Instead, to estimate the age of a star we have to have some model for what kind of light it emits, and for how that light has changed over the history of the universe before it reached us.
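To give a sense of what “plug it into our model” means in practice, here is a minimal sketch (in Python, and emphatically not the real analysis pipeline): assume a flat universe made of matter and dark energy, and integrate the expansion history back to the beginning. The parameter values are illustrative, roughly Planck-like.

```python
# A minimal sketch of the model-dependent age estimate: integrate the
# Friedmann equation for a flat universe of matter plus dark energy.
# Parameter values are illustrative, roughly Planck-like.
import numpy as np
from scipy.integrate import quad

H0_km_s_Mpc = 67.4           # Hubble constant (assumed value)
Omega_m = 0.315              # matter fraction today (assumed value)
Omega_L = 1.0 - Omega_m      # dark energy fraction, assuming flatness

# Convert H0 to units of 1/Gyr (1 Mpc ~ 3.086e19 km, 1 Gyr ~ 3.156e16 s)
H0 = H0_km_s_Mpc / 3.086e19 * 3.156e16

def H(a):
    """Expansion rate as a function of the scale factor a (a = 1 today)."""
    return H0 * np.sqrt(Omega_m / a**3 + Omega_L)

# Age of the universe: integrate da / (a * H(a)) from the Big Bang to today.
age, _ = quad(lambda a: 1.0 / (a * H(a)), 1e-10, 1.0)
print(f"Age of the universe: {age:.1f} billion years")  # ~13.8 for these inputs
```

Change the assumed ingredients, and the answer changes with them: that is all “model-dependent” means here.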

These models are not quite as well-established as the models behind particle physics, let alone those behind your bathroom scale. Our models of stars are pretty good, applied to many types of stars in many different galaxies, but stars are big, complicated systems involving many kinds of extreme and hard-to-estimate physics. Star models get revised all the time, usually in minor ways but occasionally in more dramatic ones. Meanwhile, our model of the whole universe is powerful, but by its very nature much less tested. We can test it on observations of the whole universe today, or on observations of the whole universe in the past (like the cosmic microwave background), and it works well for these, better than any other model. But it’s not inconceivable, not unrealistic, and above all not out of the question that another model could take its place. And if it did, many of the model-dependent measurements we’ve based on it would have to change.

So that’s why, while everything we know is model-dependent, some things are model-dependent in a more important way. Some things, even if we feel they have solid backing, may well turn out to be wrong, in a way that we have reason to take seriously. The age of the universe is pretty well-established as these things go, but it’s still one of those things: there is enough doubt in our model that we can’t just take the measurement at face value.

Small Shifts for Specificity

Cosmologists are annoyed at a recent spate of news articles claiming the universe is 26.7 billion years old (rather than the 13.8 billion based on the current best measurements). To some of the science-reading public, the news sounds like a confirmation of hints they’d already heard: about an ancient “Methuselah” star that seemed to be older than the universe (later estimates put it younger), and recent observations from the James Webb Space Telescope of early galaxies that look older than they ought.

The news doesn’t come from a telescope, though, or a new observation of the sky. Instead, it comes from this press release from the University of Ottawa: “Reinventing cosmology: uOttawa research puts age of universe at 26.7 — not 13.7 — billion years”.

(If you look, you’ll find many websites copying this press release almost word-for-word. This is pretty common in science news, where some websites simply aggregate press releases and others base most of their science news on them rather than paying enough for actual journalism.)

The press release, in turn, is talking about a theory, not an observation. The theorist, Rajendra Gupta, was motivated by examples like the early galaxies observed by JWST and the Methuselah star. Since the 13.8 billion year age of the universe is based on a mathematical model, he tried to find a different mathematical model that led to an older universe. Eventually, by hypothesizing what seems like every unproven physics effect he could think of, he found one that gives a different estimate: 26.7 billion years. He probably wasn’t the first person to do this. Coming up with different models to explain odd observations is a standard thing cosmologists do all the time, and until one of those models is shown to explain a wider range of observations (our best theories explain a lot, so they’re hard to replace), it’s treated as speculation, not newsworthy science.

This is a pretty clear case of hype, and as such most of the discussion has been about what went wrong. Should we blame the theorist? The university? The journalists? Elon Musk?

Rather than blame, I think it’s more productive to offer advice. And in this situation, the person I think could use some advice is the person who wrote the press release.

So suppose you work for a university, writing their press releases. One day, you hear that one of your professors has done something very cool, something worthy of a press release: they’ve found a new estimate for the age of the universe. What do you do?

One thing you absolutely shouldn’t do is question the science. That just isn’t your job, and even if it were, you don’t have the expertise to do it. Anyone who’s hoping that you will only write articles about good science and never bad science is being unrealistic: that’s just not an option.

If you can’t be more accurate, though, you can still be more precise. You can write your article, and in particular your headline, so that you express what you do know as clearly and specifically as possible.

(I’m assuming here you write your own headlines. This is not normal in journalism, where most headlines are written by an editor, not by the writer of a piece. But university press offices are small enough that I’m assuming, perhaps incorrectly, that you can choose how to title your piece.)

Let’s take a look at the title, “Reinventing cosmology: uOttawa research puts age of universe at 26.7 — not 13.7 — billion years”, and see if we can make some small changes to improve it.

One very general word in that title is “research”. Lots of people do research: astronomers do research when they collect observations, theorists do research when they make new models. If you say “research”, some people will think you’re reporting a new observation, a new measurement that gives a radically different age for the universe.

But you know that’s not true, it’s not what the scientist you’re talking to is telling you. So to avoid the misunderstanding, you can get a bit more specific, and replace the word “research” with a more precise one: “Reinventing cosmology: uOttawa theory puts age of universe at 26.7 — not 13.7 — billion years”.

“Theory” is just as familiar a word as “research”. You won’t lose clicks, you won’t confuse people. But now, you’ve closed off a big potential misunderstanding. By a small shift, you’ve gotten a lot clearer. And you didn’t need to question the science to do it!

You can do more small shifts, if you understand a bit more of the science. “Puts” is kind of ambiguous: a theory could put an age somewhere because it computes it from first principles, or because it dialed some parameter to get there. Here, the theory was intentionally chosen to give an older universe, so the title should hint at this in some way. Instead of “puts”, then, you can use “allows”: “Reinventing cosmology: uOttawa theory allows age of universe to be 26.7 — not 13.7 — billion years”.

These kinds of little tricks can be very helpful. If you’re trying to avoid being misunderstood, then it’s good to be as specific as you can, given what you understand. If you do it carefully, you don’t have to question your scientists’ ideas or downplay their contributions. You can do your job, promote your scientists, and still contribute to responsible journalism.

What RIBs Could Look Like

The journal Nature recently published an opinion piece about a new concept for science funding called Research Impact Bonds (or RIBs).

Normally, when a government funds something, they can’t be sure it will work. They pay in advance, and have to guess whether a program will do what they expect, or whether a project will finish on time. Impact bonds are a way for them to pay afterwards, so they only pay for projects that actually deliver. Instead, the projects are funded by private investors, who buy “impact bonds” that guarantee them a share of government funding if the project is successful. Here’s an example given in the Nature piece:

For instance, say the Swiss government promises to pay up to one million Swiss francs (US$1.1 million) to service providers that achieve a measurable outcome, such as reducing illiteracy in a certain population by 5%, within a specified number of years. A broker finds one or more service providers that think they can achieve this at a cost of, say, 900,000 francs, as well as investors who agree to pay these costs up front — thus taking on the risk of the project — for a potential 10% gain if successful. If the providers achieve their goals, the government pays 990,000 francs: 900,000 francs for the work and a 90,000-franc investment return. If the project does not succeed, the investors lose their money, but the government does not.

The author of the piece, Michael Hill, thinks that this could be a new way for governments to fund science. In his model, scientists would apply to the government to propose new RIBs. The projects would have to have specific goals and time-frames: “measure the power of this cancer treatment to this accuracy in five years”, for example. If the government thinks the goal is valuable, they commit to paying some amount of money if the goal is reached. Then investors can decide whether the investment is worthwhile. The projects they expect to work get investor money, and if they do end up working the investors get government money. The government only has to pay if the projects work, but the scientists get paid regardless.

Ok, what’s the catch?

One criticism I’ve seen is that this kind of model could only work for very predictable research, maybe even just for applied research. While the author admits RIBs would only be suitable for certain sorts of projects, I think the range is wider than you might think. The project just has to have a measurable goal by a specified end date. Many particle physics experiments work that way: a dark matter detector, for instance, is trying to either rule out or detect dark matter to a certain level of statistical power within a certain run time. Even “discovery” machines, which we build to try to discover the unexpected, usually have this kind of goal: a bigger version of the LHC, for instance, might try to measure the coupling of Higgs bosons to a certain accuracy.

There are a few bigger issues with this model, though. If you go through the math in Hill’s example, you’ll notice that if the project works, the government ends up paying 990,000 Swiss francs for a service that only cost the provider 900,000 Swiss francs. Under a normal system, the government would only have had to pay 900,000. This is compensated by the fact that not every project works, so the government only pays for some projects and not others. But investors will be aware of this, and that means the government can’t offer too many unrealistic RIBs: the greater the risk investors are going to take, the more return they’ll expect. On average, then, the government would have to pay about as much as they would normally: the cost of the projects that succeed, plus enough money to cover the risk that some fail. (In fact, they’d probably pay a bit more, to give the investors a return on the investment.)
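To see the logic with numbers, here is a small sketch (in Python, using the figures from Hill’s example and a made-up success probability, just for illustration):

```python
# A minimal sketch of the average cost of a RIB-style contract, using the
# figures from Hill's example and a hypothetical success probability.
cost = 900_000       # what the work costs the provider, in francs
p_success = 0.9      # hypothetical chance the project succeeds

# Investors pay the cost up front and are only repaid on success, so to
# break even on average their promised return r must satisfy p * (1 + r) >= 1.
r_break_even = 1 / p_success - 1
payout_if_success = cost * (1 + r_break_even)

# The government pays nothing on failure and the full payout on success,
# so its average cost per funded project is just the cost of the work.
expected_cost = p_success * payout_if_success

print(f"break-even return: {r_break_even:.1%}")                  # ~11.1%
print(f"payout on success: {payout_if_success:,.0f} francs")     # ~1,000,000
print(f"average cost per project: {expected_cost:,.0f} francs")  # ~900,000
# Any return above break-even (so investors actually profit) pushes the
# government's average cost above the 900,000 francs it would pay directly.
```

At break-even, the government’s average spending is the same as if it had just paid for every project up front; any actual profit for the investors comes on top of that.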

So the government typically won’t save money, at least not if they want to fund the same amount of research. Instead, the idea is that they will avoid risk. But it’s not at all clear to me that the type of risk they avoid is one they want to.

RIBs might appeal to voters: it might sound only fair that a government only funds the research that actually works. That’s not really a problem for the government itself, though: because governments usually pay for many small projects, they still get roughly as much success overall as they want, they just don’t get to pick where. Instead, RIBs expose the government agency to a much bigger risk, the risk of unexpected success. As part of offering RIBs, the government would have to estimate how much money they would be able to pay when the projects end. They would want to fund enough projects so that, on average, they pay that amount of money. (Otherwise, they’d end up funding science much less than they do now!) But if the projects work out better than expected, then they’d have to pay much more than they planned. And government science agencies usually can’t do this. In many countries, they can’t plan far in advance at all: their budgets get decided by legislators year to year, and delays in decisions mean delays in funding. If an agency offered RIBs that were more successful than expected, they’d either have to cut funding somewhere else (probably firing a lot of people), or just default on their RIBs, weakening the concept for the next time they used them. These risks, unlike the risk of individual experiments not working, are risks that can really hurt government agencies.

Impact bonds typically have another advantage, in that they spread out decision-making. The Swiss government in Hill’s example doesn’t have to figure out which service providers can increase literacy, or how much it will cost them: it just puts up a budget, and lets investors and service providers figure out if they can make it work. This also serves as a hedge against corruption. If the government made the decisions, they might distribute funding for unrelated political reasons or even out of straight-up bribery. They’d also have to pay evaluators to figure things out. Investors won’t take bribes to lose money, so in theory would be better at choosing projects that will actually work, and would have a vested interest in paying for a good investigation.

This advantage doesn’t apply to Hill’s model of RIBs, though. In Hill’s model, scientists still need to apply to the government to decide which of their projects get offered as RIBs, so the government still needs to decide which projects are worth investing in. Then the scientists or the government need to take another step, and convince investors. The scientists in this equation effectively have to apply twice, which anyone who has applied for a government grant will realize is quite a lot of extra time and effort.

So overall, I don’t think Hill’s model of RIBs is useful, even for the purpose he imagines. It’s too risky for government science agencies to commit to payments like that, and it generates more, not less, work for scientists and the agency.

Hill’s model, though, isn’t the only way RIBs can work. And “avoiding risk” isn’t the only reason we might want them. There are two other reasons one might want RIBs, with very different-sounding motivations.

First, you might be pessimistic about mainstream science. Maybe you think scientists are making bad decisions, choosing ideas that either won’t pan out or won’t have sufficient impact, based more on fashion than on careful thought. You want to incentivize them to do better, to try to work out what impact they might have with some actual numbers and stand by their judgement. If that’s your perspective, you might be interested in RIBs for the same reason other people are interested in prediction markets: by getting investors involved, you have people willing to pay for an accurate estimate.

Second, you might instead be optimistic about mainstream science. You think scientists are doing great work, work that could have an enormous impact, but they don’t get to “capture that value”. Some projects might be essential to important, well-funded goals, but languish unrewarded. Others won’t see their value until long in the future, or will do so in unexpected ways. If scientists could fund projects based on their future impact, with RIBs, maybe they could fund more of this kind of work.

(I first started thinking about this perspective due to a talk by Sabrina Pasterski. The talk itself offended a lot of people, and had some pretty impractical ideas, like selling NFTs of important physics papers. But I think one part of the perspective, that scientists have more impact than we think, is worth holding on to.)

If you have either of those motivations, Hill’s model won’t help. But another kind of model perhaps could. Unlike Hill’s, it could fund much more speculative research, ideas where we don’t know the impact until decades down the line. To demonstrate, I’ll show how it could fund some very speculative research: the work of Peter van Nieuwenhuizen.

Peter van Nieuwenhuizen is one of the pioneers of the theory of supergravity, a theory that augments gravity with supersymmetric partner particles. From its beginnings in the 1970’s, the theory ended up having a major impact on string theory, and today they are largely thought of as part of the same picture of how the universe might work.

Over time, though, his work has had more practical consequences. In the 2000s, researchers working with supergravity noticed a calculational shortcut: they could do a complicated supergravity calculation as the “square” of a much simpler calculation in another theory, called Yang-Mills. Over time, they realized the shortcut worked not just for supergravity but for ordinary gravity as well, and not just for particle physics calculations but for gravitational wave calculations. Now, their method may make an important contribution to calculations for future gravitational wave telescopes like the Einstein Telescope, letting them measure properties of neutron stars.
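Very schematically (glossing over all the technical conditions), the “squaring” trick, now usually called the double copy, looks like this:

$$ \mathcal{A}_{\text{Yang-Mills}} \sim \sum_i \frac{c_i\, n_i}{D_i} \quad\longrightarrow\quad \mathcal{M}_{\text{gravity}} \sim \sum_i \frac{n_i\, \tilde{n}_i}{D_i}, $$

where the c’s are the “color” factors of the Yang-Mills theory, the n’s are kinematic factors, and the D’s are propagator denominators. Take a Yang-Mills calculation, swap its color factors for a second copy of the kinematics, and you get a gravity calculation. (This is a sketch of the general pattern, not of any particular calculation mentioned above.)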

With that in mind, imagine the following:

In 1967, Jocelyn Bell Burnell and Antony Hewish detected a pulsar, in one of the first direct pieces of evidence for the existence of neutron stars. Suppose that in the early 1970s NASA decided to announce a future purchase of RIBs: in 2050, they would pay a certain amount to whoever was responsible for finding the equation of state of a neutron star, the formula that describes how its matter moves under pressure. They compute the amount based on estimates of economic growth and inflation, and arrive at some suitably substantial number.

At the same time, but unrelatedly, van Nieuwenhuizen and collaborators sell RIBs. Maybe they use the proceeds to buy more computer time for their calculations, or to fund travel so they can more easily meet and discuss. They tell the buyers that, if some government later decides to reward their discoveries, the holders of the RIBs would get a predetermined cut of the rewards.

The years roll by, and barring some unexpected medical advances, the discoverers of supergravity die. In the meantime, researchers use their discovery to figure out how to make accurate predictions of gravitational waves from merging neutron stars. When the Einstein Telescope turns on, it detects such a merger, and the accurate predictions let them compute the neutron star’s equation of state.

In 2050, then, NASA looks back. They make a list of everyone who contributed to the discovery of the neutron star’s equation of state, every result that was needed for the discovery, and try to estimate how important each contribution was. Then they spend the money they promised buying RIBs, up to the value they assign to each contribution. This includes the RIBs originally sold by van Nieuwenhuizen and collaborators. Their current holders make some money, justifying whatever price they paid to the previous owners.

Imagine a world in which government agencies do this kind of thing all the time. Scientists could sell RIBs in their projects, without knowing exactly which agency would ultimately pay for them. Rather than long grant applications, they could write short summaries for investors, guessing at the range of their potential impact, and it would be up to the investors to decide whether the estimate made sense. Scientists could get some of the value of their discoveries, even when that value is quite unpredictable. And they would be incentivized to pick discoveries that could have high impact, and to put a bit of thought and math into what kind of impact that could be.

(Should I still be calling these things bonds, when the buyers don’t know how much they’ll be worth at the end? Probably not. These are more like research impact shares, on a research impact stock market.)

Are there problems with this model, then? Oh sure, loads!

I already mentioned that it’s hard for government agencies to commit to spending money five years down the line. A seventy-year commitment, from that perspective, sounds completely ridiculous.

But we don’t actually need that in the model. All we need is a good reason for investors to think that, eventually, NASA will buy some research impact shares. If government agencies do this regularly, then investors would have that reason. They could buy shares in a variety of theoretical developments, a diversified pool that makes it more likely some government agency will eventually reward one of them. This version of the model would be riskier, though, so investors would want more return in exchange.

Another problem is the decision-making aspect. Government agencies wouldn’t have to predict the future, but they would have to accurately assess the past, fairly estimating who contributed to a project, and they would have to do it predictably enough that it could give rise to worthwhile investments. This is itself both controversial and a lot of work. If we figure out the neutron star equation of state, I’m not sure I trust NASA to reward van Nieuwenhuizen’s contribution to it.

This leads to the last modification of the model, and the most speculative one. Over time, government agencies will get better and better at assigning credit. Maybe they’ll have better models of how scientific progress works, maybe they’ll even have advanced AI. A future government (or benevolent AI, if you’re into that) might decide to buy research impact shares in order to validate important past work.

If you believe that might happen, then you don’t need a track record of government agencies buying research impact shares. As a scientist, you can find a sufficiently futuristically inclined investor, and tell them this story. You can sell them some shares, and tell them that, when the AI comes, they will have the right to whatever benefit it bestows upon your research.

I could imagine some people doing this. If you have an image of your work saving humanity in the distant future, you should be able to use that image to sell something to investors. It would be insanely speculative, a giant pile of what-ifs with no guarantee of any of it cashing out. But at least it’s better than NFTs.

Not Made of Photons Either

If you know a bit about quantum physics, you might have heard that everything is made out of particles. Mass comes from Higgs particles, gravity from graviton particles, and light and electricity and magnetism from photon particles. The particles are the “quanta”, the smallest possible units of stuff.

This is not really how quantum physics works.

You might have heard (instead, or in addition), that light is both particle and wave. Maybe you’ve heard it said that it is both at the same time, or that it is one or the other, depending on how you look at it.

This is also not really how quantum physics works.

If you think that light is both a particle and a wave, you might get the impression there are only two options. This is better than thinking there is only one option, but still not really the truth. The truth is there are many options. It all depends on what you measure.

Suppose you have a particle collider, like the Large Hadron Collider at CERN. Sometimes, the particles you collide release photons. You surround the collision with particle detectors. When a photon hits them, these particle detectors amplify it, turning it into an electrical signal in a computer.

If you want to predict what those particle detectors see, you might put together a theory of photons. You’ll try to calculate the chance that you see some specific photon with some specific energy to some reasonable approximation…and you’ll get infinity.

You might think you’ve heard this story before. Maybe you’ve heard people talk about calculations in quantum field theory that give infinity, with buzzwords like divergences and renormalization. You may remember them saying that this is a sign that our theories are incomplete, that there are parameters we can’t predict or that the theory is just a low-energy approximation to a deeper theory.

This is not that story. That story is about “ultraviolet divergences”, infinities that come from high-energy particles. This story is about “infrared divergences” from low-energy particles. Infrared divergences don’t mean our theory is incomplete. Our theory is fine. We’re just using it wrong.

The problem is that I lied to you a little bit, earlier. I told you that your particle detectors can detect photons, so you might have imagined they can detect any photon you like. But that is impossible. A photon’s energy is determined by its wavelength: X-rays have more energy than UV light, which has more energy than IR light, which has more energy than microwaves. No matter how you build your particle detector, there will be some energy so low that it cannot be detected, a wavelength of photons that gives no response at all.
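The relation behind that ordering is the standard one:

$$ E = h\nu = \frac{hc}{\lambda}, $$

so longer wavelengths mean lower energies, and a detector with a minimum detectable energy has a corresponding maximum wavelength beyond which photons simply don’t register.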

When you think you’re detecting just one photon, then, you’re not actually detecting just one photon. You’re detecting one photon, plus some huge number of undetectable photons that are too low-energy to see. We call these soft photons. You don’t know how many soft photons you generate, because you can’t detect them. Thus, as always in quantum mechanics, you have to add up every possibility.

That adding up is crucial, because it makes the infinite results go away. The different infinities pair up, negative and positive, at each order of approximation. Those pesky infrared divergences aren’t really a problem, provided you’re honest about what you’re actually detecting.
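Very roughly (suppressing all the kinematic detail, and regulating the divergence with a fictitious tiny photon mass), the structure of the cancellation looks like this:

$$ \sigma_{\text{virtual}} \sim -\,\sigma_0\,\frac{\alpha}{\pi}\,A\,\ln\frac{Q}{\lambda}, \qquad \sigma_{\text{soft}} \sim +\,\sigma_0\,\frac{\alpha}{\pi}\,A\,\ln\frac{E_{\text{det}}}{\lambda}, $$

where Q is the energy of the collision, E_det is the lowest energy your detector can see, λ is the fictitious photon mass, and A is some process-dependent factor. Each piece blows up as λ goes to zero, but their sum depends only on the ratio Q/E_det: a finite answer that knows about what your detector can actually see. (This is a sketch of the general pattern, not a real calculation.)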

But while infrared divergences aren’t really a problem, they do say something about your model. You were modeling particles as single photons, and that made your calculations complicated, with a lot of un-physical infinite results. But you could, instead, have made another model. You could have modeled particles as dressed photons: one photon, plus a cloud of soft photons.

For a particle physicist, these dressed photons have advantages and disadvantages. They aren’t always the best tool, and can be complicated to use. But one thing they definitely do is avoid infinite results. You can interpret them a little more easily.

That ease, though, raises a question. You started out with a model in which each particle you detect was a photon. You could have imagined it as a model of reality, one in which every electromagnetic field was made up of photons.

But then you found another model, one which sometimes makes more sense. And in that model, instead, you model your particles as dressed photons. You could then once again imagine a model of reality, now with every electromagnetic field made up of dressed photons, not the ordinary ones.

So now it looks like you have three options. Are electromagnetic fields made out of waves, or particles…or dressed particles?

That’s a trick question. It was always a trick question, and will always be a trick question.

Ancient Greek philosophers argued about whether everything was made of water, or fire, or innumerable other things. Now, we teach children that science has found the answer: a world made of atoms, or protons, or quarks.

But scientists are actually answering a different, and much more important, question. “What is everything really made of?” is still a question for philosophers. We scientists want to know what we will observe. We want a model that makes predictions, that tells us what actions we can do and what results we should expect, that lets us develop technology and improve our lives.

And if we want to make those predictions, then our models can make different choices. We can arrange things in different ways, grouping the fluid possibilities of reality into different concrete “stuff”. We can choose what to measure, and how best to describe it. We don’t end up with one “what everything is made of”, but more than one, different stories for different contexts. As long as those models make the right predictions, we’ve done the only job we ever needed to do.