Tag Archives: theoretical physics

Small Shifts for Specificity

Cosmologists are annoyed at a recent spate of news articles claiming the universe is 26.7 billion years old (rather than 13.8 billion as based on the current best measurements). To some of the science-reading public, the news sounds like a confirmation of hints they’d already heard: about an ancient “Methuselah” star that seemed to be older than the universe (later estimates put it younger), and recent observations from the James Webb Space Telescope of early galaxies that look older than they ought.

The news doesn’t come from a telescope, though, or a new observation of the sky. Instead, it comes from this press release from the University of Ottawa: “Reinventing cosmology: uOttawa research puts age of universe at 26.7 — not 13.7 — billion years”.

(If you look, you’ll find many websites copying this press release almost word-for-word. This is pretty common in science news, where some websites simply aggregate press releases and others base most of their science news on them rather than paying enough for actual journalism.)

The press release, in turn, is talking about a theory, not an observation. The theorist, Rajendra Gupta, was motivated by examples like the early galaxies observed by JWST and the Methuselah star. Since the 13.8 billion year age of the universe is based on a mathematical model, he tried to find a different mathematical model that led to an older universe. Eventually, by hypothesizing what seems like every unproven physics effect he could think of, he found one that gives a different estimate, 26.7 billion. He probably wasn’t the first person to do this, because coming up with different models to explain odd observations is a standard thing cosmologists do all the time, and until one of the models is shown to explain a wider range of observations (because our best theories explain a lot, so they’re hard to replace), they’re just treated as speculation, not newsworthy science.

This is a pretty clear case of hype, and as such most of the discussion has been about what went wrong. Should we blame the theorist? The university? The journalists? Elon Musk?

Rather than blame, I think it’s more productive to offer advice. And in this situation, the person I think could use some advice is the person who wrote the press release.

So suppose you work for a university, writing their press releases. One day, you hear that one of your professors has done something very cool, something worthy of a press release: they’ve found a new estimate for the age of the universe. What do you do?

One thing you absolutely shouldn’t do is question the science. That just isn’t your job, and even if it were, you don’t have the expertise to do that. Anyone who’s hoping that you will only write articles about good science and not bad science is being unrealistic: that’s just not an option.

If you can’t be more accurate, though, you can still be more precise. You can write your article, and in particular your headline, so that you express what you do know as clearly and specifically as possible.

(I’m assuming here you write your own headlines. This is not normal in journalism, where most headlines are written by an editor, not by the writer of a piece. But university press offices are small enough that I’m assuming, perhaps incorrectly, that you can choose how to title your piece.)

Let’s take a look at the title, “Reinventing cosmology: uOttawa research puts age of universe at 26.7 — not 13.7 — billion years”, and see if we can make some small changes to improve it.

One very general word in that title is “research”. Lots of people do research: astronomers do research when they collect observations, theorists do research when they make new models. If you say “research”, some people will think you’re reporting a new observation, a new measurement that gives a radically different age for the universe.

But you know that’s not true, it’s not what the scientist you’re talking to is telling you. So to avoid the misunderstanding, you can get a bit more specific, and replace the word “research” with a more precise one: “Reinventing cosmology: uOttawa theory puts age of universe at 26.7 — not 13.7 — billion years”.

“Theory” is just as familiar a word as “research”. You won’t lose clicks, you won’t confuse people. But now, you’ve closed off a big potential misunderstanding. By a small shift, you’ve gotten a lot clearer. And you didn’t need to question the science to do it!

You can do more small shifts, if you understand a bit more of the science. “Puts” is kind of ambiguous: a theory could put an age somewhere because it computes it from first principles, or because it dialed some parameter to get there. Here, the theory was intentionally chosen to give an older universe, so the title should hint at this in some way. Instead of “puts”, then, you can use “allows”: “Reinventing cosmology: uOttawa theory allows age of universe to be 26.7 — not 13.7 — billion years”.

These kinds of little tricks can be very helpful. If you’re trying to avoid being misunderstood, then it’s good to be as specific as you can, given what you understand. If you do it carefully, you don’t have to question your scientists’ ideas or downplay their contributions. You can do your job, promote your scientists, and still contribute to responsible journalism.

What RIBs Could Look Like

The journal Nature recently published an opinion piece about a new concept for science funding called Research Impact Bonds (or RIBs).

Normally, when a government funds something, they can’t be sure it will work. They pay in advance, and have to guess whether a program will do what they expect, or whether a project will finish on time. Impact bonds are a way for them to pay afterwards, so they only pay for projects that actually deliver. Instead, the projects are funded by private investors, who buy “impact bonds” that guarantee them a share of government funding if the project is successful. Here’s an example given in the Nature piece:

For instance, say the Swiss government promises to pay up to one million Swiss francs (US$1.1 million) to service providers that achieve a measurable outcome, such as reducing illiteracy in a certain population by 5%, within a specified number of years. A broker finds one or more service providers that think they can achieve this at a cost of, say, 900,000 francs, as well as investors who agree to pay these costs up front — thus taking on the risk of the project — for a potential 10% gain if successful. If the providers achieve their goals, the government pays 990,000 francs: 900,000 francs for the work and a 90,000-franc investment return. If the project does not succeed, the investors lose their money, but the government does not.

The author of the piece, Michael Hill, thinks that this could be a new way for governments to fund science. In his model, scientists would apply to the government to propose new RIBs. The projects would have to have specific goals and time-frames: “measure the power of this cancer treatment to this accuracy in five years”, for example. If the government thinks the goal is valuable, they commit to paying some amount of money if the goal is reached. Then investors can decide whether the investment is worthwhile. The projects they expect to work get investor money, and if they do end up working the investors get government money. The government only has to pay if the projects work, but the scientists get paid regardless.

Ok, what’s the catch?

One criticism I’ve seen is that this kind of model could only work for very predictable research, maybe even just for applied research. While the author admits RIBs would only be suitable for certain sorts of projects, I think the range is wider than you might think. The project just has to have a measurable goal by a specified end date. Many particle physics experiments work that way: a dark matter detector, for instance, is trying to either rule out or detect dark matter to a certain level of statistical power within a certain run time. Even “discovery” machines, that we build to try to discover the unexpected, usually have this kind of goal: a bigger version of the LHC, for instance, might try to measure the coupling of Higgs bosons to a certain accuracy.

There are a few bigger issues with this model, though. If you go through the math in Hill’s example, you’ll notice that if the project works, the government ends up paying 990,000 Swiss francs for a service that only cost the provider 900,000 Swiss francs. Under a normal system, the government would only have had to pay 900,000. This gets compensated by the fact that not every project works, so the government only pays for some projects and not others. But investors will be aware of this, and that means the government can’t offer too many unrealistic RIBs: the greater the risk investors are going to take, the more return they’ll expect. On average then, the government would have to pay about as much as they would normally: the cost of the projects that succeed, plus enough money to cover the risk that some fail. (In fact, they’d probably pay a bit more, to give the investors a return on the investment.)
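To make the arithmetic concrete, here is a minimal sketch using the figures from the quoted example (900,000 francs of costs, 990,000 francs paid on success). The success probability of 0.8 and the function names are assumptions for illustration, not part of Hill’s proposal, and the sketch treats investors as caring only about expected value.

```python
def break_even_success_probability(cost, payout):
    """Success probability at which the investors' expected payout equals their outlay."""
    return cost / payout


def required_payout(cost, success_probability, target_return):
    """Payout on success that gives investors the target return on average."""
    return cost * (1 + target_return) / success_probability


def expected_government_cost(payout, success_probability):
    """Average amount the government pays per project offered."""
    return payout * success_probability


cost = 900_000    # what the provider spends (from the quoted example)
payout = 990_000  # what the government pays on success (from the quoted example)

# Investors only break even if the project succeeds at least this often.
p_break_even = break_even_success_probability(cost, payout)
print(f"Break-even success probability: {p_break_even:.3f}")

# Hypothetical riskier project: succeeds 80% of the time, investors want 10% on average.
risky_payout = required_payout(cost, success_probability=0.8, target_return=0.10)
average_cost = expected_government_cost(risky_payout, success_probability=0.8)
print(f"Payout needed on success: {risky_payout:,.0f} francs")
print(f"Average government cost:  {average_cost:,.0f} francs")
```

Under these assumptions the payout on success rises as the project gets riskier, while the government’s average cost stays at the project cost plus the investors’ return.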

So the government typically won’t save money, at least not if they want to fund the same amount of research. Instead, the idea is that they will avoid risk. But it’s not at all clear to me that the type of risk they avoid is one they want to avoid.

RIBs might appeal to voters: it might sound only fair that a government only funds the research that actually works. That’s not really a problem for the government itself, though: because governments usually pay for many small projects, they still get roughly as much success overall as they want, they just don’t get to pick where. Instead, RIBs expose the government agency to a much bigger risk, the risk of unexpected success. As part of offering RIBs, the government would have to estimate how much money they would be able to pay when the projects end. They would want to fund enough projects so that, on average, they pay that amount of money. (Otherwise, they’d end up funding science much less than they do now!) But if the projects work out better than expected, then they’d have to pay much more than they planned. And government science agencies usually can’t do this. In many countries, they can’t plan far in advance at all: their budgets get decided by legislators year to year, and delays in decisions mean delays in funding. If an agency offered RIBs that were more successful than expected, they’d either have to cut funding somewhere else (probably firing a lot of people), or just default on their RIBs, weakening the concept for the next time they used them. These risks, unlike the risk of individual experiments not working, are risks that can really hurt government agencies.
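A rough sketch of the “unexpected success” risk, under loudly hypothetical assumptions: an agency offers 20 RIBs, each paying 1 million on success, each succeeding independently with probability 0.5. The agency budgets for the average outcome, and the code computes the chance of owing at least 30% more than that.

```python
from math import comb


def prob_at_least(n_projects, p_success, k_min):
    """Probability that at least k_min of n independent projects succeed."""
    return sum(
        comb(n_projects, k) * p_success**k * (1 - p_success) ** (n_projects - k)
        for k in range(k_min, n_projects + 1)
    )


# All numbers below are made up for illustration.
n_projects = 20
p_success = 0.5
payout = 1.0  # millions paid per successful project

budget = n_projects * p_success * payout  # what the agency expects to pay on average
overrun_successes = 13                    # 13 successes means owing 30% more than budgeted
prob_overrun = prob_at_least(n_projects, p_success, overrun_successes)

print(f"Budgeted payout: {budget:.1f} million")
print(f"Chance of owing at least {overrun_successes * payout:.1f} million: {prob_overrun:.1%}")
```

With these made-up numbers the chance of a 30% overrun is roughly 13%. Real projects are unlikely to be independent, so this is only a way of seeing the shape of the problem.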

Impact bonds typically have another advantage, in that they spread out decision-making. The Swiss government in Hill’s example doesn’t have to figure out which service providers can increase literacy, or how much it will cost them: it just puts up a budget, and lets investors and service providers figure out if they can make it work. This also serves as a hedge against corruption. If the government made the decisions, they might distribute funding for unrelated political reasons or even out of straight-up bribery. They’d also have to pay evaluators to figure things out. Investors won’t take bribes to lose money, so in theory would be better at choosing projects that will actually work, and would have a vested interest in paying for a good investigation.

This advantage doesn’t apply to Hill’s model of RIBs, though. In Hill’s model, scientists still need to apply to the government to decide which of their projects get offered as RIBs, so the government still needs to decide which projects are worth investing in. Then the scientists or the government need to take another step, and convince investors. The scientists in this equation effectively have to apply twice, which anyone who has applied for a government grant will realize is quite a lot of extra time and effort.

So overall, I don’t think Hill’s model of RIBs is useful, even for the purpose he imagines. It’s too risky for government science agencies to commit to payments like that, and it generates more, not less, work for scientists and the agency.

Hill’s model, though, isn’t the only way RIBs can work. And “avoiding risk” isn’t the only reason we might want them. There are two other reasons one might want RIBs, with very different-sounding motivations.

First, you might be pessimistic about mainstream science. Maybe you think scientists are making bad decisions, choosing ideas that either won’t pan out or won’t have sufficient impact, based more on fashion than on careful thought. You want to incentivize them to do better, to try to work out what impact they might have with some actual numbers and stand by their judgement. If that’s your perspective, you might be interested in RIBs for the same reason other people are interested in prediction markets: by getting investors involved, you have people willing to pay for an accurate estimate.

Second, you might instead be optimistic about mainstream science. You think scientists are doing great work, work that could have an enormous impact, but they don’t get to “capture that value”. Some projects might be essential to important, well-funded goals, but languish unrewarded. Others won’t see their value until long in the future, or will do so in unexpected ways. If scientists could fund projects based on their future impact, with RIBs, maybe they could fund more of this kind of work.

(I first started thinking about this perspective due to a talk by Sabrina Pasterski. The talk itself offended a lot of people, and had some pretty impractical ideas, like selling NFTs of important physics papers. But I think one part of the perspective, that scientists have more impact than we think, is worth holding on to.)

If you have either of those motivations, Hill’s model won’t help. But another kind of model perhaps could. Unlike Hill’s, it could fund much more speculative research, ideas where we don’t know the impact until decades down the line. To demonstrate, I’ll show how it could fund some very speculative research: the work of Peter van Nieuwenhuizen.

Peter van Nieuwenhuizen is one of the pioneers of the theory of supergravity, a theory that augments gravity with supersymmetric partner particles. From its beginnings in the 1970s, the theory ended up having a major impact on string theory, and today they are largely thought of as part of the same picture of how the universe might work.

His work has, over time, had more practical consequences though. In the 2000s, researchers working with supergravity noticed a calculational shortcut: they could do a complicated supergravity calculation as the “square” of a much simpler calculation in another theory, called Yang-Mills. Over time, they realized the shortcut worked not just for supergravity, but for ordinary gravity as well, and not just for particle physics calculations but for gravitational wave calculations. Now, their method may make an important contribution to calculations for future gravitational wave telescopes like the Einstein Telescope, letting them measure properties of neutron stars.

With that in mind, imagine the following:

In 1967, Jocelyn Bell Burnell and Antony Hewish detected a pulsar, in one of the first direct pieces of evidence for the existence of neutron stars. Suppose that in the early 1970s NASA decided to announce a future purchase of RIBs: in 2050, they would pay a certain amount to whoever was responsible for finding the equation of state of a neutron star, the formula that describes how its matter moves under pressure. They compute based on estimates of economic growth and inflation, and arrive at some suitably substantial number.

At the same time, but unrelatedly, van Nieuwenhuizen and collaborators sell RIBs. Maybe they use the proceeds to buy more computer time for their calculations, or to fund travel so they can more easily meet and discuss. They tell the buyers that, if some government later decides to reward their discoveries, the holders of the RIB would get a predetermined cut of the rewards.

The years roll by, and barring some unexpected medical advances the discoverers of supergravity die. In the meantime, researchers use their discovery to figure out how to make accurate predictions of gravitational waves from merging neutron stars. When the Einstein Telescope turns on, it detects such a merger, and the accurate predictions let them compute the neutron star’s equation of state.

In 2050, then, NASA looks back. They make a list of everyone who contributed to the discovery of the neutron star’s equation of state, every result that was needed for the discovery, and try to estimate how important each contribution was. Then they spend the money they promised buying RIBs, up to the value for each contributor. This includes RIBs originally held by the investors in van Nieuwenhuizen and collaborators. Their current holders make some money, justifying whatever price they paid to the previous owners.

Imagine a world in which government agencies do this kind of thing all the time. Scientists could sell RIBs in their projects, without knowing exactly which agency would ultimately pay for them. Rather than long grant applications, they could write short summaries for investors, guessing at the range of their potential impact, and it would be up to the investors to decide whether the estimate made sense. Scientists could get some of the value of their discoveries, even when that value is quite unpredictable. And they would be incentivized to pick discoveries that could have high impact, and to put a bit of thought and math into what kind of impact that could be.

(Should I still be calling these things bonds, when the buyers don’t know how much they’ll be worth at the end? Probably not. These are more like research impact shares, on a research impact stock market.)

Are there problems with this model, then? Oh sure, loads!

I already mentioned that it’s hard for government agencies to commit to spending money five years down the line. A seventy-year commitment, from that perspective, sounds completely ridiculous.

But we don’t actually need that in the model. All we need is a good reason for investors to think that, eventually, NASA will buy some research impact shares. If government agencies do this regularly, then investors would have that reason. They could buy shares in a variety of theoretical developments, a diversified pool to make it more likely some government agency would reward them. This version of the model would be riskier, though, so they’d want more return in exchange.

Another problem is the decision-making aspect. Government agencies wouldn’t have to predict the future, but they would have to accurately assess the past, fairly estimating who contributed to a project, and they would have to do it predictably enough that it could give rise to worthwhile investments. This is itself both controversial and a lot of work. If we figure out the neutron star equation of state, I’m not sure I trust NASA to reward van Nieuwenhuizen’s contribution to it.

This leads to the last modification of the model, and the most speculative one. Over time, government agencies will get better and better at assigning credit. Maybe they’ll have better models of how scientific progress works, maybe they’ll even have advanced AI. A future government (or benevolent AI, if you’re into that) might decide to buy research impact shares in order to validate important past work.

If you believe that might happen, then you don’t need a track record of government agencies buying research impact shares. As a scientist, you can find a sufficiently futuristically inclined investor, and tell them this story. You can sell them some shares, and tell them that, when the AI comes, they will have the right to whatever benefit it bestows upon your research.

I could imagine some people doing this. If you have an image of your work saving humanity in the distant future, you should be able to use that image to sell something to investors. It would be insanely speculative, a giant pile of what-ifs with no guarantee of any of it cashing out. But at least it’s better than NFTs.

Not Made of Photons Either

If you know a bit about quantum physics, you might have heard that everything is made out of particles. Mass comes from Higgs particles, gravity from graviton particles, and light and electricity and magnetism from photon particles. The particles are the “quanta”, the smallest possible units of stuff.

This is not really how quantum physics works.

You might have heard (instead, or in addition), that light is both particle and wave. Maybe you’ve heard it said that it is both at the same time, or that it is one or the other, depending on how you look at it.

This is also not really how quantum physics works.

If you think that light is both a particle and a wave, you might get the impression there are only two options. This is better than thinking there is only one option, but still not really the truth. The truth is there are many options. It all depends on what you measure.

Suppose you have a particle collider, like the Large Hadron Collider at CERN. Sometimes, the particles you collide release photons. You surround the collision with particle detectors. When a photon hits them, these particle detectors amplify it, turning it into an electrical signal in a computer.

If you want to predict what those particle detectors see, you might put together a theory of photons. You’ll try to calculate the chance that you see some specific photon with some specific energy to some reasonable approximation…and you’ll get infinity.

You might think you’ve heard this story before. Maybe you’ve heard people talk about calculations in quantum field theory that give infinity, with buzzwords like divergences and renormalization. You may remember them saying that this is a sign that our theories are incomplete, that there are parameters we can’t predict or that the theory is just a low-energy approximation to a deeper theory.

This is not that story. That story is about “ultraviolet divergences”, infinities that come from high-energy particles. This story is about “infrared divergences” from low-energy particles. Infrared divergences don’t mean our theory is incomplete. Our theory is fine. We’re just using it wrong.

The problem is that I lied to you a little bit, earlier. I told you that your particle detectors can detect photons, so you might have imagined they can detect any photon you like. But that is impossible. A photon’s energy is determined by its wavelength: X-rays have more energy than UV light, which has more energy than IR light, which has more energy than microwaves. No matter how you build your particle detector, there will be some energy low enough that it cannot detect, a wavelength of photons that gives no response at all.
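As a small illustration of the energy ordering above, the sketch below uses the standard relation E = hc/λ. The four wavelengths are representative values picked for illustration, and the 1 eV detection threshold is purely hypothetical: real collider detectors have very different (and much higher) thresholds.

```python
# Physical constants (exact SI values).
PLANCK_J_S = 6.62607015e-34
LIGHT_SPEED_M_S = 299_792_458
JOULES_PER_EV = 1.602176634e-19


def photon_energy_ev(wavelength_m):
    """Energy of a photon in electronvolts, from E = h * c / wavelength."""
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m / JOULES_PER_EV


# Representative wavelengths in meters, chosen only for illustration.
examples = {
    "X-ray": 1e-9,
    "UV": 1e-7,
    "IR": 1e-5,
    "microwave": 1e-2,
}

# Hypothetical detector threshold: photons below this energy give no response.
THRESHOLD_EV = 1.0

for name, wavelength in examples.items():
    energy = photon_energy_ev(wavelength)
    status = "detected" if energy >= THRESHOLD_EV else "too soft to detect"
    print(f"{name:10s} {energy:.3e} eV  {status}")
```

Wherever the threshold sits, there is always a longer wavelength whose photons fall below it.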

When you think you’re detecting just one photon, then, you’re not actually detecting just one photon. You’re detecting one photon, plus some huge number of undetectable photons that are too low-energy to see. We call these soft photons. You don’t know how many soft photons you generate, because you can’t detect them. Thus, as always in quantum mechanics, you have to add up every possibility.

That adding up is crucial, because it makes the infinite results go away. The different infinities pair up, negative and positive, at each order of approximation. Those pesky infrared divergences aren’t really a problem, provided you’re honest about what you’re actually detecting.

But while infrared divergences aren’t really a problem, they do say something about your model. You were modeling particles as single photons, and that made your calculations complicated, with a lot of un-physical infinite results. But you could, instead, have made another model. You could have modeled particles as dressed photons: one photon, plus a cloud of soft photons.

For a particle physicist, these dressed photons have advantages and disadvantages. They aren’t always the best tool, and can be complicated to use. But one thing they definitely do is avoid infinite results. You can interpret them a little more easily.

That ease, though, raises a question. You started out with a model in which each particle you detect was a photon. You could have imagined it as a model of reality, one in which every electromagnetic field was made up of photons.

But then you found another model, one which sometimes makes more sense. And in that model, instead, you model your particles as dressed photons. You could then once again imagine a model of reality, now with every electromagnetic field made up of dressed photons, not the ordinary ones.

So now it looks like you have three options. Are electromagnetic fields made out of waves, or particles…or dressed particles?

That’s a trick question. It was always a trick question, and will always be a trick question.

Ancient Greek philosophers argued about whether everything was made of water, or fire, or innumerable other things. Now, we teach children that science has found the answer: a world made of atoms, or protons, or quarks.

But scientists are actually answering a different, and much more important, question. “What is everything really made of?” is still a question for philosophers. We scientists want to know what we will observe. We want a model that makes predictions, that tells us what actions we can do and what results we should expect, that lets us develop technology and improve our lives.

And if we want to make those predictions, then our models can make different choices. We can arrange things in different ways, grouping the fluid possibilities of reality into different concrete “stuff”. We can choose what to measure, and how best to describe it. We don’t end up with one “what everything is made of”, but more than one, different stories for different contexts. As long as those models make the right predictions, we’ve done the only job we ever needed to do.

Cabinet of Curiosities: The Deluxe Train Set

I’ve got a new paper out this week with Andrew McLeod. I’m thinking of it as another entry in this year’s “cabinet of curiosities”, interesting Feynman diagrams with unusual properties. Although this one might be hard to fit into a cabinet.

Over the past few years, I’ve been finding Feynman diagrams with interesting connections to Calabi-Yau manifolds, the spaces originally studied by string theorists to roll up their extra dimensions. With Andrew and other collaborators, I found an interesting family of these diagrams called traintracks, which involve higher-and-higher dimensional manifolds as they get longer and longer.

This time, we started hooking up our traintracks together.

We call diagrams like these traintrack network diagrams, or traintrack networks for short. The original traintracks just went “one way”: one family, going higher in Calabi-Yau dimension the longer they got. These networks branch out, one traintrack leading to another and another.

In principle, these are much more complicated diagrams. But we find we can work with them in almost the same way. We can find the same “starting point” we had for the original traintracks, the set of integrals used to find the Calabi-Yau manifold. We’ve even got more reliable tricks, a method recently honed by some friends of ours that consistently finds a Calabi-Yau manifold inside the original traintracks.

Surprisingly, though, this isn’t enough.

It works for one type of traintrack network, a so-called “cross diagram”.

But for other diagrams, if the network branches any more, the trick stops working. We still get an answer, but that answer is some more general space, not just a Calabi-Yau manifold.

That doesn’t mean that these general traintrack networks don’t involve Calabi-Yaus at all, mind you: it just means this method doesn’t tell us one way or the other. It’s also possible that simpler versions of these diagrams, involving fewer particles, will once again involve Calabi-Yaus. This is the case for some similar diagrams in two dimensions. But it’s starting to raise a question: how special are the Calabi-Yau related diagrams? How general do we expect them to be?

Another fun thing we noticed has to do with differential equations. There are equations that relate one diagram to another simpler one. We’ve used them in the past to build up “ladders” of diagrams, relating each picture to one with one of its boxes “deleted”. We noticed, playing with these traintrack networks, that these equations do a bit more than we thought. “Deleting” a box can make a traintrack shorter, but it can also chop a traintrack in half, leaving two “dangling” pieces, one on either side.

This reminded me of an important point, one we occasionally lose track of. The best-studied diagrams related to Calabi-Yaus are called “sunrise” diagrams. If you squish together a loop in one of those diagrams, the whole diagram squishes together, becoming much simpler. Because of that, we’re used to thinking of these as diagrams with a single “geometry”, one that shows up when you don’t “squish” anything.

Traintracks, and traintrack networks, are different. “Squishing” the diagram, or “deleting” a box, gives you a simpler diagram, but not much simpler. In particular, the new diagram will still contain traintracks, and traintrack networks. That means that we really should think of each traintrack network not just as one “top geometry”, but as a collection of geometries, different Calabi-Yaus that break into different combinations of Calabi-Yaus in different ways. It’s something we probably should have anticipated, but the form these networks take is a good reminder, one that points out that we still have a lot to do if we want to understand these diagrams.

Solutions and Solutions

The best misunderstandings are detective stories. You can notice when someone is confused, but digging up why can take some work. If you manage, though, you learn much more than just how to correct the misunderstanding. You learn something about the words you use, and the assumptions you make when using them.

Recently, someone was telling me about a book they’d read on Karl Schwarzschild. Schwarzschild is famous for discovering the equations that describe black holes, based on Einstein’s theory of gravitation. To make the story more dramatic, he did so only shortly before dying from a disease he caught fighting in the First World War. But this person had the impression that Schwarzschild had done even more. According to this person, the book said that Schwarzschild had done something to prove Einstein’s theory, or to complete it.

Another Schwarzschild accomplishment: that mustache

At first, I thought the book this person had read was wrong. But after some investigation, I figured out what happened.

The book said that Schwarzschild had found the first exact solution to Einstein’s equations. That’s true, and as a physicist I know precisely what it means. But I now realize that the average person does not.

In school, the first equations you solve are algebraic, x+y=z. Some equations, like x^2=4, have solutions. Others, like x^2=-4, seem not to, until you learn about new types of numbers that solve them. Either way, you get used to equations being like a kind of puzzle, a question for which you need to find an answer.

If you’re thinking of equations like that, then it probably sounds like Schwarzschild “solved the puzzle”. If Schwarzschild found the first solution to Einstein’s equation, that means that Einstein did not. That makes it sound like Einstein’s work was incomplete, that he had asked the right question but didn’t yet know the right answer.

Einstein’s equations aren’t algebraic equations, though. They’re differential equations. Instead of equations for a variable, they’re equations for a mathematical function, a formula that, in this case, describes the curvature of space and time.

Scientists in many fields use differential equations, but they use them in different ways. If you’re a chemist or a biologist, it might be that you’re most used to differential equations with simple solutions, like sines, cosines, or exponentials. You learn how to solve these equations, and they feel a bit like the algebraic ones: you have a puzzle, and then you solve the puzzle.
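
As a hedged illustration (these are standard textbook examples, chosen here for simplicity, with $k$, $\omega$, $A$ and $B$ standing for constants), two differential equations of this friendlier kind are

$$\frac{dy}{dt} = -k\,y \quad\Rightarrow\quad y(t) = y(0)\,e^{-kt},$$

$$\frac{d^2 y}{dt^2} = -\omega^2\,y \quad\Rightarrow\quad y(t) = A\cos(\omega t) + B\sin(\omega t).$$

In both cases you can check the answer by plugging it back in, just like you would for an algebraic equation.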

Other fields, though, have tougher differential equations. If you’re a physicist or an engineer, you’ve likely met differential equations that you can’t treat in this way. If you’re dealing with fluid mechanics, or general relativity, or even just Newtonian gravity in an odd situation, you can’t usually solve the problem by writing down known functions like sines and cosines.

That doesn’t mean you can’t solve the problem at all, though!

Even if you can’t write down a solution to a differential equation with sines and cosines, a solution can still exist. (In some cases, we can even prove a solution exists!) It just won’t be written in terms of sines and cosines, or other functions you’ve learned in school. Instead, the solution will involve some strange functions, functions no-one has heard of before.

If you want, you can make up names for those functions. But unless you’re going to classify them in a useful way, there’s not much point. Instead, you work with these functions by approximation. You calculate them in a way that doesn’t give you the full answer, but that does let you estimate how close you are. That’s good enough to give you numbers, which in turn is good enough to compare to experiments. With just an approximate solution, like this, Einstein could check if his equations described the orbit of Mercury.
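
As a rough sketch of what approximating with an error estimate can look like, here is a short Python example. It is only an illustration, not the calculation Einstein did: the pendulum equation θ'' = -sin(θ) stands in for a tougher equation, since its solution can’t be written with just sines, cosines, or exponentials. The function names, the starting values, and the choice of the classical Runge-Kutta method are all assumptions made for the example. Solving twice, once with the step size halved, gives a rough sense of how far off the answer is.

```python
import math

def pendulum_rhs(theta, omega):
    """theta'' = -sin(theta), written as a pair of first-order equations."""
    return omega, -math.sin(theta)

def rk4_solve(theta0, omega0, t_end, n_steps):
    """Integrate from t = 0 to t_end using n_steps classical Runge-Kutta steps."""
    h = t_end / n_steps
    theta, omega = theta0, omega0
    for _ in range(n_steps):
        k1t, k1o = pendulum_rhs(theta, omega)
        k2t, k2o = pendulum_rhs(theta + 0.5 * h * k1t, omega + 0.5 * h * k1o)
        k3t, k3o = pendulum_rhs(theta + 0.5 * h * k2t, omega + 0.5 * h * k2o)
        k4t, k4o = pendulum_rhs(theta + h * k3t, omega + h * k3o)
        theta += h * (k1t + 2 * k2t + 2 * k3t + k4t) / 6
        omega += h * (k1o + 2 * k2o + 2 * k3o + k4o) / 6
    return theta, omega

# Same problem, two step sizes: the gap between the answers is a rough error estimate.
coarse, _ = rk4_solve(2.0, 0.0, 10.0, 200)
fine, _ = rk4_solve(2.0, 0.0, 10.0, 400)
error_estimate = abs(fine - coarse)
print(f"theta(10) is about {fine:.6f}, estimated error about {error_estimate:.1e}")
```

If the estimated error is too big for the comparison you want to make, you halve the step again and repeat.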

Once you know you can find these approximate solutions, you have a different perspective on equations. An equation isn’t just a mysterious puzzle. If you can approximate the solution, then you already know how to solve that puzzle. So we wouldn’t think of Einstein’s theory as incomplete because he was only able to find approximate solutions: for a theory as complicated as Einstein’s, that’s perfectly normal. Most of the time, that’s all we need.

But it’s still pretty cool when you don’t have to do this. Sometimes, we can not just approximate, but actually “write down” the solution, either using known functions or well-classified new ones. We call a solution like that an analytic solution, or an exact solution.

That’s what Schwarzschild managed. These kinds of exact solutions often only work in special situations, and Schwarzschild’s is no exception. His Schwarzschild solution works for matter in a special situation, arranged in a perfect sphere. If matter happened to be arranged in that way, then the shape of space and time would be exactly as Schwarzschild described it.
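
For the curious, here is how the solution is usually written in modern textbooks (this is the standard present-day notation, not necessarily the coordinates Schwarzschild himself used). It describes space and time outside a spherical mass $M$, with $G$ Newton’s constant and $c$ the speed of light:

$$ds^2 = -\left(1 - \frac{r_s}{r}\right) c^2\,dt^2 + \left(1 - \frac{r_s}{r}\right)^{-1} dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right), \qquad r_s = \frac{2GM}{c^2}.$$

Everything here is built from ordinary functions of $r$ and $\theta$, which is what makes it an exact solution rather than an approximation.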

That’s actually pretty cool! Einstein’s equations are complicated enough that no-one was sure that there were any solutions like that, even in very special situations. Einstein expected it would be a long time until they could do anything except approximate solutions.

(If Schwarzschild’s solution only describes matter arranged in a perfect sphere, why do we think it describes real black holes? This took later work, by people like Roger Penrose, who figured out that matter compressed far enough will always find a solution like Schwarzschild’s.)

Schwarzschild intended to describe stars with his solution, or at least a kind of imaginary perfect star. What he found was indeed a good approximation to real stars, but also the possibility that a star shoved into a sufficiently small space would become something weird and new, something we would come to describe as a black hole. That’s a pretty impressive accomplishment, especially for someone on the front lines of World War One. And if you know the difference between an exact solution and an approximate one, you have some idea of what kind of accomplishment that is.

Learning for a Living

It’s a question I’ve now heard several times, in different forms. People hear that I’ll be hired as a researcher at an institute of theoretical physics, and they ask, “what, exactly, are they paying you to research?”

The answer, with some caveats: “Whatever I want.”

When a company hires a researcher, they want to accomplish specific things: to improve their products, to make new ones, to cut down on fraud or out-think the competition. Some government labs are the same: if you work for NIST, for example, your work should contribute in some way to achieving more precise measurements and better standards for technology.

Other government labs, and universities, are different. They pursue basic research, research not on any specific application but on the general principles that govern the world. Researchers doing basic research are given a lot of freedom, and that freedom increases as their careers go on.

As a PhD student, a researcher is a kind of apprentice, working for their advisor. Even then, they have some independence: an advisor may suggest projects, but PhD students usually need to decide how to execute them on their own. In some fields, there can be even more freedom: in theoretical physics, it’s not unusual for the more independent students to collaborate with other people than just their advisor.

Postdocs, in turn, have even more freedom. In some fields they get hired to work on a specific project, but they tend to have more freedom as to how to execute it than a PhD student would. Other fields give them more or less free rein: in theoretical physics, a postdoc will have some guidance, but often will be free to work on whatever they find interesting.

Professors, and other long-term researchers, have the most freedom of all. Over the climb from PhD to postdoc to professor, researchers build judgement, demonstrating a track record for tackling worthwhile scientific problems. Universities, and institutes of basic research, trust that judgement. They hire for that judgement. They give their long-term researchers free rein to investigate whatever questions they think are valuable.

In practice, there are some restrictions. Usually, you’re supposed to research in a particular field: at an institute for theoretical physics, I should probably research theoretical physics. (But that can mean many things: one of my future colleagues studies the science of cities.) Further pressure comes from grant funding, money you need to hire other researchers or buy equipment, and that can come with restrictions attached. When you apply for a grant, you have to describe what you plan to do. (In practice, grant agencies are more flexible about this than you might expect, allowing all sorts of changes if you have a good reason…but you still can’t completely reinvent yourself.) Your colleagues themselves also have an impact: it’s much easier to work on something when you can walk down the hall and ask an expert when you get stuck. It’s why we seek out colleagues who care about the same big questions as we do.

Overall, though, research is one of the freest professions there is. If you can get a job learning for a living, and do it well enough, then people will trust your judgement. They’ll set you free to ask your own questions, and seek your own answers.

Enfin, Permanent

My blog began, almost eleven years ago, with the title “Four Gravitons and a Grad Student”. Since then, I finished my PhD. The “Grad Student” dropped from the title, and the mysterious word “postdoc” showed up on a few pages. For three years I worked as a postdoc at the Perimeter Institute in Canada, before hopping the pond and starting another three-year postdoc job in Denmark. With a grant from the EU, three years became four. More funding got me to five (with a fancier title), and now nearing on six. Each step, my contract has been temporary: at first three years at a time, then one-year extensions. Each year I applied, all over the world, looking for a permanent job: for a chance to settle down somewhere, to build my own research group without worrying about having to move the next year.

This year, things have finally worked out. In the Fall I will be moving to France, starting a junior permanent position with L’Institut de Physique Théorique (or IPhT) at CEA Paris-Saclay.

A photo of the entryway to the Institute, taken when I interviewed

It’s been a long journey to get here, with a lot of soul-searching. This year in particular has been a year of reassessment: of digging deep and figuring out what matters to me, what I hope to accomplish and what clues I have to guide the way. Sometimes I feel like I’ve matured more as a physicist in the last year than in the last three put together.

The CEA (originally Commissariat à l’énergie atomique, now Commissariat à l’énergie atomique et aux énergies alternatives, or Alternative Energies and Atomic Energy Commission, and yes that means they’re using the “A” for two things at the same time) is roughly a parallel organization to the USA’s Department of Energy. Both organizations began as a way to manage their nation’s nuclear program, but both branched out, both into other forms of energy and into scientific research. Both run a nationwide network of laboratories, lightly linked but independent from their nations’ universities, both with notable facilities for particle physics. The CEA’s flagship site is in Saclay, on the outskirts of Paris, and it’s their Institute for Theoretical Physics where I’ll be working.

My new position is genuinely permanent: unlike a tenure-track position in the US, I don’t go up for review after a fixed span of time, with the expectation that if I don’t get promoted I lose the job altogether. It’s also not a university, which in particular means I’m not required to teach. I’ll have the option of teaching, working with nearby universities. In the long run, I think I’ll pursue that option. I’ve found teaching helpful the past couple years: it’s helped me think about physics, and think about how to communicate physics. But it’s good not to have to rush into preparing a new course when I arrive, as new professors often do.

It’s also a really great group, with a lot of people who work on things I care about. IPhT has a long track record of research in scattering amplitudes, with many leading figures. They’ve played a key role in topics that frequent readers will have seen show up on this blog: on applying techniques from particle physics to gravitational waves, to the way Calabi-Yau manifolds show up in Feynman diagrams, and even recently to the relationship of machine learning to inference in particle physics.

Working temporary positions year after year, not knowing where I’ll be the next year, has been stressful. Others have had it worse, though. Some of you might have seen a recent post by Bret Devereaux, a military historian with a much more popular blog who has been in a series of adjunct positions. Devereaux describes the job market for the humanities in the US quite well. I’m in theoretical physics in Europe, so while my situation hasn’t been easy, it has been substantially better.

First, there’s the physics component. Physics has “adjunctified” much less than other fields. I don’t think I know a single physicist who has taken an adjunct teaching position, the kind of thing where you’re paid per course and only to teach. I know many who have left physics for other kinds of work, for Wall Street or Silicon Valley or to do data science for a bank or to teach high school. On the other side, I know people in other fields who do work as adjuncts, particularly in mathematics.

Devereaux blames the culture of his field, but I think funding also must have an important role. Physicists, and scientists in many other areas, rarely get professor positions right after their PhDs, but that doesn’t mean they leave the field entirely, because most can find postdoc positions. Those postdocs are focused on research, and are often paid for by government grants: in my field in the US, that usually means the Department of Energy. People can go through two or sometimes even three such positions before finding something permanent, if they don’t leave the field before that. Without something like the Department of Energy or National Institutes of Health providing funding, I don’t know if the humanities could imitate that structure even if they wanted to.

Europe, in turn, has a different situation than the US. Most European countries don’t have a tenure-track: just permanent positions and fixed-term positions. Funding also works quite differently. Department of Energy funding in the US is spread widely and lightly: grants are shared by groups of theorists at a given university, each getting funding for a few postdocs and PhDs across the group. In Europe, a lot of the funding is much more concentrated: big grants from the European Research Council going to individual professors, with various national and private grants supplementing or mirroring that structure. That kind of funding, and the rarity of tenure, in turn leads to a different kind of temporary position: one not hired to teach a course but hired for research as long as the funding lasts. The Danish word for my current title is Adjunkt, but that’s, as one says in France, a faux ami: the official English translation is Assistant Professor, and it’s nothing like a US adjunct. I know people in a variety of forms of that kind of position in a variety of countries, people who landed a five-year grant where they could act like a professor, hire people and so on, but who in the end were expected to move when the grant was over. It’s a stressful situation, but at least it lets us further our research and make progress, unlike a US adjunct in the humanities or math who needs to spend much of their time on teaching.

I do hope Devereaux finds a permanent position: he’s got a great blog. And to return to the theme of the post, I am extremely grateful and happy that I have managed to find a permanent position. I’m looking forward to joining the group at Saclay: to learning more about physics from them, but also, to having a place where I can start to build something, and make a lasting impact on the world around me.

Bottlenecks, Known and Unknown

Scientists want to know everything, and we’ve been trying to get there since the dawn of science. So why aren’t we there yet? Why are there things we still don’t know?

Sometimes, the reason is obvious: we can’t do the experiments yet. Victorian London had neither the technology nor the wealth to build a machine like Fermilab, so they couldn’t discover the top quark. Even if Newton had the idea for General Relativity, the telescopes of the era wouldn’t have let astronomers see its effect on the motion of Mercury. As we grow (in technology, in resources, in knowledge, in raw number of human beings), we can test more things and learn more about the world.

But I’m a theoretical physicist, not an experimental physicist. I still want to understand the world, but what I contribute aren’t new experiments, but new ideas and new calculations. This brings back the question in a new form: why are there calculations we haven’t done yet? Why are there ideas we haven’t had yet?

Sometimes, we can track the reason down to bottlenecks. A bottleneck is a step in a calculation that, for some reason, is harder than the rest. As you try to push a calculation to new heights, the bottleneck is the first thing that slows you down, like the way liquid bubbles through the neck of a literal bottle. If you can clear the bottleneck, you can speed up your calculation and accomplish more.

In the clearest cases, we can see how these bottlenecks could be solved with more technology. As computers get faster and more powerful, calculations become possible that weren’t possible before, in the same way new experiments become possible with new equipment. This is essentially what has happened recently with machine learning, where relatively old ideas are finally feasible to apply on a massive scale.

In physics, a subtlety is that we rarely have access to the most powerful computers available. Some types of physics are done on genuine supercomputers, but for more speculative or lower-priority research we have to use small computer clusters, or even our laptops. Something can be a bottleneck not because it can’t be done on any computer, but because it can’t be done on the computers we can afford.

Most of the time, bottlenecks aren’t quite so obvious. That’s because in theoretical physics, often, we don’t know what we want to calculate. If we want to know why something happens, and not merely that it happens, then we need a calculation that we can interpret, that “makes sense” and that thus, hopefully, we can generalize. We might have some ideas for how that calculation could work: some property a mathematical theory might have that we already know how to understand. Some of those ideas are easy to check, so we check, and make progress. Others are harder, and we have to decide: is the calculation worth it, if we don’t know if it will give us the explanation we need?

Those decisions provide new bottlenecks, often hidden ones. As we get better at calculation, the threshold for an “easy” check gets easier and easier to meet. We put aside fewer possibilities, so we notice more things, which inspire yet more ideas. We make more progress, not because the old calculations were impossible, but because they weren’t easy enough, and now they are. Progress fuels progress, a virtuous cycle that gets us closer and closer to understanding everything we want to understand (which is everything).

What’s a Cosmic String?

Nowadays, we have telescopes that detect not just light, but gravitational waves. We’ve already learned quite a bit about astrophysics from these telescopes. They observe ripples coming from colliding black holes, giving us a better idea of what kinds of black holes exist in the universe. But the coolest thing a gravitational wave telescope could discover is something that hasn’t been seen yet: a cosmic string.

This art is from an article in Symmetry magazine which is, as far as I can tell, not actually about cosmic strings.

You might have heard of cosmic strings, but unless you’re a physicist you probably don’t know much about them. They’re a prediction, coming from cosmology, of giant string-like objects floating out in space.

That might sound like it has something to do with string theory, but it doesn’t actually have to: you can have these things without any string theory at all. Instead, you might have heard that cosmic strings are some kind of “cracks” or “wrinkles” in space-time. Some articles describe this as like what happens when ice freezes, cracks forming as water settles into a crystal.

That description, in terms of ice forming cracks between crystals, is great…if you’re a physicist who already knows how ice forms cracks between crystals. If you’re not, I’m guessing reading those kinds of explanations isn’t helpful. I’m guessing you’re still wondering why there ought to be any giant strings floating in space.

The real explanation has to do with a type of mathematical gadget physicists use, called a scalar field. You can think of a scalar field as described by a number, like a temperature, that can vary in space and time. The field carries potential energy, and that energy depends on what the scalar field’s “number” is. Left alone, the field settles into a situation with as little potential energy as it can, like a ball rolling down a hill. That situation is one of the field’s default values, something we call a “vacuum” value. Changing the field away from its vacuum value can take a lot of energy. The Higgs field is one example of a scalar field. Its vacuum value is the value it has in day-to-day life. In order to make a detectable Higgs boson at the Large Hadron Collider, they needed to change the field away from its vacuum value, and that took a lot of energy.

In the very early universe, almost back at the Big Bang, the world was famously in a hot dense state. That hot dense state meant that there was a lot of energy to go around, so scalar fields could vary far from their vacuum values, pretty much randomly. As the universe expanded and cooled, there was less and less energy available for these fields, and they started to settle down.

Now, the thing about these default, “vacuum” values of a scalar field is that there doesn’t have to be just one of them. Depending on what kind of mathematical function the field’s potential energy is, there could be several different possibilities each with equal energy.

Let’s imagine a simple example, of a field with two vacuum values: +1 and -1. As the universe cooled down, some parts of the universe would end up with that scalar field number equal to +1, and some to -1. But what happens in between?

The scalar field can’t just jump from -1 to +1, that’s not allowed in physics. It has to pass through 0 in between. But, unlike -1 and +1, 0 is not a vacuum value. When the scalar field number is equal to 0, the field has more energy than it does when it’s equal to -1 or +1. Usually, a lot more energy.
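
To make this concrete, here is a minimal Python sketch. The specific formula is an assumption: the argument above only needs some potential energy with its lowest points at -1 and +1, and a “double well” of the form (φ² - 1)² is one simple choice, in made-up units.

```python
def potential(phi):
    """Illustrative double-well potential energy, lowest at phi = -1 and phi = +1."""
    return (phi ** 2 - 1.0) ** 2

# The two vacuum values cost no energy; the value 0 in between does.
for phi in (-1.0, 0.0, 1.0):
    print(f"phi = {phi:+.1f}  ->  V = {potential(phi):.1f}")
```

Any path from -1 to +1 has to climb over the bump at 0, which is why the in-between region costs energy.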

That means the region of scalar field number 0 can’t spread very far: the further it spreads, the more energy it takes to keep it that way. On the other hand, the region can’t vanish altogether: something needs to happen to transition between the numbers -1 and +1.

The thing that happens is called a domain wall. A domain wall is a thin sheet, as thin as it can physically be, where the scalar field doesn’t take its vacuum value. You can roughly think of it as made up of the scalar field, a churning zone of the kind of bosons the LHC was trying to detect.
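
Here is a hedged sketch of what such a wall can look like, in one dimension and in made-up units. It assumes the potential energy is (φ² - 1)², which is one simple choice with vacuum values -1 and +1 and not something fixed by the argument above. For that choice, the field crossing a wall at x = 0 follows a tanh curve, and the energy is concentrated in a thin slice around the wall.

```python
import math

def wall_profile(x):
    """Field value across a wall at x = 0, assuming V(phi) = (phi**2 - 1)**2."""
    return math.tanh(math.sqrt(2.0) * x)

def energy_density(x):
    """Gradient energy plus potential energy of that profile at position x."""
    phi = wall_profile(x)
    gradient = math.sqrt(2.0) * (1.0 - phi ** 2)  # derivative of the tanh profile
    return 0.5 * gradient ** 2 + (phi ** 2 - 1.0) ** 2

for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"x = {x:+.1f}  phi = {wall_profile(x):+.4f}  energy density = {energy_density(x):.4f}")
```

Far from the wall the field sits at -1 or +1 and the energy density drops to essentially zero. Only the thin transition region carries energy.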

This sheet still has a lot of energy, bound up in the unusual value of the scalar field, like an LHC collision in every proton-sized chunk. As such, like any object with a lot of energy, it has a gravitational field. For a domain wall, the effect of this gravity would be very very dramatic: so dramatic, that we’re pretty sure they’re incredibly rare. If they were at all common, we would have seen evidence of them long before now!

Ok, I’ve shown you a wall, that’s weird, sure. What does that have to do with cosmic strings?

The number representing a scalar field doesn’t have to be a real number: it can be imaginary instead, or even complex. Now I’d like you to imagine a field with vacuum values on the unit circle, in the complex plane. That means that +1 and -1 are still vacuum values, but so are $e^{i \pi/2}$, and $e^{3 i \pi/2}$, and everything else you can write as $e^{i\theta}$. However, 0 is still not a vacuum value. Neither is, for example, $2 e^{i\pi/3}$.

With vacuum values like this, you can’t form domain walls. You can make a path between -1 and +1 that only goes through the unit circle, through $e^{i \pi/2}$ for example. The field will be at its vacuum value throughout, taking no extra energy.

However, imagine the different regions form a circle. In the picture above, suppose that the blue area at the bottom is at vacuum value -1 and red is at +1. You might have $e^{i \pi/2}$ in the green region, and $e^{3 i \pi/2}$ in the purple region, covering the whole circle smoothly as you go around.

Now, think about what happens in the middle of the circle. On one side of the circle, you have -1. On the other, +1. (Or, on one side $e^{i \pi/2}$, on the other, $e^{3 i \pi/2}$.) No matter what, different sides of the circle are not allowed to be next to each other, you can’t just jump between them. So in the very middle of the circle, something else has to happen.

Once again, that something else is a field that goes away from its vacuum value, that passes through 0. Once again, that takes a lot of energy, so it occupies as little space as possible. But now, that space isn’t a giant wall. Instead, it’s a squiggly line: a cosmic string.
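
One way to check whether a loop is “wrapped” like this is to count how many times the field’s phase goes around the unit circle as you walk around the loop. The Python sketch below is a hedged illustration: the function name `winding_number` and the sampled field values are assumptions for the example. If the count is not zero, the field can’t stay on the unit circle everywhere inside the loop, so a string has to thread through it.

```python
import cmath
import math

def winding_number(field_values):
    """Count how many times the phase wraps around while walking a closed loop.

    field_values: nonzero complex field values, sampled in order around the loop.
    """
    total = 0.0
    n = len(field_values)
    for i in range(n):
        a = field_values[i]
        b = field_values[(i + 1) % n]
        total += cmath.phase(b / a)  # small phase step between neighbours
    return round(total / (2.0 * math.pi))

n_samples = 16
# The phase covers the whole unit circle once as we go around the loop.
wrapped = [cmath.exp(2j * math.pi * k / n_samples) for k in range(n_samples)]
# The field sits at the same vacuum value, +1, everywhere on the loop.
uniform = [1 + 0j] * n_samples

print(winding_number(wrapped), winding_number(uniform))
```

The wrapped loop gives 1 and the uniform loop gives 0: only the first one forces a string in the middle.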

Cosmic strings don’t have as dramatic a gravitational effect as domain walls. That means they might not be super-rare. There might be some we haven’t seen yet. And if we do see them, it could be because they wiggle space and time, making gravitational waves.

Cosmic strings don’t require string theory: they come from a much more basic gadget, scalar fields. We know there is one quite important scalar field, the Higgs field. The Higgs vacuum values aren’t like +1 and -1, or like the unit circle, though, so the Higgs by itself won’t make domain walls or cosmic strings. But there are a lot of proposals for scalar fields, things we haven’t discovered but that physicists think might answer lingering questions in particle physics, and some of those could have the right kind of vacuum values to give us cosmic strings. Thus, if we manage to detect cosmic strings, we could learn something about one of those lingering questions.

Building the Railroad to Rigor

As a kid who watched far too much educational television, I dimly remember learning about the USA’s first transcontinental railroad. Somehow, parts of the story stuck with me. Two companies built the railroad from different directions, one from California and the other from the middle of the country, aiming for a mountain in between. Despite the US Civil War happening around this time, the two companies built through, in the end racing to where the final tracks were laid with a golden spike.

I’m a theoretical physicist, so of course I don’t build railroads. Instead, I build new mathematical methods, ways to check our theories of particle physics faster and more efficiently. Still, something of that picture resonates with me.

You might think someone who develops new mathematical methods would be a mathematician, not a physicist. But while there are mathematicians who work on the problems I work on, their goals are a bit different. They care about rigor, about stating only things they can carefully prove. As such, they often need to work with simplified examples, “toy models” well-suited to the kinds of theorems they can build.

Physicists can be a bit messier. We don’t always insist on the same rigor the mathematicians do. This makes our results less reliable, but it makes our “toy models” a fair amount less “toy”. Our goal is to try to tackle questions closer to the actual real world.

What happens when physicists and mathematicians work on the same problem?

If the physicists worked alone, they might build and build, and end up with an answer that isn’t actually true. The mathematicians, keeping rigor in mind, would be safe in the truth of what they built, but might not end up anywhere near the physicists’ real-world goals.

Together, though, physicists and mathematicians can build towards each other. The physicists can keep their eyes on the mathematicians, correcting when they notice something might go wrong and building more and more rigor into their approach. The mathematicians can keep their eyes on the physicists, building more and more complex applications of their rigorous approaches to get closer and closer to the real world. Eventually, like the transcontinental railroad, the two groups meet: the mathematicians prove a rigorous version of the physicists’ approach, or the physicists adopt the mathematicians’ ideas and apply them to their own theories.

In practice, it isn’t just two teams, physicists and mathematicians, building towards each other. Different physicists themselves work with different levels of rigor, aiming to understand different problems in different theories, and the mathematicians do the same. Each of us is building our own track, watching the other tracks build towards us on the horizon. Eventually, we’ll meet, and science will chug along over what we’ve built.

4 gravitons

Stories about physics from someone who's been there
