
Congratulations to Pierre Agostini, Ferenc Krausz and Anne L’Huillier!

The 2023 Physics Nobel Prize was announced this week, awarded to Pierre Agostini, Ferenc Krausz and Anne L’Huillier for figuring out how to generate extremely fast (hundreds of attoseconds) pulses of light.

Some physicists try to figure out the laws of physics themselves, or the behavior of big photogenic physical systems like stars and galaxies. Those people tend to get a lot of press, but most physicists don’t do that kind of work. Instead, most physicists try to accomplish new things with old physical laws: taking light, electrons, and atoms and doing things nobody thought possible. While that may sound like engineering, the work these physicists do lies beyond the bounds of what engineers are comfortable with: there’s too much uncertainty, too little precedent, and the applications are still far away. The work is done with the goal of pushing our capabilities as far as we can, accomplishing new things and worrying later about what they’re good for.

(Somehow, they still tend to be good for something, often valuable things. Knowing things pays off!)

Anne L’Huillier began the story in 1987, shining infrared lasers through noble gases and seeing the gas emit unexpected new frequencies. As physicists built on that discovery, it went from an academic observation to a more and more useful tool, until in 2001 Pierre Agostini and Ferenc Krausz, with different techniques both based on the same knowledge, managed to produce pulses of light only a few hundred attoseconds long.

(“Atto” is one of the SI prefixes. They go milli, micro, nano, pico, femto, atto. Notice that “nano” sits right in the middle if you count the second itself at the start: an attosecond is as much smaller than a nanosecond as a nanosecond is than an ordinary second.)
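If you prefer to see the numbers, here’s a minimal sketch of that comparison (the values are just the standard SI definitions):

```python
# How much smaller is an attosecond than a nanosecond, compared to how much
# smaller a nanosecond is than a second? (Standard SI values.)
second = 1.0
nanosecond = 1e-9    # 10^-9 seconds
attosecond = 1e-18   # 10^-18 seconds

print(second / nanosecond)      # about 1e9: a billion nanoseconds in a second
print(nanosecond / attosecond)  # about 1e9: a billion attoseconds in a nanosecond
```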

This is cool just from the point of view of “humans doing difficult things”, but it’s also useful. Electrons move on attosecond time-scales. If you can send pulses of light that are only attoseconds long, you’ve got a camera fast enough to capture how electrons move in real time. You can figure out how they traverse electronics, or how they slosh back and forth in biological molecules.

This year’s prize has an extra point of interest for me, as both Anne L’Huillier and Pierre Agostini did their prize-winning work at CEA Paris-Saclay, where I just started work last month. Their groups would eventually evolve into something called Attolab; I walk by their building every day on the way to lunch.

Stories Backwards and Forwards

You can always start with “once upon a time”…

I come up with tricks to make calculations in particle physics easier. That’s my one-sentence story, or my most common one. If I want to tell a longer story, I have more options.

Here’s one longer story:

I want to figure out what Nature is telling us. I want to take all the data we have access to that has anything to say about fundamental physics, every collider and gravitational wave telescope and ripple in the overall structure of the universe, and squeeze it as hard as I can until something comes out. I want to make sure we understand the implications of our current best theories as well as we can, to as high precision as we can, because I want to know whether they match what we see.

To do that, I am starting with a type of calculation I know how to do best. That’s both because I can make progress with it, and because it will be important for making these inferences, for testing our theories. I am following a hint in a theory that definitely does not describe the real world, one that is both simpler to work with and surprisingly complex, one that has a good track record, both for me and others, for advancing these calculations. And at the end of the day, I’ll make our ability to infer things from Nature that much better.

Here’s another:

Physicists, unknowing, proposed a kind of toy model, one often simpler to work with but not necessarily simpler to describe. Using this model, they pursued increasingly elaborate calculations, and time and time again, those calculations surprised them. The results were not random, not a disorderly mess of everything they could plausibly have gotten. Instead, they had structure, symmetries and patterns and mathematical properties that the physicists can’t seem to explain. If we can explain them, we will advance our knowledge of models and theories and ideas, geometry and combinatorics, learning more about the unexpected consequences of the rules we invent.

We can also help the physicists advance physics, of course. That’s a happy accident, but one that justifies the money and time, showing the rest of the world that understanding consequences of rules is still important and valuable.

These seem like very different stories, but they’re not so different. They change in order, physics then math or math then physics, backwards and forwards. By doing that, they change in emphasis, in where they’re putting glory and how they’re catching your attention. But at the end of the day, I’m investigating mathematical mysteries, and I’m advancing our ability to do precision physics.

(Maybe you think that my motivation must lie with one of these stories and not the other. One is “what I’m really doing”, the other is a lie made up for grant agencies.
Increasingly, I don’t think people work like that. If we are at heart stories, we’re retroactive stories. Our motivation day to day doesn’t follow one neat story or another. We move forward, we maybe have deep values underneath, but our accounts of “why” can and will change depending on context. We’re human, and thus as messy as that word implies.)

I can tell more than two stories if I want to. I won’t here. But this is largely what I’m working on at the moment. In applying for grants, I need to get the details right, to sprinkle the right references and the right scientific arguments, but the broad story is equally important. I keep shuffling that story, a pile of not-quite-literal index cards, finding different orders and seeing how they sound, imagining my audience and thinking about what stories would work for them.

Why You Might Want to Inspire Kids to Be Physicists (And What Movies You’d Make as a Result)

Since the new Oppenheimer biopic came out, people have been making fun of a tweet by Sam Altman, who hoped the movie would inspire kids to become physicists the way The Social Network inspired startup founders.

Expecting a movie about someone building an immensely destructive weapon, watching it plunge the world into paranoia, then getting mercilessly hounded about it to be an inspiration seems…a bit unrealistic? But everyone has already made that point. What I found more interesting was a blog post a couple days ago by science blogger Chad Orzel. Orzel asks, suppose you did want to make a movie inspiring kids to go into physics: how would you do it? I commented on his post with my own take on the question, then realized it might be nice as a post here.

If you want to inspire kids to go into physics with a movie, what do you do? Well, you can start by asking, why do you want kids to go into physics? Why do you want more physicists?

Maybe you believe that more physicists are needed to understand the fundamental laws of the universe. The quest of fundamental physics may be worthwhile in its own right, or may be important because understanding the universe gives us more tools to manipulate it. You might even think of Oppenheimer’s story in that way: because physicists understood the nature of the atom, they could apply that knowledge to change the world, racing to use it to defeat the Nazis, and later convinced to continue in order to avoid a brutal invasion of Japan. (Whether the bomb was actually necessary for this is still, of course, quite controversial.)

If that’s why you want more kids to be physicists, then you want a story like that. You could riff off of Ashoke Sen’s idea that physics may be essential to save humanity. The laws of physics appear to be unstable, such that at some point the world will shift and a “bubble”, expanding at the speed of light, will rewrite the rules in a way that would destroy all life as we know it. The only way to escape would be to travel faster than light, something that is possible because the universe itself expands at those speeds. By scattering “generation ships” in different directions, we could ensure that some of humanity would survive any such “bubble”: but only if we got the physics right.

A movie based on that idea could look a bit like the movie Cloud Atlas, with connected characters spanning multiple time periods. Scientists in the modern day investigate the expanding universe, making plans that refugees in a future generation ship must carry out. If you want to inspire kids with the idea that physics could save the world, you could get a lot of mileage out of a story that could actually be true.

On the other hand, maybe you don’t care so much about fundamental physics. Maybe you want more physicists because they’re good at solving a variety of problems. They help to invent new materials, to measure things precisely, to predict the weather, to change how we compute, and even to contribute to medicine. Maybe you want to tell a story about that.

(Maybe you even want these kids to go farther afield, and study physics without actually becoming physicists. Sam Altman is not a physicist, and I’ve heard he’s not very interested in directing his philanthropic money to increasing the number of jobs for physicists. On the other hand, the AI industry where he is a central player does hire a lot of ex-physicists.)

The problem, as Orzel points out, is that those stories aren’t really stories about physicists. They’re stories about engineering and technology, and a variety of other scientists, because a wide variety of people contribute to these problems. In order to tell a story that inspires people to be physicists, you need a story that highlights something unique that they bring to the table.

Orzel gets close to what I think of as the solution, by bringing up The Social Network. Altman was also mocked for saying that The Social Network motivated kids to found startups: the startup founders in that movie are not exactly depicted as good people. But in reality, it appears that the movie did motivate people to found startups. Stories about badass amoral jerks are engaging, and it’s easy to fantasize about having that kind of power and ability. There’s a reason that The Imitation Game depicted Alan Turing, a man known for his gentle kindness, as brusque and arrogant.

If you want to tell a story about physicists, it’s actually pretty easy, because physicists can be quite arrogant! There is a stereotype of physicists walking into another field, deciding they know everything they need to know, and lecturing the experts about how they should be doing their jobs. This really does happen, and sometimes it’s exactly as dumb as it sounds…but sometimes the physicists are right! Orzel brings up Feynman’s role in figuring out how the Challenger space shuttle blew up, an example of precisely this kind of success.

So if you want kids to grow up to be generalist physicists, people who solve all sorts of problems for all sorts of people, you need to tell them a story like that. One with a Sherlock-esque physicist who runs around showing how much smarter they are than everyone else. You need to make a plot where the physicist waves around “physicist tools”, like dimensional analysis, Fermi estimates, and thermodynamics, and uses them to uncover a mystery, showing a bunch of engineers or biologists just how much cooler they are.

If you do that, you probably could inspire some kids to become physicists. You’ll need a new movie to inspire them to be engineers or biologists, though!

Small Shifts for Specificity

Cosmologists are annoyed at a recent spate of news articles claiming the universe is 26.7 billion years old (rather than 13.8 billion as based on the current best measurements). To some of the science-reading public, the news sounds like a confirmation of hints they’d already heard: about an ancient “Methuselah” star that seemed to be older than the universe (later estimates put it younger), and recent observations from the James Webb Space Telescope of early galaxies that look older than they ought.

The news doesn’t come from a telescope, though, or a new observation of the sky. Instead, it comes from this press release from the University of Ottawa: “Reinventing cosmology: uOttawa research puts age of universe at 26.7 — not 13.7 — billion years”.

(If you look, you’ll find many websites copying this press release almost word-for-word. This is pretty common in science news, where some websites simply aggregate press releases and others base most of their science news on them rather than paying enough for actual journalism.)

The press release, in turn, is talking about a theory, not an observation. The theorist, Rajendra Gupta, was motivated by examples like the early galaxies observed by JWST and the Methuselah star. Since the 13.8-billion-year age of the universe is based on a mathematical model, he tried to find a different mathematical model that led to an older universe. Eventually, by hypothesizing what seems like every unproven physics effect he could think of, he found one that gives a different estimate, 26.7 billion. He probably wasn’t the first person to do this: coming up with different models to explain odd observations is a standard thing cosmologists do all the time. Until one of those models is shown to explain a wider range of observations (our best theories explain a lot, so they’re hard to replace), it’s treated as speculation, not newsworthy science.

This is a pretty clear case of hype, and as such most of the discussion has been about what went wrong. Should we blame the theorist? The university? The journalists? Elon Musk?

Rather than blame, I think it’s more productive to offer advice. And in this situation, the person I think could use some advice is the person who wrote the press release.

So suppose you work for a university, writing their press releases. One day, you hear that one of your professors has done something very cool, something worthy of a press release: they’ve found a new estimate for the age of the universe. What do you do?

One thing you absolutely shouldn’t do is question the science. That just isn’t your job, and even if it were, you don’t have the expertise to do it. Anyone who’s hoping that you will only write articles about good science and not bad science is being unrealistic: that’s just not an option.

If you can’t be more accurate, though, you can still be more precise. You can write your article, and in particular your headline, so that you express what you do know as clearly and specifically as possible.

(I’m assuming here you write your own headlines. This is not normal in journalism, where most headlines are written by an editor, not by the writer of a piece. But university press offices are small enough that I’m assuming, perhaps incorrectly, that you can choose how to title your piece.)

Let’s take a look at the title, “Reinventing cosmology: uOttawa research puts age of universe at 26.7 — not 13.7 — billion years”, and see if we can make some small changes to improve it.

One very general word in that title is “research”. Lots of people do research: astronomers do research when they collect observations, theorists do research when they make new models. If you say “research”, some people will think you’re reporting a new observation, a new measurement that gives a radically different age for the universe.

But you know that’s not true: it’s not what the scientist you’re talking to is telling you. So to avoid the misunderstanding, you can get a bit more specific, and replace the word “research” with a more precise one: “Reinventing cosmology: uOttawa theory puts age of universe at 26.7 — not 13.7 — billion years”.

“Theory” is just as familiar a word as “research”. You won’t lose clicks, you won’t confuse people. But now, you’ve closed off a big potential misunderstanding. By a small shift, you’ve gotten a lot clearer. And you didn’t need to question the science to do it!

You can do more small shifts, if you understand a bit more of the science. “Puts” is kind of ambiguous: a theory could put an age somewhere because it computes it from first principles, or because it dialed some parameter to get there. Here, the theory was intentionally chosen to give an older universe, so the title should hint at this in some way. Instead of “puts”, then, you can use “allows”: “Reinventing cosmology: uOttawa theory allows age of universe to be 26.7 — not 13.7 — billion years”.

These kinds of little tricks can be very helpful. If you’re trying to avoid being misunderstood, then it’s good to be as specific as you can, given what you understand. If you do it carefully, you don’t have to question your scientists’ ideas or downplay their contributions. You can do your job, promote your scientists, and still contribute to responsible journalism.

What RIBs Could Look Like

The journal Nature recently published an opinion piece about a new concept for science funding called Research Impact Bonds (or RIBs).

Normally, when a government funds something, they can’t be sure it will work. They pay in advance, and have to guess whether a program will do what they expect, or whether a project will finish on time. Impact bonds are a way for them to pay afterwards, so they only pay for projects that actually deliver. Instead, the projects are funded by private investors, who buy “impact bonds” that guarantee them a share of government funding if the project is successful. Here’s an example given in the Nature piece:

For instance, say the Swiss government promises to pay up to one million Swiss francs (US$1.1 million) to service providers that achieve a measurable outcome, such as reducing illiteracy in a certain population by 5%, within a specified number of years. A broker finds one or more service providers that think they can achieve this at a cost of, say, 900,000 francs, as well as investors who agree to pay these costs up front — thus taking on the risk of the project — for a potential 10% gain if successful. If the providers achieve their goals, the government pays 990,000 francs: 900,000 francs for the work and a 90,000-franc investment return. If the project does not succeed, the investors lose their money, but the government does not.

The author of the piece, Michael Hill, thinks that this could be a new way for governments to fund science. In his model, scientists would apply to the government to propose new RIBs. The projects would have to have specific goals and time-frames: “measure the power of this cancer treatment to this accuracy in five years”, for example. If the government thinks the goal is valuable, they commit to paying some amount of money if the goal is reached. Then investors can decide whether the investment is worthwhile. The projects they expect to work get investor money, and if they do end up working the investors get government money. The government only has to pay if the projects work, but the scientists get paid regardless.

Ok, what’s the catch?

One criticism I’ve seen is that this kind of model could only work for very predictable research, maybe even just for applied research. While the author admits RIBs would only be suitable for certain sorts of projects, I think the range is wider than you might think. The project just has to have a measurable goal by a specified end date. Many particle physics experiments work that way: a dark matter detector, for instance, is trying to either rule out or detect dark matter to a certain level of statistical power within a certain run time. Even “discovery” machines, the ones we build to try to discover the unexpected, usually have this kind of goal: a bigger version of the LHC, for instance, might try to measure the couplings of the Higgs boson to a certain accuracy.

There are a few bigger issues with this model, though. If you go through the math in Hill’s example, you’ll notice that if the project works, the government ends up paying 990,000 Swiss francs for a service that only cost the provider 900,000. Under a normal system, the government would only have had to pay 900,000. This is compensated by the fact that not every project works, so the government only pays for some projects and not others. But investors are aware of that too, which means the government can’t offer stingy terms: the greater the risk investors take on, the more return they’ll expect. On average, then, the government would have to pay about as much as they would normally: the cost of the projects that succeed, plus enough money to cover the risk that some fail. (In fact, they’d probably pay a bit more, to give the investors a return on the investment.)
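Here’s a rough sketch of that averaging argument in code, using the numbers from Hill’s example; the 90% success probability and the 2% investor profit margin below are my own illustrative assumptions, not figures from the piece.

```python
# Rough expected-cost comparison: RIBs vs. funding the project directly.
# Numbers from Hill's example; the success probability and profit margin
# are illustrative assumptions.
project_cost = 900_000   # francs, paid up front by the investors
p_success = 0.9          # assumed chance the project delivers its goal

# For investors to break even on average, the payout on success must satisfy
#   p_success * payout = project_cost,
# and in practice they'll want a little profit on top (here, 2%).
break_even_payout = project_cost / p_success   # 1,000,000 francs
payout = break_even_payout * 1.02              # 1,020,000 francs

expected_rib_cost = p_success * payout         # about 918,000 francs
print(f"Direct funding, per project:    {project_cost:,.0f} francs")
print(f"Expected RIB cost, per project: {expected_rib_cost:,.0f} francs")
# The government pays roughly what it would have paid directly, plus a bit
# extra for the investors' return -- it shifts risk, but doesn't save money.
```

The exact numbers change with the assumed success rate, but the conclusion doesn’t: whatever return the investors need to break even, the government’s expected payment ends up at least the cost of the work.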

So the government typically won’t save money, at least not if they want to fund the same amount of research. Instead, the idea is that they will avoid risk. But it’s not at all clear to me that the type of risk they avoid is one they want to avoid.

RIBs might appeal to voters: it might sound only fair that a government only funds the research that actually works. That’s not really a problem for the government itself, though: because governments usually pay for many small projects, they still get roughly as much success overall as they want; they just don’t get to pick where. Instead, RIBs expose the government agency to a much bigger risk, the risk of unexpected success. As part of offering RIBs, the government would have to estimate how much money they would be able to pay when the projects end. They would want to fund enough projects so that, on average, they pay that amount of money. (Otherwise, they’d end up funding science much less than they do now!) But if the projects work out better than expected, then they’d have to pay much more than they planned. And government science agencies usually can’t do this. In many countries, they can’t plan far in advance at all: their budgets get decided by legislators year to year, and delays in decisions mean delays in funding. If an agency offered RIBs that were more successful than expected, they’d either have to cut funding somewhere else (probably firing a lot of people), or just default on their RIBs, weakening the concept for the next time they used them. These risks, unlike the risk of individual experiments not working, are risks that can really hurt government agencies.

Impact bonds typically have another advantage, in that they spread out decision-making. The Swiss government in Hill’s example doesn’t have to figure out which service providers can increase literacy, or how much it will cost them: it just puts up a budget, and lets investors and service providers figure out if they can make it work. This also serves as a hedge against corruption. If the government made the decisions, they might distribute funding for unrelated political reasons or even out of straight-up bribery. They’d also have to pay evaluators to figure things out. Investors won’t take bribes to lose money, so in theory would be better at choosing projects that will actually work, and would have a vested interest in paying for a good investigation.

This advantage doesn’t apply to Hill’s model of RIBs, though. In Hill’s model, scientists still need to apply to the government to decide which of their projects get offered as RIBs, so the government still needs to decide which projects are worth investing in. Then the scientists or the government need to take another step, and convince investors. The scientists in this equation effectively have to apply twice, which anyone who has applied for a government grant will realize is quite a lot of extra time and effort.

So overall, I don’t think Hill’s model of RIBs is useful, even for the purpose he imagines. It’s too risky for government science agencies to commit to payments like that, and it generates more, not less, work for scientists and the agency.

Hill’s model, though, isn’t the only way RIBs can work. And “avoiding risk” isn’t the only reason we might want them. There are two other reasons one might want RIBs, with very different-sounding motivations.

First, you might be pessimistic about mainstream science. Maybe you think scientists are making bad decisions, choosing ideas that either won’t pan out or won’t have sufficient impact, based more on fashion than on careful thought. You want to incentivize them to do better, to try to work out what impact they might have with some actual numbers and stand by their judgement. If that’s your perspective, you might be interested in RIBs for the same reason other people are interested in prediction markets: by getting investors involved, you have people willing to pay for an accurate estimate.

Second, you might instead be optimistic about mainstream science. You think scientists are doing great work, work that could have an enormous impact, but they don’t get to “capture that value”. Some projects might be essential to important, well-funded goals, but languish unrewarded. Others won’t see their value until long in the future, or will do so in unexpected ways. If scientists could fund projects based on their future impact, with RIBs, maybe they could fund more of this kind of work.

(I first started thinking about this perspective due to a talk by Sabrina Pasterski. The talk itself offended a lot of people, and had some pretty impractical ideas, like selling NFTs of important physics papers. But I think one part of the perspective, that scientists have more impact than we think, is worth holding on to.)

If you have either of those motivations, Hill’s model won’t help. But another kind of model perhaps could. Unlike Hill’s, it could fund much more speculative research, ideas where we don’t know the impact until decades down the line. To demonstrate, I’ll show how it could fund some very speculative research: the work of Peter van Nieuwenhuizen.

Peter van Nieuwenhuizen is one of the pioneers of the theory of supergravity, a theory that augments gravity with supersymmetric partner particles. From its beginnings in the 1970’s, the theory ended up having a major impact on string theory, and today they are largely thought of as part of the same picture of how the universe might work.

His work has, over time, had more practical consequences, though. In the 2000’s, researchers working with supergravity noticed a calculational shortcut: they could do a complicated supergravity calculation as the “square” of a much simpler calculation in another theory, called Yang-Mills. Over time, they realized the shortcut worked not just for supergravity, but for ordinary gravity as well, and not just for particle physics calculations but for gravitational wave calculations. Now, their method may make an important contribution to calculations for future gravitational wave telescopes like the Einstein Telescope, letting them measure properties of neutron stars.
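Very schematically, that “squaring” relation (known in the field as the double copy; the shorthand below is mine, not notation from the original work) looks like:

```latex
% Schematic only: a gravity amplitude built as the "square" of Yang-Mills ones.
\mathcal{M}_{\text{gravity}} \;\sim\; \mathcal{A}_{\text{Yang-Mills}} \times \mathcal{A}_{\text{Yang-Mills}}
```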

With that in mind, imagine the following:

In 1967, Jocelyn Bell Burnell and Antony Hewish detected a pulsar, in one of the first direct pieces of evidence for the existence of neutron stars. Suppose that in the early 1970’s NASA decided to announce a future purchase of RIBs: in 2050, they would pay a certain amount to whoever was responsible for finding the equation of state of a neutron star, the formula that describes how its matter moves under pressure. They compute based on estimates of economic growth and inflation, and arrive at some suitably substantial number.

At the same time, but unrelatedly, van Nieuwenhuizen and collaborators sell RIBs. Maybe they use the proceeds to buy more computer time for their calculations, or to fund travel so they can more easily meet and discuss. They tell the buyers that, if some government later decides to reward their discoveries, the holders of the RIB would get a predetermined cut of the rewards.

The years roll by, and barring some unexpected medical advances, the discoverers of supergravity die. In the meantime, researchers use their discovery to figure out how to make accurate predictions of gravitational waves from merging neutron stars. When the Einstein Telescope turns on, it detects such a merger, and the accurate predictions let them compute the neutron star’s equation of state.

In 2050, then, NASA looks back. They make a list of everyone who contributed to the discovery of the neutron star’s equation of state, every result that was needed for the discovery, and try to estimate how important each contribution was. Then they spend the money they promised buying RIBs, up to the value assessed for each contributor. This includes RIBs originally held by the investors in van Nieuwenhuizen and collaborators. Their current holders make some money, justifying whatever price they paid to the previous owners.

Imagine a world in which government agencies do this kind of thing all the time. Scientists could sell RIBs in their projects, without knowing exactly which agency would ultimately pay for them. Rather than long grant applications, they could write short summaries for investors, guessing at the range of their potential impact, and it would be up to the investors to decide whether the estimate made sense. Scientists could get some of the value of their discoveries, even when that value is quite unpredictable. And they would be incentivized to pick discoveries that could have high impact, and to put a bit of thought and math into what kind of impact that could be.

(Should I still be calling these things bonds, when the buyers don’t know how much they’ll be worth at the end? Probably not. These are more like research impact shares, on a research impact stock market.)

Are there problems with this model, then? Oh sure, loads!

I already mentioned that it’s hard for government agencies to commit to spending money five years down the line. A seventy-year commitment, from that perspective, sounds completely ridiculous.

But we don’t actually need that in the model. All we need is a good reason for investors to think that, eventually, NASA will buy some research impact shares. If government agencies do this regularly, then investors would have that reason. They could buy shares in a variety of theoretical developments, a diversified pool to make it more likely some government agency would reward them. This version of the model would be riskier, though, so they’d want more return in exchange.

Another problem is the decision-making aspect. Government agencies wouldn’t have to predict the future, but they would have to accurately assess the past, fairly estimating who contributed to a project, and they would have to do it predictably enough that it could give rise to worthwhile investments. This is itself both controversial and a lot of work. If we figure out the neutron star equation of state, I’m not sure I trust NASA to reward van Nieuwenhuizen’s contribution to it.

This leads to the last modification of the model, and the most speculative one. Over time, government agencies will get better and better at assigning credit. Maybe they’ll have better models of how scientific progress works, maybe they’ll even have advanced AI. A future government (or benevolent AI, if you’re into that) might decide to buy research impact shares in order to validate important past work.

If you believe that might happen, then you don’t need a track record of government agencies buying research impact shares. As a scientist, you can find a sufficiently futuristically inclined investor, and tell them this story. You can sell them some shares, and tell them that, when the AI comes, they will have the right to whatever benefit it bestows upon your research.

I could imagine some people doing this. If you have an image of your work saving humanity in the distant future, you should be able to use that image to sell something to investors. It would be insanely speculative, a giant pile of what-ifs with no guarantee of any of it cashing out. But at least it’s better than NFTs.

Learning for a Living

It’s a question I’ve now heard several times, in different forms. People hear that I’ll be hired as a researcher at an institute of theoretical physics, and they ask, “what, exactly, are they paying you to research?”

The answer, with some caveats: “Whatever I want.”

When a company hires a researcher, they want to accomplish specific things: to improve their products, to make new ones, to cut down on fraud or out-think the competition. Some government labs are the same: if you work for NIST, for example, your work should contribute in some way to achieving more precise measurements and better standards for technology.

Other government labs, and universities, are different. They pursue basic research, research not on any specific application but on the general principles that govern the world. Researchers doing basic research are given a lot of freedom, and that freedom increases as their careers go on.

As a PhD student, a researcher is a kind of apprentice, working for their advisor. Even then, they have some independence: an advisor may suggest projects, but PhD students usually need to decide how to execute them on their own. In some fields, there can be even more freedom: in theoretical physics, it’s not unusual for the more independent students to collaborate with people other than just their advisor.

Postdocs, in turn, have even more freedom. In some fields they get hired to work on a specific project, but they tend to have more freedom as to how to execute it than a PhD student would. Other fields give them more or less free rein: in theoretical physics, a postdoc will have some guidance, but often will be free to work on whatever they find interesting.

Professors, and other long-term researchers, have the most freedom of all. Over the climb from PhD to postdoc to professor, researchers build judgement, demonstrating a track record for tackling worthwhile scientific problems. Universities, and institutes of basic research, trust that judgement. They hire for that judgement. They give their long-term researchers free rein to investigate whatever questions they think are valuable.

In practice, there are some restrictions. Usually, you’re supposed to research in a particular field: at an institute for theoretical physics, I should probably research theoretical physics. (But that can mean many things: one of my future colleagues studies the science of cities.) Further pressure comes from grant funding: money you need to hire other researchers or buy equipment, and money that can come with restrictions attached. When you apply for a grant, you have to describe what you plan to do. (In practice, grant agencies are more flexible about this than you might expect, allowing all sorts of changes if you have a good reason…but you still can’t completely reinvent yourself.) Your colleagues themselves also have an impact: it’s much easier to work on something when you can walk down the hall and ask an expert when you get stuck. It’s why we seek out colleagues who care about the same big questions as we do.

Overall, though, research is one of the freest professions there is. If you can get a job learning for a living, and do it well enough, then people will trust your judgement. They’ll set you free to ask your own questions, and seek your own answers.

Traveling This Week

I’m traveling this week, so this will just be a short post. This isn’t a scientific trip exactly: I’m in Poland, at an event connected to the 550th anniversary of the birth of Copernicus.


Part of this event involved visiting the Copernicus Science Center, the local children’s science museum. The place was sold out completely. For any tired science communicators, I recommend going to a sold-out science museum: the sheer enthusiasm you’ll find there is balm for the most jaded soul.

Whatever Happened to the Nonsense Merchants?

I was recently reminded that Michio Kaku exists.

In the past, Michio Kaku made important contributions to string theory, but he’s best known for what could charitably be called science popularization. He’s an excited promoter of physics and technology, but that excitement often strays into inaccuracy. Pretty much every time I’ve heard him mentioned, it’s for some wildly overenthusiastic statement about physics that, rather than just being simplified for a general audience, is generally flat-out wrong, conflating a bunch of different developments in a way that makes zero actual sense.

Michio Kaku isn’t unique in this. There’s a whole industry in making nonsense statements about science, overenthusiastic books and videos hinting at science fiction or mysticism. Deepak Chopra is a famous figure from deeper on this spectrum, known for peddling loosely quantum-flavored spirituality.

There was a time I was worried about this kind of thing. Super-popular misinformation is the bogeyman of the science popularizer, the worry that for every nice, careful explanation we give, someone else will give a hundred explanations that are way more exciting and total baloney. Somehow, though, I hear less and less from these people over time, and thus worry less and less about them.

Should I be worried more? I’m not sure.

Are these people less popular than they used to be? Is that why I’m hearing less about them? Possibly, but I’d guess not. Michio Kaku has eight hundred thousand Twitter followers. Deepak Chopra has three million. On the other hand, the usually-careful Brian Greene has a million followers, and Neil deGrasse Tyson, where the worst I’ve heard is that he can be superficial, has fourteen million.

(But then in practice, I’m more likely to reflect on content with even smaller audiences.)

If misinformation is this popular, shouldn’t I be doing more to combat it?

Popular misinformation is also going to be popular among critics. For every big-time nonsense merchant, there are dozens of people breaking down and debunking every false statement they say, every piece of hype they release. Often, these people will end up saying the same kinds of things over and over again.

If I can be useful, I don’t think it will be by saying the same thing over and over again. I come up with new metaphors, new descriptions, new explanations. I clarify things others haven’t clarified, I clear up misinformation others haven’t addressed. That feels more useful to me, especially in a world where others are already countering the big problems. I write, and writing lasts, and can be used again and again when needed. I don’t need to keep up with the Kakus and Chopras of the world to do that.

(Which doesn’t imply I’ll never address anything one of those people says…but if I do, it will be because I have something new to say back!)

Why Are Universities So International?

Worldwide, only about one in thirty people live in a different country from where they were born. Wander onto a university campus, though, and you may get a different impression. The bigger the university and the stronger its research, the more international its employees become. You’ll see international PhD students, international professors, and especially international temporary researchers like postdocs.

I’ve met quite a few people who are surprised by this. I hear the same question again and again, from curious Danes at outreach events to a tired border guard in the pre-clearance area of the Toronto airport: why are you, an American, working here?

It’s not, on the face of it, an unreasonable question. Moving internationally is hard and expensive. You may have to take your possessions across the ocean, learn new languages and customs, and navigate an unfamiliar bureaucracy. You begin as a temporary resident, not a citizen, with all the risks and uncertainty that involves. Given a choice, most people choose to stay close to home. Countries sometimes back up this choice with rules of their own: in many places, laws demand that, given a choice, companies hire a local instead of a foreigner. In some places these laws apply to universities as well. With all that weight, why do so many researchers move abroad?

Two different forces stir the pot, making universities international: specialization, and diversification.

We researchers may find it easier to live close to the people we grew up with, but we work better near people who share our research interests. Science, and scholarship more generally, are often collaborative: we need to discuss with and learn from others to make progress. That’s still very hard to do remotely: it requires serendipity, chance encounters in the corridor and chats at the lunch table. As researchers in general have become more specialized, we’ve gotten to the point where not just any university will do: the people who do our kind of work are few enough that we often have to go to other countries to find them.

Specialization alone would tend to lead to extreme clustering, with researchers in each area gathering in only a few places. Universities push back against this, though. A university wants to maximize the chance that one of their researchers makes a major breakthrough, so they don’t want to hire someone whose work will just be a copy of someone they already have. They want to encourage interdisciplinary collaboration, to try to get people in different areas to talk to each other. Finally, they want to offer a wide range of possible courses, to give the students (many of whom are still local), a chance to succeed at many different things. As a result, universities try to diversify their faculty, to hire people from areas that, while not too far for meaningful collaboration, are distinct from what their current employees are doing.

The result is a constant international churn. We search for jobs in a particular sweet spot: with people close enough to spur good discussion, but far enough to not overspecialize. That search takes us all over the world, and all but guarantees we won’t find a job where we were trained, let alone where we were born. It makes universities quite international places, with a core of local people augmented by opportune choices from around the world. It makes us, and the way we lead our lives, quite unusual on a global scale. But it keeps the science fresh, and the ideas moving.

AI Is the Wrong Sci-Fi Metaphor

Over the last year, some people felt like they were living in a science fiction novel. Last November, the research laboratory OpenAI released ChatGPT, a program that can answer questions on a wide variety of topics. Last month, they announced GPT-4, a more powerful version of ChatGPT’s underlying program. Already in February, Microsoft used GPT-4 to add a chatbot feature to its search engine Bing, which journalists quickly managed to use to spin tales of murder and mayhem.

For those who have been following these developments, things don’t feel quite so sudden. Already in 2019, AI Dungeon showed off how an early version of GPT could be used to mimic an old-school text-adventure game, and a tumblr blogger built a bot that imitates his posts as a fun side project. Still, the newer programs have shown some impressive capabilities.

Are we close to “real AI”, to artificial minds like the positronic brains in Isaac Asimov’s I, Robot? I can’t say, in part because I’m not sure what “real AI” really means. But if you want to understand where things like ChatGPT come from, how they work and why they can do what they do, then all the talk of AI won’t be helpful. Instead, you need to think of an entirely different set of Asimov novels: the Foundation series.

While Asimov’s more famous I, Robot focused on the science of artificial minds, the Foundation series is based on a different fictional science, the science of psychohistory. In the stories, psychohistory is a kind of futuristic social science. In the real world, historians and sociologists can find general principles of how people act, but don’t yet have the kind of predictive theories physicists or chemists do. Foundation imagines a future where powerful statistical methods have allowed psychohistorians to precisely predict human behavior: not yet that of individual people, but at least the average behavior of civilizations. They can not only guess when an empire is soon to fall, but calculate how long it will be before another empire rises, something few responsible social scientists would pretend to do today.

GPT and similar programs aren’t built to predict the course of history, but they do predict something: given part of a text, they try to predict the rest. They’re called Large Language Models, or LLMs for short. They’re “models” in the sense of mathematical models, formulas that let us use data to make predictions about the world, and the part of the world they model is our use of language.

Normally, a mathematical model is designed based on how we think the real world works. A mathematical model of a pandemic, for example, might use a list of people, each one labeled as infected or not. It could include an unknown number, called a parameter, for the chance that one person infects another. That parameter would then be filled in, or fixed, based on observations of the pandemic in the real world.

LLMs (as well as most of the rest of what people call “AI” these days) are a bit different. Their models aren’t based on what we expect about the real world. Instead, they’re in some sense “generic”, models that could in principle describe just about anything. In order to make this work, they have a lot more parameters, tons and tons of flexible numbers that can get fixed in different ways based on data.

(If that part makes you a bit uncomfortable, it bothers me too, though I’ve mostly made my peace with it.)
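As a toy illustration of the difference between the two kinds of model, here’s a sketch with made-up data; the infection fit and the “generic” many-parameter fit below are my own examples, and have nothing to do with how GPT itself is built.

```python
import numpy as np

# Made-up "observed" case counts over ten days.
days = np.arange(10)
cases = np.array([2, 3, 5, 7, 11, 16, 25, 36, 55, 80], dtype=float)

# Mechanistic model: assume cases grow by a fixed factor each day, a single
# parameter standing in for "the chance one person infects another".
growth_factor = np.exp(np.polyfit(days, np.log(cases), 1)[0])

# "Generic" model: a flexible curve with many parameters, based on no picture
# of how infections actually spread -- it could fit almost any shape of data.
generic_coefficients = np.polyfit(days, cases, deg=7)

print(f"Mechanistic model: 1 parameter, daily growth factor {growth_factor:.2f}")
print(f"Generic model: {len(generic_coefficients)} parameters")
```

The mechanistic model bakes in what we think is happening; the generic one just has enough knobs to match whatever the data turn out to be. LLMs are an extreme version of the second kind.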

The surprising thing is that this works, and works remarkably well. Just as psychohistory from the Foundation novels can predict events with much more detail than today’s historians and sociologists, LLMs can predict what a text will look like much more precisely than today’s literature professors. That isn’t necessarily because LLMs are “intelligent”, or because they’re “copying” things people have written. It’s because they’re mathematical models, built by statistically analyzing a giant pile of texts.

Just as Asimov’s psychohistory can’t predict the behavior of individual people, LLMs can’t predict the behavior of individual texts. If you start writing something, you shouldn’t expect an LLM to predict exactly how you would finish. Instead, LLMs predict what, on average, the rest of the text would look like. They give a plausible answer, one of many, for what might come next.
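To make that concrete, here’s a drastically scaled-down toy of the idea, with none of GPT’s sophistication: a model built by statistically analyzing a (tiny) pile of text, which then samples one plausible continuation among many.

```python
import random
from collections import defaultdict

# A (very) small pile of text to analyze statistically.
corpus = (
    "the universe is expanding and the universe is old and "
    "the galaxies are old and the galaxies are expanding"
).split()

# Record which words follow which word in the pile.
following = defaultdict(list)
for word, next_word in zip(corpus, corpus[1:]):
    following[word].append(next_word)

def continue_text(first_word, length=6):
    """Sample one plausible continuation -- one of many, not *the* answer."""
    words = [first_word]
    for _ in range(length):
        options = following.get(words[-1])
        if not options:
            break
        words.append(random.choice(options))
    return " ".join(words)

print(continue_text("the"))  # e.g. "the universe is old and the galaxies"
```

A real LLM does the same kind of job with vastly more text and vastly more parameters, predicting a probability for every possible continuation rather than keeping a simple list of what it has seen.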

They can’t do that perfectly, but doing it imperfectly is enough to do quite a lot. It’s why they can be used to make chatbots, by predicting how someone might plausibly respond in a conversation. It’s why they can write fiction, or ads, or college essays, by predicting a plausible response to a book jacket or ad copy or essay prompt.

LLMs like GPT were invented by computer scientists, not social scientists or literature professors. Because of that, they get described as part of progress towards artificial intelligence, not as progress in social science. But if you want to understand what ChatGPT is right now, and how it works, then that perspective won’t be helpful. You need to put down your copy of I, Robot and pick up Foundation. You’ll still be impressed, but you’ll have a clearer idea of what could come next.