Monthly Archives: April 2023

Whatever Happened to the Nonsense Merchants?

I was recently reminded that Michio Kaku exists.

In the past, Michio Kaku made important contributions to string theory, but he’s best known for what could charitably be called science popularization. He’s an excited promoter of physics and technology, but that excitement often strays into inaccuracy. Pretty much every time I’ve heard him mentioned, it’s for some wildly overenthusiastic statement about physics that, rather than merely being simplified for a general audience, is flat-out wrong, conflating a bunch of different developments in a way that makes zero actual sense.

Michio Kaku isn’t unique in this. There’s a whole industry devoted to making nonsense statements about science: overenthusiastic books and videos hinting at science fiction or mysticism. Deepak Chopra is a famous figure from deeper on this spectrum, known for peddling loosely quantum-flavored spirituality.

There was a time I was worried about this kind of thing. Super-popular misinformation is the bogeyman of the science popularizer, the worry that for every nice, careful explanation we give, someone else will give a hundred explanations that are way more exciting and total baloney. Somehow, though, I hear less and less from these people over time, and thus worry less and less about them.

Should I be worried more? I’m not sure.

Are these people less popular than they used to be? Is that why I’m hearing less about them? Possibly, but I’d guess not. Michio Kaku has eight hundred thousand Twitter followers. Deepak Chopra has three million. On the other hand, the usually-careful Brian Greene has a million followers, and Neil deGrasse Tyson, about whom the worst I’ve heard is that he can be superficial, has fourteen million.

(But then in practice, I’m more likely to reflect on content with even smaller audiences.)

If misinformation is this popular, shouldn’t I be doing more to combat it?

Popular misinformation is also going to be popular among critics. For every big-time nonsense merchant, there are dozens of people breaking down and debunking every false statement they make, every piece of hype they release. Often, these critics end up saying the same kinds of things over and over again.

If I can be useful, I don’t think it will be by saying the same thing over and over again. I come up with new metaphors, new descriptions, new explanations. I clarify things others haven’t clarified, and I clear up misinformation others haven’t addressed. That feels more useful to me, especially in a world where others are already countering the big problems. I write, and writing lasts, and can be used again and again when needed. I don’t need to keep up with the Kakus and Chopras of the world to do that.

(Which doesn’t imply I’ll never address anything one of those people says…but if I do, it will be because I have something new to say back!)

What’s a Cosmic String?

Nowadays, we have telescopes that detect not just light, but gravitational waves. We’ve already learned quite a bit about astrophysics from these telescopes. They observe ripples coming from colliding black holes, giving us a better idea of what kinds of black holes exist in the universe. But the coolest thing a gravitational wave telescope could discover is something that hasn’t been seen yet: a cosmic string.


You might have heard of cosmic strings, but unless you’re a physicist you probably don’t know much about them. They’re a prediction, coming from cosmology, of giant string-like objects floating out in space.

That might sound like it has something to do with string theory, but it doesn’t have to: you can have these things without any string theory at all. Instead, you might have heard that cosmic strings are some kind of “cracks” or “wrinkles” in space-time. Some articles describe this as like what happens when ice freezes, cracks forming as the water settles into a crystal.

That description, in terms of ice forming cracks between crystals, is great…if you’re a physicist who already knows how ice forms cracks between crystals. If you’re not, I’m guessing reading those kinds of explanations isn’t helpful. I’m guessing you’re still wondering why there ought to be any giant strings floating in space.

The real explanation has to do with a type of mathematical gadget physicists use, called a scalar field. You can think of a scalar field as described by a number, like a temperature, that can vary in space and time. The field carries potential energy, and that energy depends on what the scalar field’s “number” is. Left alone, the field settles into a situation with as little potential energy as it can, like a ball rolling down a hill. That situation is one of the field’s default values, something we call a “vacuum” value. Changing the field away from its vacuum value can take a lot of energy. The Higgs field is one example of a scalar field. Its vacuum value is the value it has in day-to-day life. In order to make a detectable Higgs boson at the Large Hadron Collider, physicists needed to kick the field away from its vacuum value, and that took a lot of energy.

In the very early universe, almost back at the Big Bang, the world was famously in a hot dense state. That hot dense state meant that there was a lot of energy to go around, so scalar fields could vary far from their vacuum values, pretty much randomly. As the universe expanded and cooled, there was less and less energy available for these fields, and they started to settle down.

Now, the thing about these default, “vacuum” values of a scalar field is that there doesn’t have to be just one of them. Depending on what kind of mathematical function the field’s potential energy is, there could be several different possibilities, each with equal energy.

Let’s imagine a simple example, of a field with two vacuum values: +1 and -1. As the universe cooled down, some parts of the universe would end up with that scalar field number equal to +1, and some to -1. But what happens in between?

The scalar field can’t just jump from -1 to +1: fields in physics have to change smoothly, so it has to pass through 0 in between. But, unlike -1 and +1, 0 is not a vacuum value. When the scalar field number is equal to 0, the field has more energy than it does when it’s equal to -1 or +1. Usually, a lot more energy.
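
If you like seeing this in code, here’s a minimal sketch in Python. The potential is a toy I made up for illustration, V(\phi) = (\phi^2 - 1)^2, chosen so its vacuum values sit at exactly +1 and -1; it’s not the actual form of any real field’s potential energy:

import numpy as np

# Toy potential with two vacuum values: V(phi) = (phi^2 - 1)^2
# vanishes at phi = +1 and phi = -1, and is positive everywhere else.
def V(phi):
    return (phi**2 - 1)**2

def dV(phi):  # the slope of the potential
    return 4 * phi * (phi**2 - 1)

print(V(1), V(-1), V(0))  # 0 0 1: a field stuck at 0 carries extra energy

# "Ball rolling down a hill": start at a random value, step downhill
# until the field settles into one of its vacua.
rng = np.random.default_rng()
phi = rng.uniform(-2, 2)
for _ in range(10_000):
    phi -= 0.01 * dV(phi)
print(phi)  # very close to +1 or -1, depending on where it started

Run it a few times: some runs settle at +1, others at -1. That’s the early universe in miniature, with different regions landing in different vacua at random.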

That means the region of scalar field number 0 can’t spread very far: the further it spreads, the more energy it takes to keep it that way. On the other hand, the region can’t vanish altogether: something needs to happen to transition between the numbers -1 and +1.

The thing that happens is called a domain wall. A domain wall is a thin sheet, as thin as it can physically be, where the scalar field doesn’t take its vacuum value. You can roughly think of it as made up of the scalar field, a churning zone of the kind of bosons the LHC was trying to detect.
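
For the toy potential above, you can even write the wall’s shape down exactly: \phi(x) = \tanh(\sqrt{2} x) is the standard “kink” solution for that potential. Here’s a sketch (again, an illustration of the toy model, not a profile of any real-world wall):

import numpy as np

# The wall profile for V(phi) = (phi^2 - 1)^2: phi(x) = tanh(sqrt(2) x),
# which is -1 far to the left of the wall and +1 far to the right.
x = np.linspace(-4, 4, 9)
phi = np.tanh(np.sqrt(2) * x)

# Energy density: gradient energy plus potential energy.
dphi = np.sqrt(2) * (1 - phi**2)  # the slope of the profile
energy = 0.5 * dphi**2 + (phi**2 - 1)**2

for xi, p, e in zip(x, phi, energy):
    print(f"x = {xi:+.1f}  phi = {p:+.3f}  energy density = {e:.6f}")
# Nearly all the energy sits in a thin slab around x = 0: the wall.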

This sheet still has a lot of energy, bound up in the unusual value of the scalar field, like an LHC collision in every proton-sized chunk. As such, like any object with a lot of energy, it has a gravitational field. For a domain wall, the effect of this gravity would be very very dramatic: so dramatic, that we’re pretty sure they’re incredibly rare. If they were at all common, we would have seen evidence of them long before now!

Ok, I’ve shown you a wall, that’s weird, sure. What does that have to do with cosmic strings?

The number representing a scalar field doesn’t have to be a real number: it can be imaginary instead, or even complex. Now I’d like you to imagine a field with vacuum values on the unit circle, in the complex plane. That means that +1 and -1 are still vacuum values, but so are e^{i \pi/2}, and e^{3 i \pi/2}, and everything else you can write as e^{i\theta}. However, 0 is still not a vacuum value. Neither is, for example, 2 e^{i\pi/3}.
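
In code, with a “Mexican hat” toy potential V(\phi) = (|\phi|^2 - 1)^2 (my invented example again), you can check which values are vacuum values:

import numpy as np

# Mexican-hat toy potential: V(phi) = (|phi|^2 - 1)^2 vanishes on the
# whole unit circle, so every e^{i theta} is a vacuum value.
def V(phi):
    return (abs(phi)**2 - 1)**2

for phi in [1, -1, np.exp(1j * np.pi / 2), np.exp(3j * np.pi / 2),
            0, 2 * np.exp(1j * np.pi / 3)]:
    print(f"{V(phi):.3f}")  # 0.000 for the first four; then 1.000 and 9.000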

With vacuum values like this, you can’t form domain walls. You can make a path between -1 and +1 that stays on the unit circle, passing through e^{i \pi/2} for example. The field will be at a vacuum value the whole way along, taking no extra energy.

However, imagine that the different regions form a circle. Suppose the region at the bottom of the circle is at vacuum value -1 and the region at the top is at +1. You might have e^{i \pi/2} in the region on one side, and e^{3 i \pi/2} in the region on the other, covering the whole unit circle smoothly as you go around.

Now, think about what happens in the middle of the circle. On one side of the circle, you have -1. On the other, +1. (Or, on one side e^{i \pi/2}, on the other, e^{3 i \pi/2}.) No matter what, different sides of the circle are not allowed to sit right next to each other: you can’t just jump between them. So in the very middle of the circle, something else has to happen.

Once again, that something else is a field that goes away from its vacuum value, that passes through 0. Once again, that takes a lot of energy, so it occupies as little space as possible. But now, that space isn’t a giant wall. Instead, it’s a squiggly line: a cosmic string.
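
You can detect a string the same way in code: walk around a loop, track the field’s phase, and count how many full turns it makes. This winding-number counter is my own illustrative sketch:

import numpy as np

def winding_number(phases):
    # Add up the phase changes around the loop, each wrapped to lie
    # between -pi and pi, then count how many full turns they make.
    steps = np.diff(np.concatenate([phases, phases[:1]]))
    steps = (steps + np.pi) % (2 * np.pi) - np.pi
    return round(steps.sum() / (2 * np.pi))

# A loop where the phase smoothly covers the whole circle once...
around_string = np.linspace(0, 2 * np.pi, 100, endpoint=False)
# ...and a loop where the field sits at the single vacuum e^{i pi/2}.
no_string = np.full(100, np.pi / 2)

print(winding_number(around_string))  # 1: a string must pierce this loop
print(winding_number(no_string))      # 0: no string required

A nonzero count can’t be undone by smoothly nudging the field, which is why the string can’t simply fade away: it’s trapped by the way the vacuum wraps around it.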

Cosmic strings don’t have as dramatic a gravitational effect as domain walls. That means they might not be super-rare. There might be some we haven’t seen yet. And if we do see them, it could be because they wiggle space and time, making gravitational waves.

Cosmic strings don’t require string theory: they come from a much more basic gadget, scalar fields. We know there is one quite important scalar field, the Higgs field. The Higgs field’s vacuum values aren’t like +1 and -1, or like the unit circle, though, so the Higgs by itself won’t make domain walls or cosmic strings. But there are a lot of proposals for scalar fields, things we haven’t discovered but that physicists think might answer lingering questions in particle physics, and some of those could have the right kind of vacuum values to give us cosmic strings. Thus, if we manage to detect cosmic strings, we could learn something about one of those lingering questions.

Why Are Universities So International?

Worldwide, only about one in thirty people live in a different country from where they were born. Wander onto a university campus, though, and you may get a different impression. The bigger the university and the stronger its research, the more international its employees become. You’ll see international PhD students, international professors, and especially international temporary researchers like postdocs.

I’ve met quite a few people who are surprised by this. I hear the same question again and again, from curious Danes at outreach events to a tired border guard in the pre-clearance area of the Toronto airport: why are you, an American, working here?

It’s not, on the face of it, an unreasonable question. Moving internationally is hard and expensive. You may have to take your possessions across the ocean, learn new languages and customs, and navigate an unfamiliar bureaucracy. You begin as a temporary resident, not a citizen, with all the risks and uncertainty that involves. Given a choice, most people choose to stay close to home. Countries sometimes back up this choice with additional incentives. There are laws in many places that demand that, given a choice, companies hire a local instead of a foreigner. In some places these laws apply to universities as well. With all that weight, why do so many researchers move abroad?

Two different forces stir the pot, making universities international: specialization, and diversification.

We researchers may find it easier to live close to the people we grew up with, but we work better near people who share our research interests. Science, and scholarship more generally, is often collaborative: we need to discuss with and learn from others to make progress. That’s still very hard to do remotely: it requires serendipity, chance encounters in the corridor and chats at the lunch table. As researchers have become more specialized, we’ve gotten to the point where not just any university will do: the people who do our kind of work are few enough that we often have to go to other countries to find them.

Specialization alone would tend to lead to extreme clustering, with researchers in each area gathering in only a few places. Universities push back against this, though. A university wants to maximize the chance that one of its researchers makes a major breakthrough, so it doesn’t want to hire someone whose work merely duplicates that of someone already on staff. It wants to encourage interdisciplinary collaboration, to get people in different areas talking to each other. Finally, it wants to offer a wide range of courses, to give the students (many of whom are still local) a chance to succeed at many different things. As a result, universities try to diversify their faculty, hiring people from areas that, while not too far for meaningful collaboration, are distinct from what their current employees are doing.

The result is a constant international churn. We search for jobs in a particular sweet spot: with people close enough to spur good discussion, but far enough to not overspecialize. That search takes us all over the world, and all but guarantees we won’t find a job where we were trained, let alone where we were born. It makes universities quite international places, with a core of local people augmented by opportune choices from around the world. It makes us, and the way we lead our lives, quite unusual on a global scale. But it keeps the science fresh, and the ideas moving.

AI Is the Wrong Sci-Fi Metaphor

Over the last year, some people felt like they were living in a science fiction novel. Last November, the research laboratory OpenAI released ChatGPT, a program that can answer questions on a wide variety of topics. Last month, they announced GPT-4, a more powerful version of ChatGPT’s underlying program. Already in February, Microsoft had used GPT-4 to add a chatbot feature to its search engine Bing, which journalists quickly managed to use to spin tales of murder and mayhem.

For those who have been following these developments, things don’t feel quite so sudden. Already in 2019, AI Dungeon showed off how an early version of GPT could be used to mimic an old-school text-adventure game, and a Tumblr blogger built a bot that imitates his posts as a fun side project. Still, the newer programs have shown some impressive capabilities.

Are we close to “real AI”, to artificial minds like the positronic brains in Isaac Asimov’s I, Robot? I can’t say, in part because I’m not sure what “real AI” really means. But if you want to understand where things like ChatGPT come from, how they work and why they can do what they do, then all the talk of AI won’t be helpful. Instead, you need to think of an entirely different set of Asimov novels: the Foundation series.

While Asimov’s more famous I, Robot focused on the science of artificial minds, the Foundation series is based on a different fictional science, the science of psychohistory. In the stories, psychohistory is a kind of futuristic social science. In the real world, historians and sociologists can find general principles of how people act, but don’t yet have the kind of predictive theories physicists or chemists do. Foundation imagines a future where powerful statistical methods have allowed psychohistorians to precisely predict human behavior: not yet that of individual people, but at least the average behavior of civilizations. They can not only guess when an empire is soon to fall, but calculate how long it will be before another empire rises, something few responsible social scientists would pretend to do today.

GPT and similar programs aren’t built to predict the course of history, but they do predict something: given part of a text, they try to predict the rest. They’re called Large Language Models, or LLMs for short. They’re “models” in the sense of mathematical models, formulas that let us use data to make predictions about the world, and the part of the world they model is our use of language.

Normally, a mathematical model is designed based on how we think the real world works. A mathematical model of a pandemic, for example, might use a list of people, each one labeled as infected or not. It could include an unknown number, called a parameter, for the chance that one person infects another. That parameter would then be filled in, or fixed, based on observations of the pandemic in the real world.
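
As a concrete sketch (with numbers I made up for illustration), fixing that one parameter from data might look like this:

import numpy as np

# Toy pandemic model with a single parameter: the chance p that a
# contact between an infected and a healthy person transmits.
rng = np.random.default_rng(0)
true_p = 0.3  # in the real world, we wouldn't know this

# Pretend observations: 1000 contacts, recording which ones transmitted.
transmitted = rng.random(1000) < true_p

# Fix the parameter from the data: the best estimate of p is simply
# the observed fraction of contacts that transmitted.
estimated_p = transmitted.mean()
print(f"estimated p = {estimated_p:.3f}")  # close to the true 0.3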

LLMs (as well as most of the rest of what people call “AI” these days) are a bit different. Their models aren’t based on what we expect about the real world. Instead, they’re in some sense “generic”, models that could in principle describe just about anything. In order to make this work, they have a lot more parameters, tons and tons of flexible numbers that can get fixed in different ways based on data.

(If that part makes you a bit uncomfortable, it bothers me too, though I’ve mostly made my peace with it.)

The surprising thing is that this works, and works remarkably well. Just as psychohistory from the Foundation novels can predict events in much more detail than today’s historians and sociologists, LLMs can predict what a text will look like much more precisely than today’s literature professors. That isn’t necessarily because LLMs are “intelligent”, or because they’re “copying” things people have written. It’s because they’re mathematical models, built by statistically analyzing a giant pile of texts.

Just as Asimov’s psychohistory can’t predict the behavior of individual people, LLMs can’t predict the behavior of individual texts. If you start writing something, you shouldn’t expect an LLM to predict exactly how you would finish. Instead, LLMs predict what, on average, the rest of the text would look like. They give a plausible answer, one of many, for what might come next.
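
To make that concrete, here’s about the smallest language model I can write: a “bigram” model that counts which word follows which in a made-up, three-sentence corpus, then samples continuations from those counts. GPT has enormously more parameters and context, but it’s the same species of object:

import random
from collections import Counter, defaultdict

# A tiny corpus, purely for illustration.
corpus = ("the cat sat on the mat . the dog sat on the rug . "
          "the cat chased the dog .").split()

# The model's "parameters": counts of which word follows which.
follows = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    follows[a][b] += 1

def continue_text(word, length=6):
    out = [word]
    for _ in range(length):
        options = follows[out[-1]]
        if not options:
            break
        # One plausible answer among many: sample according to the counts.
        out.append(random.choices(list(options), weights=options.values())[0])
    return " ".join(out)

print(continue_text("the"))  # e.g. "the cat sat on the rug ."

Different runs give different continuations, each statistically plausible, and none of them a prediction of what any individual writer would actually have written.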

They can’t do that perfectly, but doing it imperfectly is enough to do quite a lot. It’s why they can be used to make chatbots, by predicting how someone might plausibly respond in a conversation. It’s why they can write fiction, or ads, or college essays, by predicting a plausible response to a book jacket or ad copy or essay prompt.

LLMs like GPT were invented by computer scientists, not social scientists or literature professors. Because of that, they get described as part of progress towards artificial intelligence, not as progress in social science. But if you want to understand what ChatGPT is right now, and how it works, then that perspective won’t be helpful. You need to put down your copy of I, Robot and pick up Foundation. You’ll still be impressed, but you’ll have a clearer idea of what could come next.