# What Are Students? We Just Don’t Know

I’m taking a pedagogy course at the moment, a term-long follow-up to the one-week intro course I took in the spring. The course begins with yet another pedagogical innovation, a “pre-project”. Before the course has really properly started, we get assembled into groups and told to investigate our students. We are supposed to do interviews on a few chosen themes, all with the objective of getting to know our students better. I’m guessing the point is to sharpen our goals, so that when we start learning pedagogy we’ll have a clearer idea of what problems we’d like to solve.

The more I think about this the more I’m looking forward to it. When I TAed in the past, some of the students were always a bit of a mystery. They sat in the back, skipped assignments, and gradually I saw less and less of them. They didn’t go to office hours or the help room, and I always wondered what happened. When in the course did they “turn off”, when did we lose them? They seemed like a kind of pedagogical dark matter, observable only by their presence on the rosters. I’m hoping to detect a little of that dark matter here.

As it’s a group project, we came up with a theme as a group, and questions to support that theme (in particular, we’re focusing on the different experiences between Danes and international students). Since the topic is on my mind in general though, I thought it would be fun to reach out to you guys. Educators in the comments: if you could ask your students one question, what would it be? Students, what is one thing you think your teachers are missing?

# Of p and sigma

Ask a doctor or a psychologist if they’re sure about something, and they might say “it has p<0.05”. Ask a physicist, and they’ll say it’s a “5 sigma result”. On the surface, they sound like they’re talking about completely different things. As it turns out, they’re not quite that different.

Whether it’s a p-value or a sigma, what scientists are giving you is shorthand for a probability. The p-value is the probability itself, while sigma tells you how many standard deviations something is away from the mean on a normal distribution. For people not used to statistics this might sound very complicated, but it’s not so tricky in the end. There’s a graph, called a normal distribution, and you can look at how much of it is above a certain point, measured in units called standard deviations, or “sigmas”. That gives you your probability.
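To make the conversion concrete, here is a minimal Python sketch of going back and forth between sigmas and probabilities. It assumes the one-sided convention (only the tail above the point counts), which is what particle physicists usually quote. The function names `sigma_to_p` and `p_to_sigma` are made up for illustration.

```python
import math
from statistics import NormalDist


def sigma_to_p(n_sigma: float) -> float:
    """One-sided tail probability of a standard normal above n_sigma."""
    # erfc(x / sqrt(2)) is the two-sided tail probability beyond x sigma;
    # halve it to keep only the upper tail.
    return 0.5 * math.erfc(n_sigma / math.sqrt(2))


def p_to_sigma(p: float) -> float:
    """Number of sigmas whose one-sided upper tail has probability p (0 < p < 1)."""
    # inv_cdf(p) is the point with probability p *below* it; by symmetry,
    # its negative is the point with probability p *above* it.
    return -NormalDist().inv_cdf(p)


if __name__ == "__main__":
    p5 = sigma_to_p(5)
    print(f"5 sigma  -> p = {p5:.3g}, about one in {1 / p5:,.0f}")
    print(f"p = 0.05 -> {p_to_sigma(0.05):.2f} sigma")
```

Under this convention, a p-value of 0.05 sits a bit above one and a half sigma, while 5 sigma is a probability of a few in ten million. A two-sided convention, common in medicine, would double the tail probability for the same number of sigmas.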

What are these numbers a probability of? At first, you might think they’re a probability of the scientist being right: of the medicine working, or the Higgs boson being there.

That would be reasonable, but it’s not how it works. Scientists can’t measure the chance they’re right. All they can do is compare models. When a scientist reports a p-value, what they’re doing is comparing to a kind of default model, called a “null hypothesis”. There are different null hypotheses for different experiments, depending on what the scientists want to test. For the Higgs, scientists looked at pairs of photons detected by the LHC. The null hypothesis was that these photons were created by other parts of the Standard Model, like the strong force, and not by a Higgs boson. For medicine, the null hypothesis might be that people get better on their own after a certain amount of time. That’s hard to estimate, which is why medical experiments use a control group: a similar group without the medicine, to see how much they get better on their own.

Once we have a null hypothesis, we can use it to estimate how likely it is that it produced the result of the experiment. If there was no Higgs, and all those photons just came from other particles, what’s the chance there would still be a giant pile of them at one specific energy? If the medicine didn’t do anything, what’s the chance the control group did that much worse than the treatment group?
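As a toy version of that question, suppose the null hypothesis predicts some average number of background events in a given energy window, and the experiment counts more than that. The sketch below assumes the counts follow a Poisson distribution, a common simplification for counting experiments. The numbers (100 expected, 130 or 150 observed) are invented for illustration and are not from any real analysis.

```python
import math


def poisson_tail_p(expected_background: float, observed: int) -> float:
    """P(N >= observed) if N is Poisson with mean expected_background.

    This is the p-value under a background-only null hypothesis:
    the chance of seeing at least this many events with no signal at all.
    """
    term = math.exp(-expected_background)  # P(N = 0)
    below = 0.0
    for k in range(observed):
        below += term                        # accumulate P(N = k) for k < observed
        term *= expected_background / (k + 1)  # step from P(N = k) to P(N = k + 1)
    # Note: for extremely small tails, 1 - below loses precision;
    # that is acceptable for a sketch like this.
    return 1.0 - below


if __name__ == "__main__":
    # Hypothetical numbers: 100 background events expected in the window.
    for observed in (130, 150):
        p = poisson_tail_p(100.0, observed)
        print(f"observed {observed}: p = {p:.3g}")
```

The bigger the pile of events relative to what the null hypothesis expects, the smaller this probability gets. A real analysis has to handle much more than this, such as uncertainty in the background estimate and the fact that the pile could have appeared at many different energies.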

Ideally, you want a small probability here. In medicine and psychology, you’re looking for a probability below 5%, that is, p<0.05. In physics, you need 5 sigma to make a discovery, which corresponds to a one in 3.5 million probability. If the probability is low, then you can say that it would be quite unlikely for your result to happen if the null hypothesis were true. If you’ve got a better hypothesis (the Higgs exists, the medicine works), then you should pick that instead.

Note that this probability still uses a model: it’s the probability of the result given that the model is true. It isn’t the probability that the model is true, given the result. That probability is more important to know, but trickier to calculate. To get from one to the other, you need to include more assumptions: about how likely your model was to begin with, given everything else you know about the world. Depending on those assumptions, even the tiniest p-value might not show that your null hypothesis is wrong.
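The step from one probability to the other is Bayes’ theorem. Here is a rough sketch for the simplest case, with just two competing models. Everything numerical is a made-up assumption: the prior, and the probability of the observed result under each model (a stand-in for the p-value, which strictly speaking is the probability of a result at least that extreme). The name `posterior_alternative` is hypothetical too.

```python
def posterior_alternative(prior_alt: float,
                          p_data_given_alt: float,
                          p_data_given_null: float) -> float:
    """Probability the alternative model is true, given the data.

    Assumes exactly two candidate models (null and alternative),
    so the prior for the null is 1 - prior_alt.
    """
    weight_alt = prior_alt * p_data_given_alt
    weight_null = (1.0 - prior_alt) * p_data_given_null
    return weight_alt / (weight_alt + weight_null)


if __name__ == "__main__":
    # Hypothetical: the data has a one-in-a-million chance under the null
    # and a 50% chance under the alternative.
    for prior in (0.5, 1e-3, 1e-9):
        post = posterior_alternative(prior, 0.5, 1e-6)
        print(f"prior {prior:g} -> posterior {post:.3g}")
```

With an even prior, a one-in-a-million result makes the alternative nearly certain. If the alternative started out as a one-in-a-billion long shot, the same result still leaves the null hypothesis far more likely.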

In practice, unfortunately, we usually can’t estimate all of those assumptions in detail. The best we can do is guess their effect, in a very broad way. That usually just means accepting a threshold for p-values, declaring some a discovery and others not. That limitation is part of why medicine and psychology demand p-values below 0.05, while physicists demand 5 sigma results. Medicine and psychology have some assumptions they can rely on: that people function like people, that biology and physics keep working. Physicists don’t have those assumptions, so we have to be extra-strict.

Ultimately, though, we’re all asking the same kind of question. And now you know how to understand it when we do.

# Halloween Post: Superstimuli for Physicists

For Halloween, this blog has a tradition of covering “the spooky side” of physics. This year, I’m bringing in a concept from biology to ask a spooky physics “what if?”

In the 1950s, biologists discovered that birds were susceptible to a worryingly effective trick. When given artificial eggs larger and brighter than their own, the birds focused on the new eggs to the exclusion of their real ones. They couldn’t help trying to hatch the fake eggs, even if the eggs were so large that the birds would fall off when they tried to sit on them. The effect, since observed in other species, became known as a supernormal stimulus, or superstimulus.

Can this happen to humans? Some think so. They worry about junk food we crave more than actual nutrients, or social media that eclipses our real relationships. Naturally, this idea inspires horror writers, who write about haunting music you can’t stop listening to, or holes in a wall that “fit” so well you’re compelled to climb in.

(And yes, it shows up in porn as well.)

But this is a physics blog, not a biology blog. What kind of superstimulus would work on physicists?

Well for one, this sounds a lot like some criticisms of string theory. Instead of a theory that just unifies some forces, why not unify all the forces? Instead of just learning some advanced mathematics, why not learn more, and more? And if you can’t be falsified by any experiment, well, all that would do is spoil the fun, right?

But it’s not just string theory you could apply this logic to. Astrophysicists study not just one world but many. Cosmologists study the birth and death of the entire universe. Particle physicists study the fundamental pieces that make up the fundamental pieces. We all partake in the euphoria of problem-solving, a perpetual rush where each solution leads to yet another question.

Do I actually think that string theory is a superstimulus, that astrophysics or particle physics is a superstimulus? In a word, no. Much as it might look that way from the news coverage, most physicists don’t work on these big, flashy questions. Far from being lured in by irresistible super-scale problems, most physicists work with tabletop experiments and useful materials. For those of us who do look up at the sky or down at the roots of the world, we do it not just because it’s compelling but because it has a good track record: physics wouldn’t exist if Newton hadn’t cared about the orbits of the planets. We study extremes because they advance our understanding of everything else, because they give us steam engines and transistors and change everyone’s lives for the better.

Then again, if I had fallen victim to a superstimulus, I’d say that anyway, right?

*cue spooky music*

# The Point of a Model

I’ve been reading more lately, partially for the obvious reasons. Mostly, I’ve been catching up on books everyone else already read.

One such book is Daniel Kahneman’s “Thinking, Fast and Slow”. With all the talk lately about cognitive biases, Kahneman’s account of his research on decision-making was quite familiar ground. The book turned out to be more interesting as a window into the culture of psychology research. While I had a working picture from psychologist friends in grad school, “Thinking, Fast and Slow” covered the other side, the perspective of a successful professor promoting his field.

Most of this wasn’t too surprising, but one passage struck me:

> Several economists and psychologists have proposed models of decision making that are based on the emotions of regret and disappointment. It is fair to say that these models have had less influence than prospect theory, and the reason is instructive. The emotions of regret and disappointment are real, and decision makers surely anticipate these emotions when making their choices. The problem is that regret theories make few striking predictions that would distinguish them from prospect theory, which has the advantage of being simpler. The complexity of prospect theory was more acceptable in the competition with expected utility theory because it did predict observations that expected utility theory could not explain.
>
> Richer and more realistic assumptions do not suffice to make a theory successful. Scientists use theories as a bag of working tools, and they will not take on the burden of a heavier bag unless the new tools are very useful. Prospect theory was accepted by many scholars not because it is “true” but because the concepts that it added to utility theory, notably the reference point and loss aversion, were worth the trouble; they yielded new predictions that turned out to be true. We were lucky.
>
> Thinking, Fast and Slow, page 288

Kahneman is contrasting three theories of decision making here: the old proposal that people try to maximize their expected utility (roughly, the benefit they get in future), his more complicated “prospect theory” that takes into account not only what benefits people get but their attachment to what they already have, and other more complicated models based on regret. His theory ended up more popular, both than the older theory and than the newer regret-based models.

Why did his theory win out? Apparently, not because it was the true one: as he says, people almost certainly do feel regret, and make decisions based on it. No, his theory won because it was more useful. It made new, surprising predictions, while being simpler and easier to use than the regret-based models.

This, a theory defeating another without being “more true”, might bug you. By itself, it doesn’t bug me. That’s because, as a physicist, I’m used to the idea that models should not just be true, but useful. If we want to test our theories against reality, we have a large number of “levels” of description to choose from. We can “zoom in” to quarks and gluons, or “zoom out” to look at atoms, or molecules, or polymers. We have to decide how much detail to include, and we have real pragmatic reasons for doing so: some details are just too small to measure!

It’s not clear Kahneman’s community was doing this, though. That is, it doesn’t seem like he’s saying that regret and disappointment are just “too small to be measured”. Instead, he’s saying that they don’t seem to predict much differently from prospect theory, and prospect theory is simpler to use.

Ok, we do that in physics too. We like working with simpler theories, when we have a good excuse. We’re just careful about it. When we can, we derive our simpler theories from more complicated ones, carving out complexity and estimating how much of a difference it would have made. Do this carefully, and we can treat black holes as if they were subatomic particles. When we can’t, we have what we call “phenomenological” models, models built up from observation and not from an underlying theory. We never take such models as the last word, though: a phenomenological model is always viewed as temporary, something to bridge a gap while we try to derive it from more basic physics.

Kahneman doesn’t seem to view prospect theory as temporary. It doesn’t sound like anyone is trying to derive it from regret theory, or to make regret theory easier to use, or to prove it always agrees with regret theory. Maybe they are, and Kahneman simply doesn’t think much of their efforts. Either way, it doesn’t sound like a major goal of the field.

That’s the part that bothered me. In physics, we can’t always hope to derive things from a more fundamental theory: some theories are as fundamental as we know. Psychology isn’t like that: any behavior people display has to be caused by what’s going on in their heads. What Kahneman seems to be saying here is that regret theory may well be closer to what’s going on in people’s heads, but he doesn’t care: it isn’t as useful.

And at that point, I have to ask: useful for what?

As a psychologist, isn’t your goal ultimately to answer that question? To find out “what’s going on in people’s heads”? Isn’t every model you build, every theory you propose, dedicated to that question?

And if not, what exactly is it “useful” for?

For technology? It’s true, “Thinking, Fast and Slow” describes several groups Kahneman advised, most memorably the IDF. Is the advantage of prospect theory, then, its “usefulness”, that it leads to better advice for the IDF?

I don’t think that’s what Kahneman means, though. When he says “useful”, he doesn’t mean “useful for advice”. He means it’s good for giving researchers ideas, good for getting people talking. He means “useful for designing experiments”. He means “useful for writing papers”.

And this is when things start to sound worryingly familiar. Because if I’m accusing Kahneman’s community of giving up on finding the fundamental truth, just doing whatever they can to write more papers…well, that’s not an uncommon accusation in physics either. If the people who spend their lives describing cognitive biases are really getting distracted like that, what chance does, say, string theory have?

I don’t know how seriously to take any of this. But it’s lurking there, in the back of my mind, that nasty, vicious, essential question: what are all of our models for?

Bonus quote, for the commenters to have fun with:

> I have yet to meet a successful scientist who lacks the ability to exaggerate the importance of what he or she is doing, and I believe that someone who lacks a delusional sense of significance will wilt in the face of repeated experiences of multiple small failures and rare successes, the fate of most researchers.
>
> Thinking, Fast and Slow, page 264

# In Life and in Science, Test

Think of a therapist, and you might picture a pipe-smoking Freudian, interrogating you about repressed feelings. These days, you’re more likely to meet a more modern form of therapy, like cognitive behavioral therapy (or CBT for short). CBT focuses on correcting distorted thoughts and maladaptive behaviors: basically, helping you reason through your problems. It’s supposed to be one of the types of therapy that has the most actual scientific evidence behind it.

What impresses me about CBT isn’t just the scientific evidence for it, but the way it tries to teach something like a scientific worldview. If you’re depressed or anxious, a common problem is obsessive thoughts about what others think of you. Maybe you worry that everyone is just putting up with you out of pity, or that you’re hopelessly behind your peers. For many scientists, these are familiar worries.

The standard CBT advice for these worries is as obvious as it is scary: if you worry what others think of you, ask!

This is, at its heart, a very scientific thing to do. If you’re curious about something, and you can test it, just test it! Of course, there are risks to doing this, both in your personal life and in your science, but typical CBT advice applies surprisingly well to both.

If you constantly ask your friends what they think about you, you end up annoying them. Similarly, if you perform the same experiment over and over, you can keep going until you get the result you want. In both cases, the solution is to commit to trusting your initial results: just like scientists pre-registering a study, if you ask your friends what they think, you need to trust them and not second-guess what they say. If they say they’re happy with you, trust that. If they criticize, take their criticism seriously and see if you can improve.

Even then, you may be tempted to come up with reasons why you can’t trust what your friends say. You’ll come up with reasons why they might be forced to be polite, while they secretly still hate you. Similarly, as a scientist you can always come up with theories that get around the evidence: no matter what you observe, a complicated enough chain of logic can make it consistent with anything you want. In both cases, the solution is a dose of Occam’s Razor: don’t fixate on an extremely complicated explanation when a simpler one already fits. If your friends say they like you, they probably do.