# Shape the Science to the Statistics, Not the Statistics to the Science

In theatre, and more generally in writing, the advice is always to “show, don’t tell”. You could just tell your audience that Long John Silver is a ruthless pirate, but it works a lot better to show him marching a prisoner off the plank. Rather than just informing with words, you want to make things as concrete as possible, with actions.

There is a similar rule in pedagogy. Pedagogy courses teach you to be explicit about your goals, planning a course by writing down Intended Learning Outcomes. (They never seem amused when I ask about the Unintended Learning Outcomes.) At first, you’d want to write down outcomes like “students will understand calculus” or “students will know what a sine is”. These, however, are hard to judge, and thus hard to plan around. Instead, the advice is to write outcomes that correspond to actions you want the students to take, things you want them to be capable of doing: “students can perform integration by parts”, “students can decide correctly whether to use a sine or cosine”. Again and again, the best way to get the students to know something is to get them to do something.

Jay Daigle recently finished a series of blog posts on how scientists use statistics to test hypotheses. I recommend it: it’s a great introduction to the concepts scientists use to reason about data, as well as a discussion of how they often misuse those concepts and what they can do better. I have a bit of a different perspective on one of the “takeaways” of the series, though, and I wanted to highlight that here.

The center of Daigle’s point is a tool, widely used in science, called Neyman-Pearson hypothesis testing. Neyman-Pearson is a tool for making decisions, built around a threshold for significance applied to a number scientists call a p-value. If you follow the procedure, only acting when you find a p-value below 0.05, then you control your rate of false positives: in the cases where the action really doesn’t work, you will wrongly conclude that it does only 5% of the time.

A core problem, from Daigle’s perspective, is that scientists use Neyman-Pearson for the wrong purpose. Neyman-Pearson is a tool for making decisions, not a test that tells you whether or not a specific claim is true. It tells you “on average, if I approve drugs when their p-value is below 0.05, only 5% of them will fail”. That’s great if you can estimate how bad it is to deny a drug that should be approved, how bad it is to approve a drug that should be denied, and calculate out on average how often you can afford to be wrong. It doesn’t tell you anything about the specific drug, though. It doesn’t tell you “every drug with a p-value below 0.05 works”. It certainly doesn’t tell you “a drug with a p-value of 0.051 almost works” or “a drug with a p-value of 0.001 definitely works”. It just doesn’t give you that information.
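To make that guarantee concrete, here’s a minimal sketch of the procedure in Python (all numbers are illustrative, not from Daigle’s posts): we test thousands of “drugs” that in fact do nothing, compute a one-sided z-test p-value for each, and “approve” only when the p-value is below 0.05. The promise is exactly the advertised one: about 5% of these useless drugs get approved, and nothing more.

```python
import math
import random

def p_value(sample):
    """One-sided z-test p-value against a null of mean 0, known unit variance."""
    n = len(sample)
    z = (sum(sample) / n) * math.sqrt(n)
    return 0.5 * math.erfc(z / math.sqrt(2))

random.seed(0)
trials = 10_000
approved = 0
for _ in range(trials):
    # A "drug" with no real effect: the measurements are pure noise.
    sample = [random.gauss(0, 1) for _ in range(50)]
    if p_value(sample) < 0.05:
        approved += 1

print(approved / trials)  # hovers near 0.05, the promised false-positive rate
```

Note what the simulation doesn’t tell you: which of the approved drugs work. That’s the gap between controlling an error rate and evaluating a specific claim.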

In later posts, Daigle suggests better tools, ones he argues map more closely to what scientists actually want to do, as well as general ways scientists can improve. Section 4 in particular focuses on the idea that one thing scientists need to do is ask better questions. He uses a specific example from cognitive psychology: a study that tests whether describing someone’s face makes you worse at recognizing it later. That’s a clear scientific question, one that can be tested statistically. That doesn’t mean it’s a good question, though. Daigle points out that questions like this have a problem: it isn’t clear what the result actually tells us.

Here’s another example of the same problem. In grad school, I knew a lot of social psychologists. One was researching a phenomenon called extended contact. Extended contact is meant to be a foil to another phenomenon called direct contact, both having to do with our views of other groups. In direct contact, making a friend from another group makes you view that whole group better. In extended contact, making a friend who has a friend from another group makes you view the other group better.

The social psychologist was looking into a concrete-sounding question: which of these phenomena, direct or extended contact, is stronger?

At first, that seems like it has the same problem as Daigle’s example. Suppose one of these effects is larger: what does that mean? Why do we care?

Well, one answer is that these aren’t just phenomena: they’re interventions. If you know one phenomenon is stronger than another, you can use that to persuade people to be more accepting of other groups. The psychologist’s advisor even had a procedure to make people feel like they made a new friend. Armed with that, it’s definitely useful to know whether extended contact or direct contact is better: whichever one is stronger is the one you want to use!

You do need some “theory” behind this, of course. You need to believe that, if a phenomenon is stronger in your psychology lab, it will be stronger wherever you try to apply it in the real world. It probably won’t be stronger every single time, so you need some notion of how much stronger it needs to be. That in turn means you need to estimate costs: what it costs if you pick the weaker one instead, how much money you’re wasting or harm you’re doing.
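Sketching that reasoning with made-up numbers (none of these come from the actual research; they’re placeholders): once you can estimate the chance each intervention holds up in the field, the benefit if it does, and the cost of running it, the choice reduces to comparing expected values.

```python
# Hypothetical inputs: the probability the lab effect holds up in the field,
# the benefit if it does, and the cost of running the intervention.
def expected_value(p_holds_up, benefit, cost):
    return p_holds_up * benefit - cost

direct = expected_value(p_holds_up=0.6, benefit=100.0, cost=20.0)
extended = expected_value(p_holds_up=0.5, benefit=100.0, cost=5.0)

# The "stronger" effect isn't automatically the better intervention:
print("direct" if direct > extended else "extended")  # prints "extended"
```

The point of the toy: the comparison only makes sense once the costs are on the table, which is exactly the information a bare “which effect is stronger?” study doesn’t supply.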

You’ll notice this is sounding a lot like the requirements I described earlier, for Neyman-Pearson. That’s no accident: as you try to make your science more and more clearly defined, it will get closer and closer to a procedure for making a decision, and that’s exactly what Neyman-Pearson is good for.

So in the end I’m quite a bit more supportive of Neyman-Pearson than Daigle is. That doesn’t mean it isn’t being used wrong: most scientists are using it wrong. Instead of calculating a p-value each time they make a decision, they do it at the end of a paper, misinterpreting it as evidence that one thing or another is “true”. But I think that what these scientists need to do is not change their statistics, but change their science. If they focused their science on making concrete decisions, they would actually be justified in using Neyman-Pearson…and their science would get a lot better in the process.

# In Defense of Shitty Code

Scientific programming was in the news lately, when doubts were raised about a coronavirus simulation by researchers at Imperial College London. While the doubts appear to have been put to rest, doing so involved digging through some seriously messy code. The whole situation seems to have gotten a lot of people worried. If these people are that bad at coding, why should we trust their science?

I don’t know much about coronavirus simulations, my knowledge there begins and ends with a talk I saw last month. But I know a thing or two about bad scientific code, because I write it. My code is atrocious. And I’ve seen published code that’s worse.

Why do scientists write bad code?

In part, it’s a matter of training. Some scientists have formal coding training, but most don’t. I took two CS courses in college and that was it. Despite that lack of training, we’re expected and encouraged to code. Before I took those courses, I spent a summer working in a particle physics lab, where I was expected to pick up the C++-based interface pretty much on the fly. I don’t think there’s another community out there that has as much reason to code as scientists do, and as little training for it.

Would it be useful for scientists to have more of the tools of a trained coder? Sometimes, yeah. Version control is a big one: I’ve collaborated on papers that used Git and papers that didn’t, and there’s a big difference. There are coding habits that would speed up our work and lead to fewer dead ends, and they’re worth picking up when we have the time.

But there’s a reason we don’t prioritize “proper coding”. It’s because the things we’re trying to do, from a coding perspective, are really easy.

What, code-wise, is a coronavirus simulation? A vector of “people”, really just simple labels, all randomly infecting each other and recovering, with a few parameters describing how likely they are to do so and how long it takes. What do I do, code-wise? Mostly, giant piles of linear algebra.
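That description is nearly a complete program already. Here’s a toy version in Python, my guess at the genre rather than anything resembling the Imperial College code, with every parameter made up: a list of labels, two probabilities, and a loop.

```python
import random

random.seed(1)
# A "population": simple labels, nothing fancier.
people = ["I"] * 10 + ["S"] * 990  # infected / susceptible ("R" = recovered)
infect_rate = 0.0003  # chance per day that a given infected person infects a given susceptible
recover_rate = 0.05   # chance per day that an infected person recovers

for day in range(100):
    infected = people.count("I")
    people = [
        "I" if p == "S" and random.random() < infect_rate * infected
        else "R" if p == "I" and random.random() < recover_rate
        else p
        for p in people
    ]

print(people.count("S"), people.count("I"), people.count("R"))
```

And if you screw it up, say by letting the recovered get reinfected forever, the epidemic curve comes out visibly, qualitatively wrong.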

These are not some sort of cutting-edge programming tasks. These are things people have been able to do since the dawn of computers. These are things that, when you screw them up, become quite obvious quite quickly.

Compared to that, the everyday tasks of software developers, like making a reliable interface for users, or efficient graphics, are much more difficult. They’re tasks that really require good coding practices, that just can’t function without them.

For us, the important part is not the coding itself, but what we’re doing with it. Whatever bugs are in a coronavirus simulation, they will have much less impact than, for example, the way in which the simulation includes superspreaders. Bugs in my code give me obviously wrong answers; bad scientific assumptions are much harder to root out.

There’s an exception that proves the rule here, and it’s that, when the coding task is actually difficult, scientists step up and write better code. Scientists who want to run efficiently on supercomputers, who are afraid of numerical error or need to simulate on many scales at once, these people learn how to code properly. The code behind the LHC still might be jury-rigged by industry standards, but it’s light-years better than typical scientific code.

I get the furor around the Imperial group’s code. I get that, when a government makes a critical decision, you hope that their every input is as professional as possible. But without getting too political for this blog, let me just say that whatever your politics are, if any of it is based on science, it comes from code like this. Psychology studies, economic modeling, polling…they’re using code, and it’s jury-rigged to hell. Scientists just have more important things to worry about.

# A Scale of “Sure-Thing-Ness” for Experiments

No experiment is a sure thing. No matter what you do, what you test, what you observe, there’s no guarantee that you find something new. Even if you do your experiment correctly and measure what you planned to measure, nature might not tell you anything interesting.

Still, some experiments are more sure than others. Sometimes you’re almost guaranteed to learn something, even if it wasn’t what you hoped, while other times you just end up back where you started.

The first, and surest, type of experiment, is a voyage into the unknown. When nothing is known about your target, no expectations, and no predictions, then as long as you successfully measure anything you’ll have discovered something new. This can happen if the thing you’re measuring was only recently discovered. If you’re the first person who manages to measure the reaction rates of an element, or the habits of an insect, or the atmosphere of a planet, then you’re guaranteed to find out something you didn’t know before.

If you don’t have a total unknown to measure, then you want to test a clear hypothesis. The best of these are the theory killers, experiments which can decisively falsify an idea. History’s most famous experiments take this form, like the measurement of the precession of Mercury’s perihelion to test General Relativity or Pasteur’s tests of spontaneous generation. When you have a specific prediction and not much wiggle room, an experiment can teach you quite a lot.

“Not much wiggle room” is key, because these tests can all too easily become theory modifiers instead. If you can tweak your theory enough, then your experiment might not be able to falsify it. Something similar applies when you have a number of closely related theories. Even if you falsify one, you can just switch to another similar idea. In those cases, testing your theory won’t always teach you as much: you have to get lucky and see something that pins your theory down more precisely.

Finally, you can of course be just looking. Some experiments are just keeping an eye out, in the depths of space or the precision of quantum labs, watching for something unexpected. That kind of experiment might never see anything, and never rule anything out, but it can still sometimes be worthwhile.

There’s some fuzziness to these categories, of course. Often when scientists argue about whether an experiment is worth doing they’re arguing about which category to place it in. Would a new collider be a “voyage into the unknown” (new energy scales we’ve never measured before), a theory killer/modifier (supersymmetry! but which one…) or just “just looking”? Is your theory of cosmology specific enough to be “killed”, or merely “modified”? Is your wacky modification of quantum mechanics something that can be tested, or merely “just looked” for?

For any given experiment, it’s worth keeping in mind what you expect, and what would happen if you’re wrong. In science, we can’t do every experiment we want. We have to focus our resources and try to get results. Even if it’s never a sure thing.

# When You Shouldn’t Listen to a Distinguished but Elderly Scientist

Of science fiction author Arthur C. Clarke’s sayings, the most famous is “Clarke’s third law”, that “Any sufficiently advanced technology is indistinguishable from magic.” Almost as famous, though, is his first law:

“When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.”

Recently Michael Atiyah, an extremely distinguished but also rather elderly mathematician, claimed that something was possible: specifically, he claimed it was possible that he had proved the Riemann hypothesis, one of the longest-standing and most difficult puzzles in mathematics. I won’t go into the details here, but people are, well, skeptical.

This post isn’t really about Atiyah. I’m not close enough to that situation to comment. Instead, it’s about a more general problem.

See, the public seems to mostly agree with Clarke’s law. They trust distinguished, elderly scientists, at least when they’re saying something optimistic. Other scientists know better. We know that scientists are human, that humans age…and that sometimes scientific minds don’t age gracefully.

Some of the time, that means Alzheimer’s, or another form of dementia. Other times, it’s nothing so extreme, just a mind slowing down with age, opinions calcifying and logic getting just a bit more fuzzy.

And the thing is, watching from the sidelines, you aren’t going to know the details. Other scientists in the field will, but this kind of thing is almost never discussed with the wider public. Even here, though specific physicists come to mind as I write this, I’m not going to name them. It feels rude, to point out that kind of all-too-human weakness in someone who accomplished so much. But I think it’s important for the public to keep in mind that these people exist. When an elderly Nobelist claims to have solved a problem that baffles mainstream science, the news won’t tell you they’re mentally ill. All you can do is keep your eyes open, and watch for warning signs:

Be wary of scientists who isolate themselves. Scientists who still actively collaborate and mentor almost never have this kind of problem. There’s a nasty feedback loop when those contacts start to diminish. Being regularly challenged is crucial to test scientific ideas, but it’s also important for mental health, especially in the elderly. As a scientist thinks less clearly, they won’t be able to keep up with their collaborators as much, worsening the situation.

Similarly, beware those famous enough to surround themselves with yes-men. With Nobel prizewinners in particular, many of the worst cases involve someone treated with so much reverence that they forget to question their own ideas. This is especially risky when commenting on an unfamiliar field: often, the Nobelist’s contacts in the new field have a vested interest in holding on to their big-name support, and ignoring signs of mental illness.

Finally, as always, bigger claims require better evidence. If everything someone works on is supposed to revolutionize science as we know it, then likely none of it will. The signs that indicate crackpots apply here as well: heavily invoking historical scientists, emphasis on notation over content, a lack of engagement with the existing literature. Be especially wary if the argument seems easy: deep problems are rarely so simple to solve.

Keep this in mind, and the next time a distinguished but elderly scientist states that something is possible, don’t trust them blindly. Ultimately, we’re still human beings. We don’t last forever.

By A. Physicist

…because it disagrees with precision electroweak measurements

…………………………………..with bounds from ATLAS and CMS

…………………………………..with the power spectrum of the CMB

…………………………………..with Eötvös experiments

…because it isn’t gauge invariant

………………………….Lorentz invariant

………………………….diffeomorphism invariant

………………………….background-independent, whatever that means

…because it violates unitarity

…………………………………locality

…………………………………causality

…………………………………observer-independence

…………………………………technical naturalness

…………………………………international treaties

…………………………………cosmic censorship

…because you screwed up the calculation

…because you didn’t actually do the calculation

…because I don’t understand the calculation

…because you predict too many magnetic monopoles

……………………………………too many proton decays

……………………………………too many primordial black holes

…………………………………..remnants, at all

…because it’s fine-tuned

…because it’s suspiciously finely-tuned

…because it’s finely tuned to be always outside of experimental bounds

…because you’re misunderstanding quantum mechanics

…………………………………………………………..black holes

………………………………………………………….effective field theory

…………………………………………………………..thermodynamics

…………………………………………………………..the scientific method

…because Condensed Matter would contribute more to Chinese GDP

…because the approximation you’re making is unjustified

…………………………………………………………………………is not valid

…………………………………………………………………………is wildly overoptimistic

………………………………………………………………………….is just kind of lazy

…because there isn’t a plausible UV completion

…because you care too much about the UV

…because it only works in polynomial time

…………………………………………..exponential time

…………………………………………..factorial time

…because even if it’s fast it requires more memory than any computer on Earth

…because it requires more bits of memory than atoms in the visible universe

…because it has no meaningful advantages over current methods

…because it has meaningful advantages over my own methods

…because it can’t just be that easy

…because it’s not the kind of idea that usually works

…because it’s not the kind of idea that usually works in my field

…because it isn’t canonical

…because it’s ugly

…because it’s baroque

…because it ain’t baroque, and thus shouldn’t be fixed

…because only a few people work on it

…because far too many people work on it

…because clearly it will only work for the first case

……………………………………………………………….the first two cases

……………………………………………………………….the first seven cases

……………………………………………………………….the cases you’ve published and no more

…because I know you’re wrong

…because I strongly suspect you’re wrong

…because I strongly suspect you’re wrong, but saying I know you’re wrong looks better on a grant application

…….in a blog post

…because I’m just really pessimistic about something like that ever actually working

…because I’d rather work on my own thing, that I’m much more optimistic about

…because if I’m clear about my reasons

……and what I know

…….and what I don’t

……….then I’ll convince you you’re wrong.

……….or maybe you’ll convince me?

# Where Grants Go on the Ground

I’ve seen several recent debates about grant funding, arguments about whether this or that scientist’s work is “useless” and shouldn’t get funded. Wading into the specifics is a bit more political than I want to get on this blog right now, and if you’re looking for a general defense of basic science there are plenty to choose from. I’d like to focus on a different part, one where I think the sort of people who want to de-fund “useless” research are wildly overoptimistic.

People who call out “useless” research act as if government science funding works in a simple, straightforward way: scientists say what they want to work on, the government chooses which projects it thinks are worth funding, and the scientists the government chooses get paid.

This may be a (rough) picture of how grants are assigned. For big experiments and grants with very specific purposes, it’s reasonably accurate. But for the bulk of grants distributed among individual scientists, it ignores what happens to the money on the ground, after the scientists get it.

The simple fact of the matter is that what a grant is “for” doesn’t have all that much influence on what it gets spent on. In most cases, scientists work on what they want to, and find ways to pay for it.

Sometimes, this means getting grants for applied work, doing some of that, but also fitting in more abstract theoretical projects during downtime. Sometimes this means sharing grant money, if someone has a promising grad student they can’t fund at the moment and needs the extra help. (When I first got research funding as a grad student, I had to talk to the particle physics group’s secretary, and I’m still not 100% sure why.) Sometimes this means being funded to look into something specific and finding a promising spinoff that takes you in an entirely different direction. Sometimes you can get quite far by telling a good story, like a mathematician I know who gets defense funding to study big abstract mathematical systems because some related systems happen to have practical uses.

Is this unethical? Some of it, maybe. But from what I’ve seen of grant applications, it’s understandable.

The problem is that, if scientists are too loose about what they spend grant money on, grant agencies tend to be far too specific about what they ask for. I’ve heard of grants that ask you to give a timeline, over the next five years, of each discovery you’re planning to make. That sort of thing just isn’t possible in science: we can lay out a rough direction to go, but we don’t know what we’ll find.

The end result is a bit like complaints about job interviews, where everyone is expected to say they love the company even though no-one actually does. It creates an environment where everyone has to twist the truth just to keep up with everyone else.

The other thing to keep in mind is that there really isn’t any practical way to enforce any of this. Sure, you can require receipts for equipment and the like, but once you’re paying for scientists’ time you don’t have a good way to monitor how they spend it. The best you can do is have experts around to evaluate the scientists’ output…but if those experts understand enough to do that, they’re going to be part of the scientific community, like grant committees usually already are. They’ll have the same expectations as the scientists, and give similar leeway.

So if you want to kill off some “useless” area of research, you can’t do it by picking and choosing who gets grants for what. There are advocates of more drastic actions of course, trying to kill whole agencies or fields, and that’s beyond the scope of this post. But if you want science funding to keep working the way it does, and just have strong opinions about what scientists should do with it, then calling out “useless” research doesn’t do very much: if the scientists in question think it’s useful, they’ll find a way to keep working on it. You’ve slowed them down, but you’ll still end up paying for research you don’t like.

Final note: The rule against political discussion in the comments is still in effect. For this post, that means no specific accusations of one field or another as being useless, or one politician/political party/ideology or another of being the problem here. Abstract discussions and discussions of how the grant system works should be fine.

# Digging up Variations

The best parts of physics research are when I get a chance to push out into the unknown, doing calculations no-one has done before. Sometimes, though, research is more…archeological.

Pictured: not what I signed up for

Recently, I’ve been digging through a tangle of papers, each of which calculates roughly the same thing in a slightly different way. Like any good archeologist, I need to figure out not just what the authors of these papers were doing, but also why.

(As a physicist, why do I care about “why”? In this case, it’s because I want to know which of the authors’ choices are worth building on. If I can figure out why they made the choices they did, I can decide whether I share their motivations, and thus which aspects of their calculations are useful for mine.)

My first guess at “why” was a deeply cynical one. Why would someone publish slight variations on an old calculation? To get more publications!

This is a real problem in science. In certain countries in particular, promotions and tenure are based not on honestly assessing someone’s work but on quick and dirty calculations based on how many papers they’ve published. This motivates scientists to do the smallest amount possible in order to get a paper out.

That wasn’t what was happening in these papers, though. None of the authors lived in those kinds of countries, and most were pretty well established people: not the sort who worry about keeping up with publications.

So I put aside my cynical first-guess, and actually looked at the papers. Doing that, I found a more optimistic explanation.

These authors were in the process of building research programs. Each had their own long-term goal, a set of concepts and methods they were building towards. And each stopped along the way, to do another variation on this well-trod calculation. They weren’t doing this just because they needed a paper, or just because they could. They were trying to sift out insights, to debug their nascent research program in a well-understood case.

Thinking about it this way helped untwist the tangle of papers. The confusion of different choices suddenly made sense, as the result of different programs with different goals. And in turn, understanding which goals contributed to which papers helped me sort out which goals I shared, and which ideas would turn out to be helpful.

Would it have been less confusing if some of these people had sat on their calculations, and not published? Maybe at first. But in the end, the variations help, giving me a clearer understanding of the whole.

# “Maybe” Isn’t News

It’s been published several places, but you’ve probably seen this headline:

If you’ve been following me for a while, you know where this is going:

No, these physicists haven’t actually shown that the Universe isn’t expanding at an accelerated rate.

What they did show is that the original type of data used to discover that the universe was accelerating back in the ’90s, measurements of supernovae, doesn’t live up to the rigorous standards that we physicists use to evaluate discoveries. We typically only call something a discovery if the evidence is good enough that, in a world where the discovery wasn’t actually true, we’d only have a one in 3.5 million chance of getting the same evidence (“five sigma” evidence). In their paper, Nielsen, Guffanti, and Sarkar argue that looking at a bigger collection of supernovae leads to a hazier picture: the chance that we could get the same evidence in a universe that isn’t accelerating is closer to one in a thousand, giving “three sigma” evidence.
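For the curious, those odds are just Gaussian tail probabilities in disguise, quoted in the one-sided convention physicists usually use. The conversion fits in a few lines of Python:

```python
import math

def tail_probability(sigma):
    """One-sided Gaussian tail: the chance of a fluctuation at least this many sigma."""
    return 0.5 * math.erfc(sigma / math.sqrt(2))

print(round(1 / tail_probability(5)))  # roughly 3.5 million to one: the "five sigma" bar
print(round(1 / tail_probability(3)))  # roughly 740 to one: loosely, "one in a thousand"
```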

This might sound like statistical quibbling: one in a thousand is still pretty unlikely, after all. But a one in a thousand chance still happens once in a thousand times, and there’s a long history of three sigma evidence turning out to just be random noise. If the discovery of the accelerating universe was new, this would be an important objection, a reason to hold back and wait for more data before announcing a discovery.

The trouble is, the discovery isn’t new. In the twenty years since it was discovered that the universe was accelerating, people have built that discovery into the standard model of cosmology. They’ve used that model to make other predictions, explaining a wide range of other observations. People have built on the discovery, and their success in doing so is its own kind of evidence.

So the objection, that one source of evidence isn’t as strong as people thought, doesn’t kill cosmic acceleration. What it is is a “maybe”, showing that there is at least room in some of the data for a non-accelerating universe.

People publish “maybes” all the time, nothing bad about that. There’s a real debate to be had about how strong the evidence is, and how much it really establishes. (And there are already voices on the other side of that debate.)

But a “maybe” isn’t news. It just isn’t.

Science journalists (and university press offices) have a habit of trying to turn “maybes” into stories. I’ve lost track of the times I’ve seen ideas that were proposed a long time ago (technicolor, MOND, SUSY) get new headlines not for new evidence or new ideas, but just because they haven’t been ruled out yet. “SUSY hasn’t been ruled out yet” is an opinion piece, perhaps a worthwhile one, but it’s no news article.

The thing is, I can understand why journalists do this. So much of science is building on these kinds of “maybes”, working towards the tipping point where a “maybe” becomes a “yes” (or a “no”). And journalists (and university press offices, and to some extent the scientists themselves) can’t just take time off and wait for something legitimately newsworthy. They’ve got pages to fill and careers to advance, they need to say something.

I post once a week. As a consequence, a meaningful fraction of my posts are garbage. I’m sure that if I posted every day, most of my posts would be garbage.

Many science news sites post multiple times a day. They’ve got multiple writers, sure, and wider coverage…but they still don’t have the luxury of skipping a “maybe” when someone hands it to them.

I don’t know if there’s a way out of this. Maybe we need a new model for science journalism, something that doesn’t try to ape the pace of the rest of the news cycle. For the moment, though, it’s publish or perish, and that means lots and lots of “maybes”.

EDIT: More arguments against the paper in question, pointing out that they made some fairly dodgy assumptions.

EDIT: The paper’s authors respond here.

# I Don’t Get Crackpots

[Note: not an April fool’s post. Now I’m wishing I wrote one though.]

After the MHV@30 conference, I spent a few days visiting my sister. I hadn’t seen her in a while, and she noticed something new about me.

“You’re not sure about anything. It’s always ‘I get the impression’ or ‘I believe so’ or ‘that seems good’.”

On reflection, she’s right.

It’s a habit I’ve picked up from spending time around scientists. When you’re surrounded by people who are likely to know more than you do about something, it’s usually good to qualify your statements. A little intellectual humility keeps simple corrections from growing into pointless arguments, and makes it easier to learn from your mistakes.

With that kind of mindset, though, I really really don’t get crackpots.

For example, why do they always wear funnels on their heads?

The thing about genuine crackpots (as opposed to just scientists with weird ideas) is that they tend to have almost none of the relevant background for a given field, but nevertheless have extremely strong opinions about it. That basic first step, of assuming that there are people who probably know a lot more about whatever you’re talking about? Typically, they don’t bother with that. The qualifiers, the “typically” and “as far as I know” just don’t show up. And I have a lot of trouble understanding how a person can work that way.

Is some of it the Dunning-Kruger effect? Sure. If you don’t know much about something, you don’t know the limits of your own knowledge, so you think you know more than you really do. But I don’t think it’s just that…there’s a baseline level of doubt, of humility in general, that just isn’t there for most crackpots.

I wonder if some fraction of crackpots are genuinely mentally ill, but if so I’m not sure what the illness would be. Mania is an OK fit some of the time, and the word salad and “everyone but me is crazy” attitude almost seem schizophrenic, but I doubt either is really what’s going on in most cases.

All of this adds up to me just being completely unable to relate to people who display a sufficient level of crackpottery.

The thing is, there are crackpots out there who I kind of wish I could talk to, because if I could, maybe I could help them. There are crackpots who seem genuinely willing to be corrected, to be told what they’re doing wrong. But that core of implicit arrogance, the central assumption that it’s possible to make breakthroughs in a field while knowing almost nothing about it, is still there, and it makes it impossible for me to deal with them.

I kind of wish there was a website I could link, dedicated to walking crackpots through their mistakes. There used to be something like that for supernatural crackpots, in the form of the James Randi Educational Foundation’s Million Dollar Prize, complete with forums where (basically) helpful people would patiently walk applicants through how to set up a test of their claims. There’s never been anything like that for science, as far as I’m aware, and it seems like it would take a lot more work. Still, it would be nice if there were people out there patient enough to do it.

# Science Never Forgets

I’ll just be doing a short post this week, I’ve been busy at a workshop on Flux Tubes here at Perimeter.

If you’ve ever heard someone tell the history of string theory, you’ve probably heard that it was first proposed not as a quantum theory of gravity, but as a way to describe the strong nuclear force. Colliders of the time had discovered particles, called mesons, that seemed to have a key role in the strong nuclear force that held protons and neutrons together. These mesons had an unusual property: the faster they spun, the higher their mass, following a very simple and regular pattern known as a Regge trajectory. Researchers found that they could predict this kind of behavior if, rather than particles, these mesons were short lengths of “string”, and with this discovery they invented string theory.
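That “simple and regular pattern” can be written as a one-line formula. The standard way to state a Regge trajectory (the symbols and the approximate slope below are textbook values, not something from this post) is that spin grows linearly with mass squared:

```latex
% Regge trajectory: the spin J of a meson rises linearly with its squared mass M^2
J = \alpha(0) + \alpha' M^2
% \alpha(0) is the trajectory's intercept; the slope \alpha' is roughly
% universal for the light mesons, \alpha' \approx 0.9\,\mathrm{GeV}^{-2}.
```

A spinning relativistic string reproduces exactly this linear relation between spin and mass squared, which is what made the string picture of mesons so tempting in the first place.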

As it turned out, these early researchers were wrong. Mesons are not lengths of string; rather, they are pairs of quarks. The discovery of quarks explained how the strong force acted on protons and neutrons, each made of three quarks, and it also explained why mesons acted a bit like strings: in each meson, the two quarks are linked by a flux tube, a roughly cylindrical area filled with the gluons that carry the strong nuclear force. So rather than strings, mesons turned out to be more like bolas.

Leonin sold separately.

If you’ve heard this story before, you probably think it’s ancient history. We know about quarks and gluons now, and string theory has moved on to bigger and better things. You might be surprised to hear that at this week’s workshop, several presenters have been talking about modeling flux tubes between quarks in terms of string theory!

The thing is, science never forgets a good idea. String theory was superseded by quarks in describing the strong force, but it was only proposed in the first place because it matched the data fairly well. Now, with string theory-inspired techniques, people are calculating the first corrections to the string-like behavior of these flux tubes, comparing them with simulations of quarks and gluons, and finding surprisingly good agreement!
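To give a sense of what “first corrections to the string-like behavior” means here: as I understand it, the leading correction is the universal Lüscher-type term, the first deviation from a purely linear confining potential predicted by an effective string (the formula below is the standard effective-string result, not taken from the workshop talks):

```latex
% Energy of a long flux tube of length r, in d spacetime dimensions:
% a linear confining term (string tension \sigma) plus the universal
% Lüscher correction from the string's transverse fluctuations.
V(r) \approx \sigma r - \frac{\pi (d-2)}{24\, r} + \cdots
```

It’s corrections of this kind that can be compared, term by term, against lattice simulations of quarks and gluons.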

Science isn’t a linear story, where the past falls away to the shiny new theories of the future. It’s a marketplace. Some ideas are traded more widely, some less…but if a product works, even only sometimes, chances are someone out there will have a reason to buy it.