Tag Archives: DoingScience

Numerics, or, Why can’t you just tell the computer to do it?

When most people think of math, they think of the math they did in school: repeated arithmetic until your brain goes numb, followed by basic algebra and trig. You weren’t allowed to use calculators on most tests for the simple reason that almost everything you did could be done by a calculator in a fraction of the time.

Real math isn’t like that. Mathematicians handle proofs and abstract concepts, definitions and constructions and functions and generally not a single actual number in sight. That much, at least, shouldn’t be surprising.

What might be surprising is that even tasks that seem tailor-made for a computer can take a fair bit of human ingenuity.

In physics, I do a lot of integrals. For those of you unfamiliar with calculus, integrals can be thought of as the area between a curve and the x-axis.

Areas seem like the sort of thing it would be easy for a computer to find. Chop the space into little rectangles, add up all the rectangles under the curve, and if your rectangles are small enough you should get the right answer. Broadly, this is the method of numerical integration. Since computers can do billions of calculations per second, you can chop things up into billions of rectangles and get as close as you’d like, right?

[Image: “Heck, ten is a lot. Can we just do ten?”]
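In code, the rectangle recipe is only a few lines. Here is a minimal sketch in Python (the example curve and the rectangle counts are arbitrary choices of mine, just for illustration):

    # Approximate the area under f on [a, b] with n equal-width rectangles,
    # each evaluated at its midpoint.
    def midpoint_sum(f, a, b, n):
        width = (b - a) / n
        return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

    # For a tame curve like f(x) = x**2 on [0, 1] (true area: 1/3),
    # even ten rectangles do respectably, and a million is overkill:
    print(midpoint_sum(lambda x: x**2, 0.0, 1.0, 10))         # ~0.3325
    print(midpoint_sum(lambda x: x**2, 0.0, 1.0, 1_000_000))  # ~0.3333333333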

For some curves, this works fine. For others, though…

[Image: “Ten might not be enough for this one.”]

See how the left side of that plot goes off the chart? That curve goes to infinity. No matter how many rectangles you put on that side, you still won’t have any that are infinitely tall, so you’ll still miss that part of the curve.

Surprisingly enough, the area under this curve isn’t infinite. Do the integral correctly, and you get a result of 2. Set a computer to calculate this integral via the sort of naïve numerical integration discussed above though, and you’ll never find that answer. You need smarter methods: smart humans doing the math, or smart humans programming the computer.
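To see the problem concretely: a curve with exactly that behavior is f(x) = 1/sqrt(x) on (0, 1], whose area is exactly 2. Here is a minimal sketch of the failure (the code and the substitution at the end are my illustration, not any particular program’s method). Put a rectangle’s corner at x = 0 and you divide by zero; dodge the endpoint with midpoints and the error shrinks only like 1/sqrt(n), far too slowly to be practical:

    import math

    # A curve that blows up at x = 0 but still has finite area:
    # the integral of 1/sqrt(x) from 0 to 1 is exactly 2.
    f = lambda x: 1.0 / math.sqrt(x)

    def midpoint_sum(f, a, b, n):
        width = (b - a) / n
        return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

    # Naive rectangles creep toward 2; six digits of accuracy would
    # take hundreds of billions of rectangles.
    for n in (10, 1000, 1_000_000):
        print(n, midpoint_sum(f, 0.0, 1.0, n))   # 1.80..., 1.98..., 1.999...

    # The smart-human route: substitute x = u**2, so dx = 2*u*du and the
    # integrand 1/sqrt(x) becomes the constant 2, with no spike at all.
    print(midpoint_sum(lambda u: 2.0, 0.0, 1.0, 10))   # exactly 2.0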

Another way this can come up is if you’re adding up two parts of something that go to infinity in opposite directions. Try to integrate each part by itself and you’ll be stuck.

[Images: plots of the two individual pieces]

But add them together, and you get something quite a bit more tractable.

[Image: “Yeah, definitely a ten-rectangle job.”]
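Here is the same phenomenon with a concrete integrand (my own hypothetical example, chosen for simplicity): e^x/x and -1/x each diverge at x = 0, but their sum (e^x - 1)/x tends quietly to 1 there.

    import math

    # Each piece alone blows up at x = 0, in opposite directions:
    g_plus  = lambda x: math.exp(x) / x    # -> +infinity as x -> 0
    g_minus = lambda x: -1.0 / x           # -> -infinity as x -> 0

    # Their sum is perfectly tame: (e**x - 1)/x -> 1 as x -> 0.
    g_sum = lambda x: (math.exp(x) - 1.0) / x

    def midpoint_sum(f, a, b, n):
        width = (b - a) / n
        return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

    # Integrating the combined function on [0, 1] really is a ten-rectangle job:
    print(midpoint_sum(g_sum, 0.0, 1.0, 10))   # ~1.3178 (true value ~1.3179)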

Numerical integration, and computers in general, are very important tools in a scientist’s arsenal. But in order to use them, we have to be smart and know what we’re doing. Using our tools correctly can take almost as much expertise and care as working without them.

So no, I can’t just tell the computer to do it.

“Super” Computers: Using a Cluster

When I join a new department or institute, the first thing I ask is “do we have a cluster?”

Most of what I do, I do on a computer. Gone are the days when theorists would always do all their work on notepads and chalkboards (though many still do!). Instead, we use specialized computer programs like Mathematica and Maple. Using a program helps keep us from forgetting pesky minus signs, and it allows working with equations far too long to fit on a sheet of paper.
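For a taste of what these programs do, here is a toy example in the open-source sympy package (a stand-in for illustration; I actually use Mathematica):

    import sympy

    x, y = sympy.symbols("x y")

    # The computer tracks every term and every pesky minus sign:
    expr = sympy.expand((x - y)**6)
    print(expr)   # x**6 - 6*x**5*y + 15*x**4*y**2 - ... + y**6

    # And it can confirm that nothing was dropped along the way:
    print(sympy.simplify(expr - (x - y)**6))   # 0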

Now, if computers help, more computers should help more. Since physicists like to add “super” to things, what about a supercomputer?

[Image: “The Jaguars of the computing world.”]

Supercomputers are great, but they’re also expensive. The people who use supercomputers are the ones who model large, complicated systems, like the weather, or supernovae. For most theorists, you still want power, but you don’t need quite that much. That’s where computer clusters come in.

A computer cluster is pretty much what it sounds like: several computers wired together. Different clusters contain different numbers of computers. For example, my department has a ten-node cluster. Sure, that doesn’t stack up to a supercomputer, but it’s still ten times as fast as an ordinary computer, right?

[Image: “The power of ten computers!”]

Well, not exactly. As several of my friends have been surprised to learn, the computers on our cluster are actually slower than most of our laptops.

[Image: “The power of ten old computers!”]

Still, ten older computers are faster than one new one, yes?

Even then, it depends how you use it.

Run a normal task on a cluster, and it’s just going to run on one of the computers, which, as I’ve said, are slower than a modern laptop. You need to get smarter.

There are two big advantages of clusters: time, and parallelization.

Sometimes, you want to do a calculation that will take a long time. Your computer is going to be busy for a day or two, and that’s inconvenient when you want to do…well, pretty much anything else. A cluster is a space to run those long calculations. You put the calculation on one of the nodes, you go back to doing your work, and you check back in a day or two to see if it’s finished.

Clusters are at their most powerful when you can parallelize. If you need to do ten versions of the same calculation, each slightly different, then rather than doing them one at a time a cluster lets you do them all at once. At that point, it really is making you ten times faster.
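Here is a minimal sketch of that kind of parallelization in Python (the “calculation” and its ten parameter values are hypothetical stand-ins):

    from multiprocessing import Pool

    # Hypothetical stand-in for a slow calculation, one "version" per parameter.
    def long_calculation(coupling):
        total = 0.0
        for k in range(1, 10_000_000):
            total += coupling / k**2
        return total

    if __name__ == "__main__":
        couplings = [0.1 * i for i in range(1, 11)]   # ten slightly different inputs
        # Farm all ten versions out at once, one per node (or core):
        with Pool(processes=10) as pool:
            results = pool.map(long_calculation, couplings)
        print(results)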

If you ever program, I’d encourage you to look into the resources you have available. A cluster is a very handy thing to have access to, no matter what you’re doing!

Why we Physics

There are a lot of good reasons to study theories in theoretical physics, even the ones that aren’t true. They teach us how to do calculations in other theories, including those that do describe reality, which lets us find out fundamental facts about nature. They let us hone our techniques, developing novel methods that often find use later, sometimes even as spinoff technology. (Mathematica came out of the theoretical physics community, while experimental high energy physics led to the birth of the modern internet.)

Of course, none of this is why physicists actually do physics. Sure, Nima Arkani-Hamed might need to tell himself that space-time is doomed in order to get up in the morning, but for a lot of us, it isn’t about proving any wide-ranging point about the universe. It’s not even all about the awesome, as some would have it: most of what we do on a day-to-day basis isn’t especially awesome. It goes a bit deeper than that.

Science, in the end, is about solving puzzles. And solving puzzles is immensely satisfying, on a deep, fundamental level.

There’s a unique feeling that you get when all the pieces come together, when you’re calculating something and everything cancels and you’re left with a simple answer, and for some people that’s the best thing in existence.

It’s especially true when you’re working with an ansatz or using some other method where you fix parameters and fill in uncertainties, one by one. You can see how close you are to the answer, which means each step gives you that little thrill of getting just that much closer. One of my colleagues describes the calculations he does in supergravity as not tedious but “delightful” for precisely this reason: a calculation where every step puts another piece in the right place just feels good.

Theoretical physicists are the kind of people who would get a Lego set for their birthday, build it up to completion, and then never play with it again (unless it was to take it apart and make something else). We do it for the pure joy of seeing something come together and become complete. Save what it’s “for” for the grant committees, we’ve got a different rush in mind.

The Royal We of Theoretical Physics

I’m about to show you an abstract from a theoretical physics paper. Don’t worry about what it says, just observe the grammar.

[Image: the paper’s abstract]

Notice anything? Here, I’ll zoom in:

[Image: close-up of the abstract’s “we”]

This paper has one author, Edward Witten. So who’s “we”?

As it turns out, it is actually quite common in theoretical physics for a paper to use the word “we”, even when it is written by a single author. While this tradition has been called stilted, pompous, and just plain bad writing, there is a legitimate reason behind it. “We” is convenient, because it represents several different important things.

While the paper I quoted was written by only one author, many papers are collaborative efforts. In a collaboration, it is often hard to pin down who did what, and different collaborations divide their work in different ways. Using “we” smooths over those differences.

What about single-authored papers, though? For a single author, and often even for multiple authors, “we” means the author plus the reader.

In principle, anyone reading a paper in theoretical physics should be able to follow along, doing the calculations on their own, and replicate the paper’s results. In practice this can often be difficult to impossible, but it’s still true that if you want to really retain what you read in theoretical physics, you need to follow along and do some of the calculation yourself. As a nod to this, it is conventional to write theoretical physics papers as if the reader were directly participating, leading them through the results point by point like exercises in a textbook. “We” do one calculation, then “we” use the result to derive the next point, and so on.

There are other meanings that “we” can occasionally serve, such as referring to everyone in a particular field, or a group in a hypothetical example.

While each of these meanings of “we” could potentially use a different word, that tends to make a paper feel cluttered, with jarring transitions between different subjects. Using “we” for everything gives the paper a consistent voice and feel, though it does come at the cost of obscuring some of the specific details of who did what. Especially for collaborations, the “we the collaborators” and “we the author plus reader” meanings can overlap and blur together. This usually isn’t a problem, but as I’ve been finding out recently, it does make things tricky when writing for people who aren’t theoretical physicists, such as universities with guidelines that require a thesis to clearly specify who in a collaboration did what.

On an unrelated note, two papers went up this week pushing the hexagon function story to new and impressive heights. I wasn’t directly involved in either; I’ve been attacking a somewhat different part of the problem, and you can look forward to something on that in a few months.

What’s in a Thesis?

As I’ve mentioned before, I’m graduating this spring, which means I need to write that most foreboding of documents, the thesis. As I work on it, I’ve been thinking about how the nature of the thesis varies from field to field.

If you don’t have much experience with academics, you probably think of a thesis as a single, overarching achievement that structures a grad student’s career. A student enters grad school, designs an experiment, performs it, collects data, analyzes the data, draws some conclusion, then writes a thesis about it and graduates.

In some fields, the thesis really does work that way. In biology for example, the process of planning an experiment, setting it up, and analyzing and writing up the data can be just the right size so that, a reasonable percentage of the time, it really can all be done over the course of a PhD.

Other fields tend more towards smaller, faster-paced projects. In theoretical physics, mathematics, and computer science, most projects don’t have the same sort of large experimental overhead that psychologists or biologists have to deal with. The projects I’ve worked on are large-scale for theoretical physics, and I’ll still likely have worked on three distinct things before I graduate. Others, with smaller projects, will often have covered more.

In this situation, a thesis isn’t one overarching idea. Rather, it’s a compilation of work from past projects, sewn together with a pretense of an overall theme. It’s a bit messy, but because it’s the way things are expected to be done in these fields, no one particularly minds.

The other end of the spectrum is potentially much harder to deal with. For those who work on especially big experiments, the payoff might take longer to arrive than any reasonable degree. Big machines like colliders and particle detectors can take well over a decade before they start producing data, while longitudinal studies that follow a population as they grow and age take a long time no matter how fast you work.

In cases like this, the challenge is to chop off a small enough part of the project to make it feel like a thesis. A thesis could be written about designing one component for the eventual machine, or analyzing one part of the vast sea of data it produces. Preliminary data from a longitudinal study could be analyzed, even when the final results are many years down the line.

People in these fields have to be flexible and creative when it comes to creating a thesis, but usually the thesis committee is reasonable. In the end, a thesis is what you need to graduate, whatever that actually is for you.

Four Gravitons and a…Postdoc?

As a few of you already know, it’s looking increasingly certain that I will be receiving my Ph.D. in the spring. I’ll graduate, ceasing to be a grad student and becoming that most mysterious of academic entities, a postdoc.

When describing graduate school before, I compared it to an apprenticeship. (I expanded on that analogy more here.) Let’s keep pursuing that analogy. If a graduate student is like an apprentice, then a Postdoctoral Scholar, or Postdoc, is like a journeyman.

In Medieval Europe, once an apprenticeship was completed, the apprentice was permitted to work independently, earning a wage for their own labors. However, they still would not have their own shop. Instead, they would work for a master craftsman. Such a person was called a journeyman, after the French word journée, meaning a day’s work.

Similarly, once a graduate student gets their Ph.D., they are able to do scientific research independently. However, most graduate students are not ready to be professors when fresh out of their Ph.D. Instead, they become postdocs, working in an established professor’s group. Like a journeyman, a postdoc is nominally independent, but in practice works under loose supervision from the more mature members of their field.

Another similarity between postdocs and journeymen is their tendency to travel. Historically, a journeyman would spend several years traveling, studying in the workshops of several masters. Similarly, a postdoc will often (especially in today’s interconnected world) travel far from where they began in order to broaden their capabilities.

A postdoctoral job generally lasts two or three years (just one, for particularly short positions). Most scientists will go through at least one postdoctoral position after achieving their Ph.D. In some fields (theoretical physics in particular), a scientist will have two or three such positions in different places before finding a job as a professor. Postdocs are paid significantly better than grad students, but generally significantly worse than professors. They don’t (typically) teach, but depending on the institution and field they may do some TA work.

Since I’m still a grad student, my blog is titled “4 gravitons and a grad student”. That could change, though. Once I become a postdoc, I have three options:

  1. Keep the old title. Keeping the same title and domain name makes it easier for people to find the blog. It also maintains the alliteration, which I think is fun. On the other hand, it would be hard to justify, and I’d likely have to write something silly about taking a grad student perspective or the like.
  2. Change to “4 gravitons and a postdoc”. I’d lose the fun alliteration, but the title would accurately represent my current state. However, I might lose a few readers who don’t expect the change.
  3. Cut it down to “4 gravitons”. This matches the blog’s Twitter handle (@4gravitons). It’s quick, it’s recognizable, and it keeps the memorable part of the old title without adding anything new to remember. However, it would be less unique in Google searches.

What do you folks think? I’ve still got a while to decide, and I’d love to hear your opinions!

The Amplitudes Revolution Will Not Be Televised (But It Will Be Streamed)

I’ve been at the Simons Center’s workshop on the Geometry and Physics of Scattering Amplitudes all week, so I don’t have time for a long post. There have been a lot of great talks from a lot of great amplitudes-folks (including one on Tuesday by Lance Dixon discussing this work, and one on the same day explaining the much-hyped amplituhedron). Curious folks can follow the conference link above to find videos and slides for each of the talks, arranged by the talk schedule.

I’ve made some great contacts, picked up a couple running jokes (check out Rutger Boels’s talk on Monday and Lance’s talk on Tuesday), heard the phrase “only seven loops” stated in relative seriousness, and heard the story of why the conference ended up choosing an artist’s conception of the amplituhedron for the workshop poster, which I can relate if folks are especially curious.

Elegance, Not So Mysterious

You’ll often hear theoretical physicists in the media referring to one theory or another as “elegant”. String theory in particular seems to get this moniker fairly frequently.

It may often seem like mathematical elegance is some sort of mysterious sixth sense theorists possess, as inexplicable to the average person as color to a blind person. What’s “elegant” about string theory, after all?

Before explaining elegance, I should take a bit of time to say what it’s not. Elegance isn’t Occam’s razor. It isn’t naturalness, either. Both of those concepts have their own technical definitions.

Elegance, by contrast, is a much hazier, and yet much simpler, notion. It’s hazy enough that any definition could provoke arguments, but I can at least give you an approximate idea by telling you that an elegant theory is simple to describe, if you know the right terms. Often, it is simpler than the phenomenon that it explains.

How does this apply to something like string theory? String theory seems to be incredibly complicated: ten dimensions, curled up in a truly vast number of different ways, giving rise to whole spectra of particles.

That said, the basic idea is quite simple. String theory asks the question: what if, in addition to fundamental point-particles (zero dimensional objects), there were fundamental objects of other dimensions? That idea leads to complicated consequences: if your theory is going to produce all the particles of the real world then you need the ten dimensions and the supersymmetry and yadda yadda. But the basic idea is simple to describe. An elegant theory can have very complicated consequences, but still be simple to describe.

This, broadly, is the sort of explanation theoretical physicists look for. Math is the kind of field where simple underlying systems can describe very complex phenomena. Since theoretical physics is about describing the world in terms of math, the right explanation is usually the most elegant.

This can occasionally trip physicists up when they migrate to other careers. In biology, for example, the elegant solution is often not the right one, because evolution doesn’t care about elegance: evolution just grabs whatever is within reach. Financial systems and economics occasionally have similar problems. All this is to say that while elegance is an important thing for a physicist to strive for, sometimes we have to be careful about it.

Where are the Amplitudeologists?

As I’ve mentioned a couple of times before, I’m part of a sub-field of theoretical physics called Amplitudeology.

Amplitudeology in its modern incarnation is relatively new, and concentrated in a few specific centers. I thought it might be interesting to visualize which universities have amplitudeologists, so I took a look at the attendee lists of two recent conferences and put their affiliations into Google Maps. In an attempt to balance things, one of the conferences was in North America and the other in Europe. (For the curious, a sketch of how one might automate this kind of map appears at the end of this post.) Here’s what the map showed:

The West Coast of the US has two major centers, Stanford/SLAC and UCLA, focused around Lance Dixon and Zvi Bern respectively. The Northeast has a fair assortment, including places that cover essentially everything, like the Perimeter Institute and the Institute for Advanced Study, and places known especially for their amplitudes work, like Brown.

Europe has quite a large number of places. There are many universities in Europe with a long history of technical research into quantum field theory. When amplitudes began to become more prominent as its own sub-field, many of these places slotted right in. In particular, there are many locations in Germany, a decent number in the UK, a few in the vicinity of CERN, and a variety of places of some importance elsewhere.

Outside of Europe and North America, there’s much less amplitudes research going on. Physics in general is a very international enterprise, and many sub-fields have a lot of participation from researchers in China, India, Japan, and Korea. Amplitudes, for the most part, hasn’t caught on in those places yet.

This map is just a result of looking at two conferences. More data would yield many places that were left out of this setup, including a longstanding community in Russia. Still, it gives you a rough idea of where to find amplitudeologists, should you have need of one.
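As promised above, here is a minimal sketch of how one might automate this kind of map. I assembled mine by hand in Google Maps; the geopy and folium packages, the short affiliation list, and the output file name below are illustrative assumptions, not what I actually used.

    import folium
    from geopy.geocoders import Nominatim

    # A few example affiliations; a real list would come from the
    # conferences' attendee pages.
    affiliations = ["SLAC National Accelerator Laboratory", "UCLA",
                    "Perimeter Institute", "Brown University", "CERN"]

    geolocator = Nominatim(user_agent="amplitudes-map-sketch")
    world_map = folium.Map(location=[30, 0], zoom_start=2)

    for name in affiliations:
        place = geolocator.geocode(name)   # look up latitude and longitude
        if place is not None:
            folium.Marker([place.latitude, place.longitude],
                          popup=name).add_to(world_map)

    world_map.save("amplitudeologists.html")   # open in a browser to view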

High Energy? What does that mean?

I am a high energy physicist who uses the high energy and low energy limits of a theory that, while valid up to high energies, is also a low-energy description of what at high energies ends up being string theory (string theorists, of course, being high energy physicists as well).

If all of that makes no sense to you, congratulations, you’ve stumbled upon one of the worst-kept secrets of theoretical physics: we really could use a thesaurus.

“High energy” means different things in different parts of physics. In general, “high” versus “low” energy classifies what sort of physics you look at. “High” energy physics corresponds to the very small, while “low” energies encompass larger structures.

Many people explain this via quantum mechanics: the uncertainty principle says that the more certain you are of a particle’s position, the less certain you can be of how fast it is going, which would imply that a particle that is highly restricted in location might have very high energy. You can also understand it without quantum mechanics, though: if two things are held close together, it generally has to be by a powerful force, so the bond between them will contain more energy.

Another perspective is in terms of light. Physicists will occasionally use “IR”, or infrared, to mean “low energy” and “UV”, or ultraviolet, to mean “high energy”. Infrared light has long wavelengths and low energy photons, while ultraviolet light has short wavelengths and high energy photons, so the analogy is apt. However, the analogy only goes so far, since “UV physics” is often at energies much greater than those of UV light (and the same sort of situation applies for IR).
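To put rough numbers on the light analogy, a photon’s energy is inversely proportional to its wavelength: E = hc/λ, or about 1240 eV·nm divided by the wavelength in nanometers. A quick sketch (the wavelengths are just representative values I picked):

    # Photon energy from wavelength: E = h*c / wavelength.
    H_C_EV_NM = 1239.84   # Planck's constant times the speed of light, in eV*nm

    for name, wavelength_nm in [("infrared", 1000), ("visible", 500),
                                ("ultraviolet", 100)]:
        print(name, round(H_C_EV_NM / wavelength_nm, 2), "eV")
    # infrared ~1.24 eV, visible ~2.48 eV, ultraviolet ~12.4 eV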

So what does “low energy” or “high energy” mean? Well…

The IR limit: Lowest of the “low energy” points, this refers to the limit of infinitely low energy. While you might compare it to “absolute zero”, really it just refers to energy so low that, compared to the other energies in your calculation, it might as well be zero. This is the “low energy limit” I mentioned in the opening sentence.

Low energy physics: Not “high energy physics”. Low energy physics covers everything from absolute zero up to atoms. Once you get up to high enough energy to break up the nucleus of an atom, you enter…

High energy physics: Also known as “particle physics”, high energy physics refers to the study of the subatomic realm, which also includes objects that aren’t technically particles, like strings and “branes”. If you exclude nuclear physics itself, high energy physics generally refers to energies of a mega-electron-volt and up. For comparison, the electrons in atoms are bound by energies of around an electron-volt, the characteristic energy of chemistry, so high energy physics is at least a million times more energetic. That said, high energy physicists are often interested in the low energy consequences of their theories, all the way down to the IR limit. Interestingly, by this point we’ve already passed both infrared light (from a thousandth of an electron-volt to a single electron-volt) and ultraviolet light (several electron-volts to a hundred or so). Compared to UV light, mega-electron-volt-scale physics is quite high energy.

The TeV scale: If you’re operating a collider, though, mega-electron-volts (or MeV) are low energy physics. Often, calculations for colliders will assume that quarks, whose masses are around the MeV scale, actually have no mass at all! Instead, high energy for particle colliders means giga- (billion) or tera- (trillion) electron-volt processes. The LHC, for example, operates at around 7 TeV now, with 14 TeV planned. This is the range of scales where many had hoped to see supersymmetry, but as time has gone on, results have pushed speculation up to higher and higher energies. Of course, these are all still low energy from the perspective of…

The string scale: Strings are flexible, but under enormous tension that keeps them very, very short. Typically, strings are posited to be of length close to the Planck length, the characteristic length at which quantum effects become relevant for gravity. This enormously small length corresponds to the enormously large Planck energy, which is on the order of 10^28 electron-volts (a quick back-of-the-envelope check appears after these definitions). That’s about ten to the fifteen times the energies of the particles at the LHC, or ten to the twenty-two times the MeV scale that I called “high energy” earlier. For comparison, a milliliter of water contains about ten to the twenty-two molecules. When extra dimensions in string theory are curled up, they’re usually curled up at this scale. This means that from a string theory perspective, going to the TeV scale means ignoring the high energy physics and focusing on low energy consequences, which is why even the heaviest supersymmetric particles are thought of as low energy physics when approached from string theory.

The UV limit: Much as the IR limit is that of infinitely low energy, the UV limit is the formal limit of infinitely high energy. Again, it’s not so much an actual destination, as a comparative point where the energy you’re considering is much higher than the energy of anything else in your calculation.
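As promised, here is the back-of-the-envelope check on the Planck energy, a minimal sketch using standard values of the physical constants:

    import math

    # Planck energy: E_P = sqrt(hbar * c**5 / G), in SI units.
    hbar = 1.054571817e-34   # J*s
    c    = 2.99792458e8      # m/s
    G    = 6.67430e-11       # m**3 / (kg * s**2)
    eV   = 1.602176634e-19   # joules per electron-volt

    E_planck_eV = math.sqrt(hbar * c**5 / G) / eV
    print(f"Planck energy: {E_planck_eV:.2e} eV")   # ~1.22e+28 eV
    print(f"vs LHC: {E_planck_eV / 14e12:.1e}")     # ~8.7e+14, nearly 10**15
                                                    # times the LHC's 14 TeV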

These are the definitions of “high energy” and “low energy”, “UV” and “IR” that one encounters most often in theoretical particle physics and string theory. Other parts of physics have their own idea of what constitutes high or low energy, and I encourage you to ask people who study those parts of physics if you’re curious.