What AI Physicists Are Missing and What They Aren’t

I’ve seen a couple more thoughtful takes on the use of LLMs for physics lately. This blog post by Minas Karamis is particularly nice.

He points out something that I’ve said a version of: an AI that must be supervised like a student isn’t very useful, because the main point of a student project isn’t the paper at the end, it’s training the student. If students don’t struggle through all the mistakes of a project, they won’t get the expertise to one day do greater things.

Someone might object that not all suffering is educational. In the 1700s, Leonhard Euler calculated digit after digit of transcendental numbers by hand. Nobody asks students to do that anymore, and they still seem to turn out alright. Why would using an LLM for science be worse than using a computer for numerical calculations?

In a word: different skills. Programming numerics teaches you some of the same skills as calculating the numbers by hand: being specific about what you mean, and being aware of the details and their consequences. Prompting an AI still requires those skills, to check whether the AI’s output is correct. But it’s much worse at teaching them: unlike programming or calculating, when prompting AI, the consequences of your actions aren’t predictable.

For some, though, there is another objection. Sure, using AI reliably might require those skills now. But when it gets better, surely being careful will stop mattering. Surely the AI will end up doing science on its own, and all that training will be as useful as if we trained the students to play football.

I’m skeptical, but not as strongly as some. I think we’re still living in a time when it makes sense to hire scientists, and train people to think, and invest in your retirement.

I don’t think I have any knock-down arguments for that, though. Just some suggestive ones.

One I’ve talked about before is that a lot of the most important parts of thinking aren’t written down. An AI physicist is going to have a hard time replicating the kinds of methods and approaches that people use behind the scenes, but rarely describe or spell out. It will be easier to suss this out over time, as more data accumulates of people working with LLMs and correcting them. But ultimately there isn’t going to be a lot of documentation of this kind of thing.

Another limitation is memory. A mature scientist can draw from experiences across their entire career. For an LLM, any problem it’s solved in the past is by default lost in each new session. People build structures around this, taking notes and reminding the AI when it “wakes up”, or making documents the AI can be prompted to check. But nothing in this vein so far seems to get nearly as wide-scope or powerful as human memory. A scientist’s career is still the best way we have to build durable, functional expertise.

Finally, there is a question of costs, and efficiency. Here I’m not an expert, and I get the impression the actual experts disagree. I don’t know whether we should expect scaling to hit a wall, but I wouldn’t be that surprised if it did.

There are other common reasons for skepticism that seem more dubious to me. I don’t think AIs are inherently worse at creativity just because they’re trained on existing work, though some of the skills we associate with creativity aren’t very well-documented, and thus are hard to train for. I don’t think AI’s randomness or unreliability is a deal-breaker, because human intuition is also random and unreliable: we solve that with tools, and that’s something AI can in principle do as well. I don’t think humans are “more agentic” or something, except in the sense that most AIs are made by companies who need to make them behave in a customer-friendly way. But an agent is just a game-theoretic construct, a way to figure out who can win or lose in situations with defined stakes, and anything you can train or engineer to try to win can be modeled by that construct.

Coming from a place of uncertainty, my main appeal to you is to not get hung up on the bad reasons, whether they’re your own or come from the people you’re arguing with. Focus on the best arguments, and see where they take you.

4 thoughts on “What AI Physicists Are Missing and What They Aren’t”

  1. boldly91f5a7d879

    In much of science, the answer isn’t as important as the path you took to get there, because once you know the path, you can find many answers. Particularly when it comes to visualizing models and methods, overcoming obstacles to understanding requires confronting them oneself; vicarious experience, even with the best of teachers, is never a full substitute.

  2. JollyJoker

    My pet peeve in LLM coding is people coming up with all sorts of contrived “memory” gadgets that let the agent search a database of previous chats, when what they’re missing is completely standard best practices with automated tests, clear structure, reusable code, consistent naming and documentation that lets both human and agent jump right into what they need done.

    I wouldn’t be surprised if agentic sciencing ends up building similar structures with standardized libraries of text snippets that explain how to do this or that, hierarchical so you never need to go into more detail than you need for a given task.

    The main reason programming concepts should, imo, be applicable is that they’re already used to automate complicated tasks in ways that are as easy as possible to understand, maintain and build on.

  3. JollyJoker

    “But the real threat isn’t either of those things. It’s quieter, and more boring, and therefore more dangerous. The real threat is a slow, comfortable drift toward not understanding what you’re doing. Not a dramatic collapse. Not Skynet. Just a generation of researchers who can produce results but can’t produce understanding. Who know what buttons to press but not why those buttons exist. Who can get a paper through peer review but can’t sit in a room with a colleague and explain, from the ground up, why the third term in their expansion has the sign that it does.”

    Parts of Minas Karamis’ post read like AI slop. Not x. Not y. Just a z of blahblah.

    1. 4gravitons (post author)

      Look, as people keep commenting, there’s a reason why AI has these tendencies. They’ve been common in certain internet writing genres for a long time.
