ArXiv Will Ban You for Hallucinated References

Thomas Dietterich, Chair of the Computer Science section of the preprint server arXiv.org, recently clarified the site’s policies towards “hallucinated” citations and other signs of careless use of AI in a post on X. If your paper cites a paper that doesn’t actually exist, or shows other signs that you didn’t read it before posting, like leftover commentary (the example he gave was “here is a 200 word summary; would you like me to make any changes?”), then you can get banned from the arXiv for one year. Even after that year you’d be on a kind of “probation”, and would need to show that your next few papers had been accepted by peer-reviewed journals before posting them.

At the risk of saying the obvious, this is a good idea! arXiv isn’t peer review: it isn’t meant to judge the value of the papers it hosts. But it still needs to be a useful place for scientists to post their papers, which is why they try to keep spam and irrelevant content to a minimum. If you don’t actually endorse the content of a paper, you shouldn’t post it in the first place.

That said, the whole existence of hallucinated citations on arXiv feels a little silly. It makes sense for academic journals and preprint servers in other fields. But arXiv was the first site of its kind for a reason. Its users (physicists, mathematicians, and computer scientists) don’t need much hand-holding when it comes to computers. Papers submitted to arXiv aren’t typically written in Word. They’re written in a document-writing language called LaTeX, which lets users make decently-formatted papers without help from a journal. Physicist-written code may be terrible by any reasonable criteria…but it exists, much more universally than, for example, biologist-written code.

This extends to citations. In my old field, there is a database called INSPIRE that updates automatically from arXiv. Click on a paper, and a handy “cite” link gives you standardized citations in several formats, ready to copy and paste into your LaTeX code. Nearly every citation in my papers is copied from there. The ones that aren’t are either from other fields where I didn’t know of that style of database, or things that haven’t been published (this can be manuscripts in preparation, or personal communications).

All of this, though, feels like a lot less than what the field could be doing. In a world where almost everyone posts their papers to the same website, and almost everyone has at least a rudimentary understanding of programming…why are people still writing citations in free-form text in the first place? Why aren’t citations built into the submitted papers on arXiv, automatically linked to the papers they cite? Why don’t we have a setup where, except for a small number of “special” citations, every citation is built so that it automatically goes to a real paper, and gives a clear error message if it doesn’t? In short, why are hallucinated citations even possible?
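
To make the idea concrete, here is a rough sketch of the kind of check such a setup might run at submission time. It is an illustration under assumptions, not anything arXiv or INSPIRE actually provides: the names `check_citations` and `CitationError` and the `exists` lookup are hypothetical, and the identifiers in the usage example are placeholders rather than real papers. The two regular expressions only test the shape of new-style (`YYMM.NNNNN`) and old-style (`archive/YYMMNNN`) arXiv identifiers. Whether a paper really exists is left to whatever registry lookup is passed in.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Shape of arXiv identifiers: new style "YYMM.NNNN" or "YYMM.NNNNN",
# old style "archive/YYMMNNN", each with an optional version suffix.
NEW_STYLE = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")
OLD_STYLE = re.compile(r"^[a-z-]+(\.[A-Z]{2})?/\d{7}(v\d+)?$")


@dataclass
class CitationError:
    key: str     # the citation exactly as it appeared in the paper
    reason: str  # human-readable explanation of what went wrong


def check_citations(
    cited_ids: Iterable[str],
    exists: Callable[[str], bool],  # hypothetical registry lookup
) -> List[CitationError]:
    """Return one error per citation that is malformed or does not resolve."""
    errors = []
    for cid in cited_ids:
        if not (NEW_STYLE.match(cid) or OLD_STYLE.match(cid)):
            errors.append(CitationError(cid, "not a well-formed arXiv identifier"))
        elif not exists(cid):
            errors.append(CitationError(cid, "no such paper in the registry"))
    return errors


if __name__ == "__main__":
    # Placeholder identifiers standing in for a real database of papers.
    registry = {"1234.56789", "hep-th/0000001"}
    cited = ["1234.56789", "9999.00000", "Smith et al. 2024"]
    for err in check_citations(cited, registry.__contains__):
        print(f"{err.key}: {err.reason}")
```

In this sketch a free-form citation like “Smith et al. 2024” fails immediately, and a well-formed identifier that points nowhere fails at the lookup step. The “special” citations (manuscripts in preparation, personal communications) would need an explicit opt-out, which is left out here.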

Look, I’m naive, I get that. I believe in automation, not in the modern context of LLMs and other heuristics, but in setting clear procedures and building clear rules. The world doesn’t work that way! The clear rules are always more contentious than you expect, the fuzzy human-led version always the only choice people can agree on.

But still. Citations. There has to be a better system, right?

7 thoughts on “ArXiv Will Ban You for Hallucinated References”

Sylvain Ribault May 22, 2026 at 6:21 pm

The system you are asking for looks very much like Wikipedia’s Cite Q:

https://diff.wikimedia.org/2021/01/14/automatically-maintained-citations-with-wikidata-and-cite-q/

This system exists but its use in Wikipedia is far from universal or even dominant. When you need to add a citation in a page, it is easier to write it as text in that page, than to create a Wikidata entry and then use Cite Q — except if the entry already exists.


Minas Karamanis May 22, 2026 at 6:38 pm

The policy itself is fine. If you put your name on a paper, you’re responsible for what’s in it, including the references. That’s not a new principle and it shouldn’t be controversial.

What I’m less confident about is uniform enforcement. Hallucinated citations are a clear-cut case, sure. But “other signs of careless AI use” is vague enough that it’ll get applied selectively. And the selection won’t be random.

It’s worth remembering that AI detection tools are, to put it generously, not great. False positive rates are high enough that non-native English speakers, people who write in a formal register, and anyone whose prose happens to be “too clean” get flagged routinely. These tools are already causing real damage in universities, where students have been wrongly accused of cheating on the basis of a confidence score that means less than it claims to. Importing that same logic into arXiv moderation, even informally, even just as one input among many, is a recipe for the same problems at a larger scale.

The people most exposed to those false positives are the same people least equipped to contest them. Junior researchers, people without institutional backing, non-native speakers who already face extra scrutiny on their writing. A well-known professor at a major department who submits something sloppy gets an email and a second chance. A postdoc from a less-connected institution gets a year-long ban and a permanent mark on their record. Not because anyone intended the disparity, but because that’s how discretionary enforcement always works.

I agree there should be a better system for citations (especially given the current LLM, sorry, AI chatbot circus). But I’d feel better about the punishment regime if I thought the people designing it had spent much time thinking about who actually gets punished.


4gravitons Post author May 22, 2026 at 6:49 pm

The post seemed fairly clear that they’re not going to be policing style or other kinds of things an AI detector can identify, just really obviously dumb things like meta-comments left in the text. While I agree with you that this won’t be implemented uniformly (much like earlier things they’ve banned for), it doesn’t sound like any of this is the kind of thing you can trigger accidentally if you aren’t directly copy-pasting from an LLM.

1. Minas Karamanis May 22, 2026 at 7:02 pm
  
  A significant fraction of the people I know already use Claude Code directly to write their papers, not even copy-pasting from an LLM. This is going to become more, not less, common.
  
masharpe May 22, 2026 at 7:46 pm

“Not because anyone intended the disparity, but because that’s how discretionary enforcement always works.”

This phrase struck me as characteristic, so I copy-pasted the whole comment into Pangram, and sure enough it estimates “100% of this text is AI Generated”.

I don’t mean this as an accusation; I just find it amusing. If incorrect, it illustrates your point that human-written text can be falsely flagged as AI-written by detectors. If correct, that’s funny too.

1. Minas Karamanis May 22, 2026 at 8:43 pm
  
  No, but now I kind of wish I had. Missed opportunity for the perfect demonstration.
  
  Fun fact: I ran both our comments through ZeroGPT, and yours actually scores a higher probability of being AI-generated than mine. So either these tools are useless, or we’re both bots. Take your pick.
  
  1. Boxo McFoxo May 26, 2026 at 2:36 am
    
    Well, of course it did. ZeroGPT isn’t trained on Claude. Claudiform text is quite different from GPTese. Since Claude is your liebox of choice, it’s naturally not going to score highly on ZeroGPT.
    
    But yes, in all seriousness, ‘AI’ detectors are unreliable, and they do flag ESL and neurodivergent writers for being ‘high perplexity’. What I find interesting is that some instructors will completely trust the ‘AI’ detector confidence score, while they did not blindly trust plagiarism detectors.
    
    I think it is because they have a mental model of what plagiarism is, so they understand how a plagiarism detector can generate a false positive, but this ‘AI’ output that can trick people into seeing a mind that is not there is mysterious to them.
    
    It is sometimes, though not always, possible to tell with a critical application of theory of mind, though. There is this false dichotomy between the unreliable ‘AI’ detectors and a human reading that can only go vibes-level deep. It is a kind of cognitive surrender that plays into the hands of advocates for the liebox.
    
    Sometimes you can just tell from the text alone that the reason the text is off is that it came from a synthetic text extrusion machine with no conceptual world model. With theory of mind. With critical thinking.
    
    Part of the problem, of course, is that there is a very powerful industry currently obscuring the plain reality that LLMs do not have a conceptual world model, so it does not occur to people that this can be used to interrogate their output. They think it is like a quirky digital mind and that the quirks are its tells. The tells actually come from it not being a mind at all.
    

4 gravitons

Stories about physics from someone who's been there
