One of the most significant downsides to using generative AI is that LLMs sometimes hallucinate: they produce incorrect answers that sound correct and confident. You’ve probably run into them many times. These answers are often backed by fabricated information, or by accurate information that is misapplied.
Earlier this year, Anthropic released some research that touched on hallucinations, but it only seemed to scratch the surface. Now, enter recent research from OpenAI. Their study finds that a primary cause of hallucinations lies in how LLMs are trained.
In simplified terms, LLMs undergo a training process in which a model is rewarded with points when it responds with a correct answer. If it does not provide the correct answer, it typically receives no points or a slight deduction. But what about when a model guesses and makes up an answer? If it guesses, it might happen to be correct and earn points. If it guesses wrong, there is no additional penalty beyond that of any other incorrect answer. Since a guess costs nothing extra, the training is essentially “teaching” the model to guess rather than admit uncertainty.
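The incentive described above can be made concrete with a little expected-value arithmetic. This is a toy sketch with hypothetical numbers, not OpenAI's actual training objective: it just compares the expected score of guessing against the zero score of abstaining, under a configurable penalty for wrong answers.

```python
def expected_guess_score(p_correct, reward=1.0, wrong_penalty=0.0):
    """Expected score for guessing with accuracy p_correct under a
    simple grader: `reward` points for a correct answer, `wrong_penalty`
    for an incorrect one. Abstaining always scores 0."""
    return p_correct * reward + (1 - p_correct) * wrong_penalty

# With no penalty for wrong answers, even a 10%-accurate guess beats
# abstaining (expected score 0.1 > 0), so guessing is the optimal policy.
print(expected_guess_score(0.10))

# Penalize wrong answers and the low-confidence guess becomes a losing
# bet (expected score is negative), so abstaining is now optimal.
print(expected_guess_score(0.10, wrong_penalty=-1.0))
```

In other words, as long as a wrong guess scores no worse than saying “I don't know,” the math always favors guessing, which is exactly the behavior the research describes.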
If this is a problem that can be mitigated by revising the training process, that is fantastic news. The open question is just how much hallucinations can actually be reduced. I highly doubt the problem can be entirely eliminated, but any reduction would be a significant improvement.
You can check out the research paper here.

