Batch Invariance

Why Your AI Chatbot Keeps Changing Its Answers

You may have noticed that when you give an AI chatbot a prompt you have used before, it sometimes returns a different answer. This affects nearly every LLM, because at their core these models are non-deterministic. The answer you receive is sampled from what the model calculated to be the most likely continuation, based on a long chain of probabilities over the words in your prompt.

Okay, so if I use the same words in the prompt, the probabilities should be the same, and the model should return the same result, right? Unfortunately, it’s not that simple. New research pinpoints the likely cause of the problem and may offer a solution.

The differing outputs are caused by a lack of “batch invariance.” In basic terms, an AI server processes prompts for many users at once, and to be efficient it groups multiple people’s prompt calculations together in batches. The trouble is that the batch size changes your answer slightly because of how the computer performs decimal math. Floating-point addition is sensitive to order: adding the same decimal numbers in a different order can produce slightly different results due to rounding, and the batch size determines that order. Since the batch size depends on the unpredictable load on the server, the results returned to you can be unpredictable too.
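The rounding effect is easy to see in a few lines of Python. The specific numbers below are just an illustration of the ordering problem, not anything taken from the paper:

```python
# Floating-point addition is not associative: adding the same
# numbers in a different order can round to a different total.
vals = [0.1, 0.2, 0.3, 1e16, -1e16]

forward = sum(vals)         # the small values are swallowed by 1e16, then cancelled
backward = sum(vals[::-1])  # the huge values cancel first, so the small ones survive

print(forward)              # 0.0
print(backward)             # roughly 0.6
print(forward == backward)  # False
```

The same numbers, summed in two different orders, give two different answers. A batched server makes exactly this kind of ordering choice on your behalf, which is why the batch size leaks into your output.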

The good news is that the researchers fixed the problem by ensuring the AI performs its calculations in the same order, regardless of the batch size. With luck, we may see the major LLM providers adopting this approach within their next few releases.
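As a rough sketch of the idea (the toy function below is my own illustration, not the researchers’ actual GPU kernels), the fix amounts to pinning the order of the additions so it no longer varies with how the work is split up:

```python
# Toy sketch: summing in chunks, where the chunk boundaries stand in
# for the split points a batched server chooses based on its load.

def chunked_sum(values, chunk):
    # Sum each chunk, then sum the partial results, left to right.
    partials = [sum(values[i:i + chunk]) for i in range(0, len(values), chunk)]
    return sum(partials)

data = [0.1, 0.3, 1e16, -1e16]

# Load-dependent split points -> different rounding, different answers.
print(chunked_sum(data, chunk=2))  # the huge values cancel inside one chunk
print(chunked_sum(data, chunk=4))  # the small values are absorbed, then cancelled

# The batch-invariant idea: fix the chunk size once, so every run
# reduces in exactly the same order and agrees bit for bit.
assert chunked_sum(data, chunk=2) == chunked_sum(data, chunk=2)
```

The two prints disagree even though the input data is identical, because the rounding pattern follows the split points. Holding the reduction order fixed removes the batch size from the equation entirely.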

If you want to read the research paper, you can find it here.