The Original Sin of Intelligence: Why Neither Humans Nor AI Can Ever Be Truly Deterministic
From floating-point errors in GPUs to quantum fluctuations in the brain — intelligence requires a noisy substrate.
Umay ŞAMLI
December 17, 2025
Introduction
If you ask a calculator what 1 + 1 is, it will tell you 2. If you ask it a million times, it will tell you 2 a million times. We expect computers to be logic engines — perfect, repeatable, and deterministic.
When we build Large Language Models (LLMs), we assume they follow the same rules. If we set the "Temperature" to 0 (telling the model to always pick the most likely next word), we expect the exact same essay every time we run the prompt.
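In decoding terms, temperature 0 is usually implemented as greedy decoding: take the argmax of the model's output logits. Here is a minimal Python sketch of the difference (the function and its signature are mine for illustration, not any particular library's API):

```python
import numpy as np

def sample_next_token(logits: np.ndarray, temperature: float) -> int:
    """Toy decoder: temperature 0 means greedy argmax; anything
    higher samples from the softmax distribution over tokens."""
    if temperature == 0:
        return int(np.argmax(logits))  # always the single most likely token
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())  # numerically stable softmax
    probs /= probs.sum()
    return int(np.random.choice(len(logits), p=probs))
```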
But if you have worked with LLMs in production, you know the dirty secret: They don't.
Even with the randomness turned off, LLMs often give different answers to the same question depending on when you ask them. Why? The answer lies deep in the "physics" of computer chips, and it offers a fascinating mirror to the "physics" of the human mind.
The Butterfly Effect in the GPU
To understand why AI changes its mind, we have to look at the atomic unit of AI labor: The Kernel.
A "Kernel" is a small, highly optimized program running on a GPU (Graphics Processing Unit) that performs a specific mathematical task, like multiplying matrices. (I am sure that readers are mostly with CS background. I know Kernel is not just at GPU's.)
In a recent deep-dive, Horace He and the Thinking Machines Lab uncovered the root cause of AI non-determinism [1]. It isn't a ghost in the machine; it's a conflict between math and speed.
The "Chef" Analogy
Imagine a chef (the Kernel) making a soup.
Scenario A: He is cooking for just one customer. He chops carrots, then onions, then celery, and throws them in the pot.
Scenario B: He is cooking for 100 customers at once (a "Batch"). To be faster, he changes his strategy. He chops 100 carrots all at once, then 100 onions.
The ingredients are the same. The amounts are the same. But the order in which they were added to the pot changed.
In the world of pure mathematics, (A + B) + C is exactly the same as A + (B + C). But in the physical world of GPU silicon, this isn't true. Computers use Floating Point Arithmetic, which has limited precision. Tiny rounding errors occur depending on the order in which you add numbers.
(0.1 + 0.2) + 0.3 ≠ 0.1 + (0.2 + 0.3)

This is the "Original Sin" of digital intelligence. When an LLM server is busy (high batch size), the Kernels switch strategies to be more efficient. This changes the order of addition, which introduces a microscopic rounding error.
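You can reproduce both effects in plain Python (IEEE-754 doubles); the second half of this sketch mimics the kernel's strategy switch by summing the same numbers once sequentially and once as two "batched" partial sums:

```python
import random

# Associativity fails in floating point:
print((0.1 + 0.2) + 0.3)  # 0.6000000000000001
print(0.1 + (0.2 + 0.3))  # 0.6

# The same effect at kernel scale: one running total vs. two partial
# sums over identical data (a stand-in for a batched reduction).
random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(10_000)]

sequential = sum(xs)
batched = sum(xs[:5_000]) + sum(xs[5_000:])

print(sequential == batched)      # typically False
print(abs(sequential - batched))  # tiny, but nonzero
```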
That error ripples through the billions of parameters in the model. Eventually, it flips a 49.9% probability token to a 50.1% probability token. The model chooses a different word. Once that word changes, the context for the next word changes, and suddenly, the whole essay is different.
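A toy illustration of that flip (the numbers are invented for clarity; real logit vectors live in much higher dimensions):

```python
import numpy as np

# Two near-tied tokens. An error far below 1e-5 in the logits is
# enough to change which token greedy (temperature 0) decoding picks.
logits = np.array([2.000000, 1.999999])      # token 0 wins
perturbed = logits + np.array([0.0, 2e-6])   # rounding noise: token 1 wins

print(np.argmax(logits))     # 0
print(np.argmax(perturbed))  # 1
```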
The Verdict: LLM non-determinism is an engineering artifact. It can be fixed (by forcing the Kernel to perform its additions in exactly the same order regardless of batch size), but we usually don't fix it, because we prefer speed over perfect repetition.
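Conceptually, the fix looks like this: pin the reduction order so it never depends on batch size. This is only a sketch of the idea, not the actual batch-invariant kernels from the write-up:

```python
def batch_invariant_sum(xs: list[float], chunk: int = 256) -> float:
    """Sum in fixed-size chunks, in a fixed order, no matter how busy
    the server is. Rounding error still exists, but it is the *same*
    error on every run, so the output becomes reproducible."""
    partials = [sum(xs[i:i + chunk]) for i in range(0, len(xs), chunk)]
    return sum(partials)
```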
The Human Kernel: Solvable vs Fundamental Noise
If the most precise logic machines we have ever built (GPUs) suffer from "noise" due to their physical substrate, what does that imply for the human brain?
We are, after all, neural networks running on hardware. But our hardware isn't made of silicon; it's made of wet biology.
If we define the "Biological Kernel" as the moment a synapse fires or an ion channel opens, we see a striking parallel.
- The AI Kernel suffers from Floating Point Noise. It is "deterministic chaos" — if we could control every electron and every scheduler perfectly, we could predict the outcome.
- The Human Kernel suffers from Quantum and Thermal Noise.
Biological neurons operate at a scale where Brownian motion (thermal noise) and quantum indeterminacy matter. Whether a neurotransmitter vesicle releases or not isn't just a matter of complex inputs; it is subject to the fundamental randomness of the universe.
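The contrast can be made concrete with a toy model (purely illustrative; the 0.3 release probability is an assumption, not a measured value):

```python
import random

def silicon_kernel(x: float) -> float:
    """Deterministic chaos: identical input, identical output. Any
    'noise' comes from controllable things like summation order."""
    return x * 2.0

def biological_kernel(x: float, release_prob: float = 0.3) -> float:
    """Toy synapse: identical input, but vesicle release is a coin
    flip. There is no seed to fix; the randomness is fundamental."""
    return x if random.random() < release_prob else 0.0

print([silicon_kernel(1.0) for _ in range(5)])     # [2.0, 2.0, 2.0, 2.0, 2.0]
print([biological_kernel(1.0) for _ in range(5)])  # varies run to run
```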
Conclusion: Embracing the Noise
We spend a lot of energy trying to "defeat" non-determinism in LLMs. We want them to be reliable tools. As the Thinking Machines research shows, we can achieve this by enforcing "Batch Invariance" — making the machine behave identically no matter how busy the server is.
Human non-determinism exists by default. Because of this atomic-level randomness, humans cannot be modeled as DFAs (Deterministic Finite Automata); at best, they are NFAs (Nondeterministic Finite Automata), machines whose next state is a set of possibilities rather than a single certainty.
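In code, the difference is simply the return type of the transition function; a minimal sketch (the state names are illustrative):

```python
from typing import Dict, FrozenSet, Tuple

State, Symbol = str, str

# DFA: delta maps (state, symbol) to exactly one next state.
dfa_delta: Dict[Tuple[State, Symbol], State] = {
    ("awake", "stimulus"): "responding",
}

# NFA: delta maps (state, symbol) to a *set* of possible next states.
nfa_delta: Dict[Tuple[State, Symbol], FrozenSet[State]] = {
    ("awake", "stimulus"): frozenset({"responding", "ignoring"}),
}
```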
This argument has the potential to become a strong defense for libertarian accounts of free will. It also supports the theological thesis I develop in my independent theology paper [2].
References
[1] Horace He and Thinking Machines Lab, "Defeating Nondeterminism in LLM Inference," Thinking Machines Lab: Connectionism, September 2025.
[2] Umay ŞAMLI, "A Proposal Theory for Independent Theology," December 2025.