The Deterministic AI Pipe-Dream

So there I was…

So there I was, applying to another startup on Y Combinator a few Mondays ago, and in my cover letter I made a joke about how difficult it is to get deterministic responses out of LLMs. Almost immediately after I sent the email, I came up with a better joke I should have used instead:

How many LLMs does it take to screw in a light bulb?

One to screw in the light bulb, and N+1 more to make sure the previous LLM really did screw in the light bulb and wasn’t just hallucinating.

Okay Peter, explain the joke

As someone who built a web-based utility that uses AI and parsing to consistently retrieve and organize loose data from PDF docs, I have been on the wrong end of trying to get AI to deliver consistent, deterministic behavior like a human ought to.

I can tell llama-3 to pull the first email address off my resume 1,000 times, and the variance I get is astounding. Most of the time I get the email, but sometimes I get some other random result. If you’ve tried to do AI parsing before, you know my pain.
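To make that kind of experiment concrete, here is a minimal sketch that asks the same question many times and tallies the answers. Nothing here calls a real model: `flaky_extract_email` is a hypothetical stand-in that simulates an LLM with a made-up 10% error rate, and the resume text and addresses are invented sample data.

```python
import random
import re
from collections import Counter

# Invented sample data, not a real resume.
RESUME = "Peter Example\npeter@example.com\nalt.peter@example.org\n"


def flaky_extract_email(text: str, rng: random.Random, error_rate: float = 0.1) -> str:
    """Hypothetical stand-in for an LLM call: usually right, sometimes not."""
    emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", text)
    if rng.random() < error_rate:
        # Simulated failure modes: wrong email, the name, or nothing at all.
        return rng.choice(["alt.peter@example.org", "Peter Example", ""])
    return emails[0]


def tally(runs: int, seed: int = 0) -> Counter:
    """Ask the same question `runs` times and count each distinct answer."""
    rng = random.Random(seed)
    return Counter(flaky_extract_email(RESUME, rng) for _ in range(runs))


if __name__ == "__main__":
    for answer, count in tally(1000).most_common():
        print(f"{count:4d}  {answer!r}")
```

With a real model you would swap the stand-in for an actual API call; the tallying stays the same, and the long tail of odd answers is the part that hurts.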

That experience led me to try to make AI more deterministic. If I ask what’s 2+2, I want to see 4, every time. But that endeavor was mostly fruitless. The only way I could make AI “slightly” more deterministic was by adding more double-checking calls to the LLM.

Hence the N+1: every extra check buys more (but never perfect) accuracy.
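The double-checking idea can be simulated in a few lines. This is only a sketch under loud assumptions: both the extractor and the checkers are hypothetical stand-ins with a made-up 10% error rate, the accept-on-majority rule is one possible design rather than the exact one described above, and the simulation cheats by knowing the true answer so it can score itself.

```python
import random

TRUTH = "peter@example.com"  # invented sample answer


def flaky_extract(rng: random.Random, error_rate: float = 0.1) -> str:
    """Hypothetical extractor call: returns the right email most of the time."""
    return TRUTH if rng.random() >= error_rate else "alt.peter@example.org"


def flaky_verify(answer: str, rng: random.Random, error_rate: float = 0.1) -> bool:
    """Hypothetical checker call: gives the right verdict most of the time."""
    correct_verdict = answer == TRUTH
    return correct_verdict if rng.random() >= error_rate else not correct_verdict


def extract_with_checks(rng: random.Random, n_checks: int, max_attempts: int = 5) -> str:
    """Extract once, then ask n_checks checkers; retry if the majority objects."""
    answer = flaky_extract(rng)
    for _ in range(max_attempts - 1):
        yes_votes = sum(flaky_verify(answer, rng) for _ in range(n_checks))
        if n_checks == 0 or 2 * yes_votes > n_checks:
            return answer  # accepted (or there were no checkers to ask)
        answer = flaky_extract(rng)  # rejected: try again
    return answer  # out of attempts, return the last unverified answer


def accuracy(n_checks: int, runs: int = 10_000, seed: int = 0) -> float:
    """Fraction of runs that end with the true answer."""
    rng = random.Random(seed)
    hits = sum(extract_with_checks(rng, n_checks) == TRUTH for _ in range(runs))
    return hits / runs


if __name__ == "__main__":
    for n in (0, 1, 3, 5):
        print(f"{n} checks -> accuracy {accuracy(n):.4f}")
```

Because the checkers are just as flaky as the extractor, adding more of them pushes accuracy up but never all the way to 100%, and every extra check is another paid call.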

Enter Self-Doubt

During this little experiment, a nagging thought entered my brain and withstood all attempts to evict it: Am I asking for something impossible?

Humans aren’t deterministic. Humans built AI. Why am I trying to hold AI to a standard impossible for any human to achieve? There I was, asking AI to be better than the average person, but it’s all built on crap data from millions of average people (i.e. Reddit and X).

Maybe Deterministic AI was just a pipe-dream after all.

Oof, Prophecy Time!

Maybe this is one of those natural-law type barriers. Oil and water don’t mix; energy is conserved, never created; absolute power corrupts absolutely; etc. Maybe LLMs will never be able to produce deterministic results.

I can’t help thinking that I’m the problem.

Maybe my need for square pegs and square holes has betrayed a fundamental tenet of the universe: determinism is relative. In the end, I decided to let ChatGPT write its own AI light bulb joke. It came up with a pretty good one:

How many LLMs does it take to screw in a light bulb?

Just one—but first, it will generate 500 words on the history of light bulbs, debate whether it should be “screw in” or “install,” and ask if you’d like it in the style of Shakespeare, Hemingway, or a dystopian cyberpunk noir.

Maybe determinism is the opposite of creativity.