We research the two problems we believe matter most in AI today: making probabilistic systems reliable, and redesigning how humans and machines collaborate.
Large language models can generate useful work across many tasks. In production systems, the issue is not capability—it is operational reliability: consistent behavior, bounded failure modes, and traceability.
01 — Reliability

They hallucinate. They generate fiction with the same confidence they state facts. The failure mode isn't absence of output. It's the presence of plausible fiction.
They're unreliable. Same prompt. Same model. Same conditions. Different output. No error code. No warning.
These are not edge cases. They are expected properties of probabilistic generation. If you want deterministic outcomes, you need a system architecture that constrains, verifies, and corrects model outputs—rather than trusting a single completion.
02 — Collaboration

Meanwhile, the way we work with them — a chat box and a blinking cursor — was never designed. It was inherited. We're using the most powerful cognitive technology ever built through an interface that asks the human to do all the work.
Two unsolved problems. We work on both.
An LLM-based system books your flight and hands you a boarding pass. You show up at the airport. The flight doesn't exist. The LLM didn't fail to act — it hallucinated a completed task, with full confidence.
Ask the same system to book a flight to New York, ten times. Nine times it works. Once, you get San Francisco. No error. No warning. No way to predict when it will happen next.
This is what probabilistic means in practice. Not a theoretical concern — a structural one. The model isn't broken. It's working exactly as designed. It's just not designed to be reliable.
We've seen this problem before.
The early internet was a lossy medium. Packets got lost, misrouted, arrived corrupted. The question was the same: how do you build reliable systems on top of an unreliable foundation?
The answer was never to make the network perfect. It was to manage imperfection — through detection, correction, retransmission, and redundancy. The component stayed lossy. The system became reliable.
LLMs are today's lossy medium.
We take the same systems approach. Instead of waiting for a model that doesn't hallucinate, we engineer the layers around it — pre-processing, post-processing, verification, correction loops — so the system is deterministic, even when the component isn't.
Don't fix the medium. Engineer around it.
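That correction loop can be sketched in miniature. Everything here is illustrative, not our production code: `generate` stands in for a probabilistic model call that is usually right and occasionally wrong, and `verify` stands in for a deterministic check against ground truth (a schema, a database lookup, a constraint).

```python
import random

def generate(prompt: str, seed: int) -> str:
    # Stand-in for a lossy model call: right ~90% of the time.
    rng = random.Random(seed)
    return "SFO" if rng.random() < 0.1 else "JFK"

def verify(output: str, expected: str) -> bool:
    # Deterministic check the system trusts instead of the completion.
    return output == expected

def reliable_call(prompt: str, expected: str, max_retries: int = 5) -> str:
    # The correction loop: retry the lossy component until the check passes.
    for attempt in range(max_retries):
        out = generate(prompt, seed=attempt)
        if verify(out, expected):
            return out
    # Bounded failure mode: the system reports an error instead of
    # silently handing back plausible fiction.
    raise RuntimeError("verification failed after retries")

print(reliable_call("book a flight to New York", expected="JFK"))
```

The component stays probabilistic; the loop around it makes the system's behavior deterministic, and its failure mode explicit.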
Watch someone use an LLM. They struggle to write the right prompt. They paste in context the system should already have. They read a wall of generated text to check if it's correct. Then they edit it manually.
The human is doing all the wrong work.
Humans are extraordinary at judgment: making decisions, choosing between options, saying yes or no. They are poor at articulating context, writing precise instructions, and transferring knowledge.

LLMs are exceptional at generating, documenting, transferring context, and following instructions. They are poor at judgment.
The current interface asks humans to do what they are worst at, and LLMs to do what they are worst at.
It's a collaboration designed backwards.
We design interaction models around a simple principle: humans should only make decisions. They choose, approve, reject, and steer. The system handles the rest: gathering context, generating options, and executing the work.
The right interface isn't a better chat box. It's a fundamentally different division of labor.
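As a minimal sketch of that division of labor, with hypothetical `propose` and `run` functions: the system generates the options, the human contributes only a decision, and execution stays on the system's side.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Option:
    label: str
    payload: dict

def propose(task: str) -> List[Option]:
    # Stand-in for the system's work: gather context, generate options.
    return [
        Option("Nonstop, 9am", {"flight": "AA100"}),
        Option("One stop, cheaper", {"flight": "DL200"}),
    ]

def run(task: str, decide: Callable[[List[Option]], int]) -> dict:
    # The human's entire job is the decide() call: pick an index.
    options = propose(task)
    choice = decide(options)
    return options[choice].payload  # execution would follow from here

result = run("book NYC flight", decide=lambda opts: 0)
print(result)  # → {'flight': 'AA100'}
```

The interface surface is a single choice, not a prompt: the human never writes instructions, pastes context, or proofreads a wall of text.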
In 1918, the mathematician G.H. Hardy visited Ramanujan in the hospital. He mentioned his taxi was numbered 1729 — a rather dull number.
Ramanujan disagreed instantly. It was the smallest number expressible as the sum of two cubes in two different ways: 1729 = 1³ + 12³ = 9³ + 10³.
That's the instinct we're building around. LLMs look chaotic — probabilistic, unreliable, unpredictable. We believe there is structure in that stochasticity. Not by hoping it emerges, but by engineering systems that find it and enforce it.
1729 isn't just a name. It's a thesis.
If you think the chat box is the final interface, we're probably not the right place. If you think LLMs need to be perfect before they're useful, we're probably not the right place.
But if you believe reliability is an engineering problem and collaboration is a design problem — and both are solvable — we'd like to hear from you.
founders@1729labs.ai →