The LLM is a compiler you can’t read

Every prompt engineer in the industry is fighting the same opponent. Clever system messages. Few-shot examples. The phrase “step by step” sprinkled into the user turn. Hope. Iteration. Hope again. None of it sticks because every word in your prompt is fighting the architecture of the model.

The LLM is a left-to-right token predictor. Every token it generates is conditioned on every token before it. There is no hidden planner. The model builds a sequence, one token at a time, and when that sequence is JSON written against a schema, the order of commitments in the sequence is the order of fields in the schema you hand it.

Once you see that, the entire prompt-engineering industry collapses. You don’t write clever system messages. You design the type signature. You order the fields. You put the boolean commitments first, before the model can reach the content fields where hallucination lives.

Reasoning pipes do exactly this. They’re what makes TPipe deterministic in production.

A boolean that fixes a class of hallucinations

Look at this Kotlin data class. It’s LegendAnalysis, one of the types used by TPipe’s SemanticDecompressionResponse, from src/main/kotlin/Structs/ModelReasoning.kt, line 454:

/**
 * @param doesLegendExist Top labeled boolean used to force the llm to predict against whether there is or
 * is not a legend present at all. This is required because without this smaller models tend to just hallucinate
 * values to fulfil the desire to have non-empty values in the json output. But by forcing it to acknowledge that
 * no legend exists when it's empty, it should prevent the hallucinated values from appearing.
 */
@kotlinx.serialization.Serializable
data class LegendAnalysis(
    var doesLegendExist: Boolean = false,
    var codesFound: List<String> = listOf(),
    var mappings: List<String> = listOf()
)

The comment is the documentation. The boolean is first. The lists come after. This is not a coincidence.

Here’s what happens without the boolean. A smaller model receives a compressed prompt with no legend block. The model has been told to fill in a JSON response. The model wants to fill in the response. The model is going to invent a legend because the alternative is to admit it has nothing to say, and admitting that is statistically unlikely given the prompt context. The model hallucinates codesFound: ["AB", "AC", "BA"] and mappings: ["AB: My Company", "AC: The Product", "BA: The Customer"]. The output is now wrong in a way that downstream code cannot detect.

Here’s what happens with the boolean. The model receives the same prompt. The first field it has to fill is doesLegendExist. The model commits: doesLegendExist: false. By the time the model gets to codesFound, the prediction is constrained. The model has already said no. The list stays empty. The output is correct.

One boolean. One structural position. One class of bugs eliminated across every model that runs this reasoning method.
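The boolean also gives downstream code something to check. Without it, an invented legend is indistinguishable from a real one; with it, a response that says "no legend" and still carries codes contradicts itself. A minimal sketch of that guard, assuming a hypothetical helper named isConsistent (it is not part of TPipe) and a stdlib-only copy of the data class:

```kotlin
// Stdlib-only copy of the article's LegendAnalysis (serialization annotation omitted).
data class LegendAnalysis(
    var doesLegendExist: Boolean = false,
    var codesFound: List<String> = listOf(),
    var mappings: List<String> = listOf()
)

// Hypothetical helper, not TPipe's API: a "no legend" commitment is only
// coherent when both lists are empty. A "legend exists" commitment is
// accepted as-is, because this check cannot judge the content itself.
fun LegendAnalysis.isConsistent(): Boolean =
    doesLegendExist || (codesFound.isEmpty() && mappings.isEmpty())
```

A caller could reject or retry any response where isConsistent() returns false, instead of passing a self-contradicting legend downstream.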

The same trick is used everywhere in TPipe. The default for boolean fields in the JSON output instructions is false. The default for integer fields is 0. The default for lists is empty. The model is told in the system prompt: “Never use null as a value - instead provide appropriate default values: empty strings for text fields, empty arrays for lists, empty objects for nested structures, 0 for numbers, and false for booleans.”

That single instruction is doing the work of a thousand prompt-engineering blog posts.

How the JSON schema gets into the prompt

The reasoning pipe’s response data class is converted to a JSON Schema via reflection, then injected into the system prompt along with the explicit output rules. From src/main/kotlin/Pipe/Pipe.kt, line 1944:

val defaultJsonOutput = """\n\n You must return your output only in Json format. You may not generate
    |any text that is not in json format. You may not generate any text before, or after the json output. 
    |All variables in the json output must have valid values that match their declared types. 
    |Never use null as a value - instead provide appropriate default values: empty strings for text fields, 
    |empty arrays for lists, empty objects for nested structures, 0 for numbers, and false for booleans. 
    |
    |The json output schema is as follows: ${jsonOutput}
    |
    |You must only return json that matches the variable types, and names in this schema exactly. Do not include any text above or below the json.
    |Do not change the name of any json variables. Do not change the name of the json object itself either.
    |CRITICAL: Every field must contain a valid value of the correct type - never use null values.
""".trimMargin()

The model sees the schema. The model sees the rules. The model is told what defaults to use when a value is empty. The model has no choice but to comply or fail the deserialization check on the way out.
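The injection step itself is plain string assembly. A minimal sketch, assuming a hypothetical function named withJsonOutputRules and an abbreviated version of the rules (TPipe's real wording is the block quoted above):

```kotlin
// Hypothetical helper, not TPipe's API: appends abbreviated output rules
// and a JSON schema to an existing system prompt.
fun withJsonOutputRules(systemPrompt: String, jsonSchema: String): String {
    // trimMargin() strips the leading whitespace and the '|' prefix from each
    // line, and drops the blank first and last lines of the raw string.
    val rules = """
        |You must return your output only in JSON format.
        |Never use null as a value. Use empty strings, empty arrays, empty objects, 0, and false instead.
        |
        |The JSON output schema is as follows: $jsonSchema
    """.trimMargin()
    return systemPrompt + "\n\n" + rules
}
```

Keeping the rules in one function means every pipe gets the same wording, and a change to the rules is a single diff.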

TPipe’s JsonSchemaGenerator produces Draft 2020-12 compliant schemas from any Kotlin serializable class, with nullable types, sealed classes, polymorphic types, and circular references all handled. The schema is the contract, the LLM is the runtime that fills it in, and the deserializer is the type checker that fails the call if the shape is wrong.
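To make the order-preservation point concrete, here is a deliberately tiny stand-in for a schema generator. It is not JsonSchemaGenerator and handles none of the hard cases listed above. FieldSpec and renderSchema are hypothetical names, and the ordered list stands in for the fields a real generator would read from the class:

```kotlin
// Hypothetical description of one field: its name and its JSON type.
data class FieldSpec(val name: String, val jsonType: String)

// Renders a flat object schema. Properties are written in list order, so
// the order of the input list is the order the model reads in the prompt.
fun renderSchema(fields: List<FieldSpec>): String {
    val properties = fields.joinToString(",") { "\"${it.name}\":{\"type\":\"${it.jsonType}\"}" }
    val required = fields.joinToString(",") { "\"${it.name}\"" }
    return "{\"type\":\"object\",\"properties\":{$properties},\"required\":[$required]}"
}
```

Passing the boolean first puts it first in the rendered schema. Moving it to the end of the list moves it to the end of the schema, which is exactly the change the field-order argument says to avoid.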

The reasoning pipe’s output is then passed to unravel(), a deterministic function on the data class that flattens the structured JSON back into a thought stream the parent pipe consumes. For SemanticDecompressionResponse, the unravel emits the legend mappings, the content identification reasoning, the task identification, the key data points, the restored sentences, the decompression strategy, and the final restored content. The parent pipe receives a coherent narrative, but that narrative was constructed from structured fields in a specific order. The parent pipe never sees hallucinated garbage because the structure prevented the garbage from being generated.
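The flattening step can be pictured with a reduced example. MiniDecompression and its three fields are hypothetical stand-ins chosen for illustration. The real SemanticDecompressionResponse has more fields, and TPipe's actual unravel implementation is not shown in this post:

```kotlin
// Hypothetical, reduced stand-in for a reasoning response.
data class MiniDecompression(
    val doesLegendExist: Boolean = false,
    val mappings: List<String> = listOf(),
    val restoredContent: String = ""
)

// Flattens the structured fields into a thought stream, in declaration
// order. Empty fields are skipped, and the legend is only emitted when the
// boolean commitment says one exists.
fun MiniDecompression.unravel(): String = buildString {
    if (doesLegendExist && mappings.isNotEmpty()) {
        appendLine("Legend mappings: " + mappings.joinToString("; "))
    }
    if (restoredContent.isNotBlank()) {
        appendLine("Restored content: $restoredContent")
    }
}.trimEnd()
```

Because the function is pure, the same structured response always produces the same thought stream, which is what makes it testable without a model in the loop.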

The reasoning methods are all railroads

Every reasoning method in TPipe is a different data class with a different field order. The field order is the reasoning order. The model cannot skip steps because the steps are fields in the schema.

StructuredCot — analyze, decompose, execute, synthesize:

data class StructuredCot(
    var componentIdentification: ComponentIdentification = ComponentIdentification(),
    var solutionDecomposition: SolutionDecomposition = SolutionDecomposition(),
    var systematicExecution: SystemicExecution = SystemicExecution(),
    var reasoningSynthesis: ProcessFocusedResult = ProcessFocusedResult()
)

The model commits to identifying the components before decomposing the solution. It decomposes before executing. It executes before synthesizing. You cannot make the model skip the decomposition step, because the schema requires that field and every later field is generated with the earlier ones already in context. The schema is the program.

MethodActorResponse (role play) — the character commits first, the problem comes second:

data class MethodActorResponse(
    var characterProfile: CharacterPerspective = CharacterPerspective(),
    var problemView: CharacterAnalysis = CharacterAnalysis(),
    var inCharacterThinking: CharacterReasoning = CharacterReasoning(),
    var characterSolution: CharacterSolution = CharacterSolution(),
    var signatureStyle: String = ""
)

The model fills in the character’s background, expertise, worldview, and terminology before it writes a word about the problem. The character is locked in. The problem interpretation is filtered through the character. The reasoning is in-character. The solution is in-character. The signature style is the closing flourish. You cannot skip the character setup because the problem view requires the character profile as context.

MultiPhasePlan (comprehensive planning) — constraints come first, phases come after:

data class MultiPhasePlan(
    var analysis: TaskAnalysis = TaskAnalysis(),
    var phases: List<Phase> = listOf(),
    var howToMeasureSuccess: List<String> = listOf(),
    var totalDuration: String = ""
)

The TaskAnalysis is filled with what needs solving, the limitations, the desired outcomes, and the important factors. The model cannot propose phases without first acknowledging the constraints. The phases come with their own risks, mitigations, and backup plans. A happy-path plan is structurally impossible — the schema forces the model to enumerate the risks at every phase.

ChainOfDraftResponse — the 5-word ceiling is structural:

data class ChainOfDraftResponse(
    var problemAnalysis: String = "",           // Brief problem statement (5 words max)
    var draftSteps: List<DraftStep> = listOf(), // Constrained reasoning steps
    var finalCalculation: String = "",          // Final operation (5 words max)
    var answer: String = ""                     // Final answer
)

The DraftStep data class has a draftContent field that the prompt instructs the model to keep to 5 words at most. The model cannot generate verbose reasoning in this method. The structural constraint lives in the data class. The prompt reinforces it. The result is a 75% token reduction compared to standard Chain-of-Thought and a 78% latency reduction in production. Those gains come from the schema, not from prompt wording.
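Because each step lands in its own field, the ceiling can also be checked after the fact. A minimal sketch, assuming a reduced DraftStep with only the draftContent field and a hypothetical helper named withinWordLimit (neither the helper nor this check is claimed to exist in TPipe):

```kotlin
// Reduced stand-in for the article's DraftStep; only draftContent is modeled.
data class DraftStep(val draftContent: String = "")

// Hypothetical helper: counts whitespace-separated words and compares the
// count against the ceiling. An empty or blank step counts as zero words.
fun DraftStep.withinWordLimit(maxWords: Int = 5): Boolean =
    draftContent.trim().split(Regex("\\s+")).filter { it.isNotEmpty() }.size <= maxWords
```

A pipe could use a check like this to flag or reject steps that ignore the limit.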

This is what “stop prompting, start programming” means

The phrase is the thesis. The LLM is a function. The data class is its type signature. The schema is its API contract. The unravel function is its return-value transformer. You stop negotiating with the model. You wire it up like any other component.

When you build with TPipe, you do not write a 500-line system prompt and pray. You define a data class. You mark it @Serializable. You pass it to setJsonOutput(). TPipe generates the schema, injects it into the prompt, and the LLM fills it in. The deserializer checks the output. If the output does not match the schema, the pipe fails loudly. If the output matches, the unravel function produces a coherent thought stream the parent pipe can consume.

val reasoningSettings = ReasoningSettings(
    reasoningMethod = ReasoningMethod.SemanticDecompression,
    depth = ReasoningDepth.Med,
    duration = ReasoningDuration.Med,
    reasoningInjector = ReasoningInjector.SystemPrompt
)

val reasoningPipe = reasonWithBedrock(bedrockConfig, reasoningSettings, pipeSettings)

val mainPipe = BedrockPipe()
    .setSystemPrompt("Solve the user's problem.")
    .setReasoningPipe(reasoningPipe)
    .setTokenBudget(TokenBudgetSettings(reasoningBudget = 2000))

The reasoning pipe is a function call whose output type is SemanticDecompressionResponse. The LLM is the compiler for that type. You can log it, diff it across model versions, write a unit test that pins the field order, and swap the model underneath without breaking the contract.

Why this scales when prompts do not

Three reasons.

Prompts drift across model versions. When the model provider ships a new version, your carefully tuned prompt produces different output. The schema does not. The data class is the contract. The contract holds across model versions because the schema forces the model into the same shape regardless of which model fills it in.

Prompts are opaque to logs and traces. When something goes wrong, you cannot diff a prompt. You can diff a schema. You can see which field is wrong, which field is missing, which field is hallucinated. The structured output is observable. The prompt is a black box.

Prompts cannot be tested without running the model. You cannot unit-test a prompt. You can unit-test a data class. You can write a test that asserts the unravel function produces the right thought stream for a given input. You can write a test that asserts the JSON output deserializes correctly. You can write a test that pins the field order of the data class so a refactor does not break the contract. None of this is possible with prompts.
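The field-order test can be sketched without any library. This version leans on the documented toString() format of Kotlin data classes, which lists properties in constructor order. fieldOrder and legendOrderIsPinned are hypothetical names, and in a real project a serialization descriptor or reflection would likely be a more direct source of the same information:

```kotlin
// Stdlib-only copy of the article's LegendAnalysis (serialization annotation omitted).
data class LegendAnalysis(
    var doesLegendExist: Boolean = false,
    var codesFound: List<String> = listOf(),
    var mappings: List<String> = listOf()
)

// Hypothetical helper: reads property names, in order, from the data class
// toString() output, which has the form "Name(prop1=value1, prop2=value2)".
// Only safe on instances whose values contain no '=' characters, such as
// the all-defaults instance used below.
fun fieldOrder(defaultInstance: Any): List<String> =
    Regex("(\\w+)=").findAll(defaultInstance.toString()).map { it.groupValues[1] }.toList()

// The pinned contract: a refactor that reorders the fields makes this false.
fun legendOrderIsPinned(): Boolean =
    fieldOrder(LegendAnalysis()) == listOf("doesLegendExist", "codesFound", "mappings")
```

Asserting legendOrderIsPinned() in the test suite turns a silent reordering into a failing build.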

TPipe has been running in production at TTT for over 18 months. Autogenesis processes hundreds of millions of tokens with zero drift failures. The judge is unjailbreakable. Players screenshot bad calls and there is no operator to fix mistakes, by design. None of that works without the structural determinism the reasoning pipes provide.

How to use this in your own LLM work

The pattern is portable. You do not need TPipe to use it. You need a JSON schema generator, a system prompt injection mechanism, and the discipline to put your boolean commitments first.

In Python, the equivalent is a Pydantic model with Field(default=False) for the boolean and Field(default_factory=list) for the lists. Generate the JSON schema with model_json_schema(). Inject the schema into your system prompt. Append the same output rules TPipe uses: never use null, use empty defaults, return only JSON, the schema is the contract.

In TypeScript, the equivalent is a Zod schema. Generate the JSON schema with zod-to-json-schema. Inject it. Same rules.

The trick is the field order. Pydantic models and Zod schemas preserve declaration order in the generated JSON schema. Put your booleans first. Put your integer commitments first. Put your precondition fields before the content fields. Test the order explicitly — move the boolean to the bottom, watch the hallucinations return, move it back to the top, watch them disappear. Document this for your team. Add a test that pins the order.

The LLM is a compiler you cannot read. You can still write code that compiles deterministically against it. The data class is the program. The schema is the contract. The order of fields is the order of commitments. Everything else is prompt theater.

Where reasoning pipes lead

Reasoning pipes are one of TPipe’s intervention mechanisms, sitting alongside Developer-in-the-Loop pipes as a separate subsystem. The KillSwitch propagates through the call chain as an uncaught exception. The ContextBank persists memory across sessions. The DistributionGrid coordinates across nodes. The reasoning pipes are the layer that turns LLM output into typed, structured, testable data.

The next post in this series is “The KillSwitch: Token Budgets That Actually Kill the Agent”. It covers how forced termination bypasses retry policies, how it propagates through the call chain, and why KillSwitch is termination architecture, not a feature. The third covers the Autogenesis deployment: billions of tokens (TTT lost count of the total somewhere along the way), zero human intervention, and a judge that cannot be jailbroken because the reasoning pipe structure prevents the failure modes other systems are vulnerable to. The fourth covers migrating from LangChain, where the contrast between “type your way into determinism” and “prompt your way into determinism” makes the architectural argument concrete.

The choice is between treating the model as a partner you negotiate with and treating it as a component you wire up. TPipe treats it as a component. Billions of tokens, eighteen months of production, and a team that expected to see the structural failures by now and still hasn’t.