
How to make an LLM output valid DSL every time (without fine-tuning)

the short answer

To make an LLM output syntactically valid DSL every time, compile your DSL's grammar into a decoding constraint so the model can only emit tokens that stay inside the language, then verify the result with a deterministic parser instead of trusting the model. That is what dslai does from a grammar you paste in, with no fine-tuning required.

Prompting an LLM to 'only output valid <your language>' gets you most of the way and then fails at the worst time — a stray token, a missing delimiter, a keyword that doesn't exist — because a prompt is a request, not a rule. If your DSL feeds a parser downstream, 'mostly valid' means 'occasionally broken in production'.

The reliable fix is to stop hoping and start constraining. Here's the recipe dslai follows, and how to read what it gives back.

0 training examples needed: the grammar is enough
dslai · playground
# your dsl grammar
rule = "alert " metric cmp number win? action ;
cmp = ">" | "<" | ">=" ;
number = digit+ unit? ;
✓ grammar parses
guaranteed-valid output
alert cpu > 90% for 5m page on
alert mem >= 8gb notify sre
alert p99 > 250ms scale web
validate input
✗ invalid at pos 14, expected "%"

where this happens in the app

dslai compiles your grammar into a decoding constraint — generated output is syntactically valid by construction, and a deterministic validator pinpoints anything that isn't.

  1. Paste your DSL's grammar: the same rules you'd write for a parser, no training set.
  2. Every generated line is sampled inside the grammar, so it always parses (the ✓ marks).
  3. The validator returns the exact position and expected token on invalid input.

Why prompting alone isn't enough

A language model samples tokens from a probability distribution. Even a model that has seen your DSL can assign nonzero probability to a token that breaks it, and over enough generations that token eventually gets sampled. No amount of prompt wording removes the possibility, because the wrong tokens are still on the menu.

The structural fix is to take them off the menu. If the grammar says only certain tokens are legal next, mask the rest before sampling. Now invalid output isn't unlikely — it's unreachable.
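
As a minimal sketch of what "taking tokens off the menu" can look like, the Python below masks a toy set of scores before sampling. The vocabulary, the scores, and the `masked_sample` helper are illustrative assumptions, not dslai's implementation; only the three `cmp` alternatives come from the playground grammar.

```python
import math
import random

def masked_sample(logits, allowed, rng):
    """Sample one token after removing every token the grammar forbids.

    logits:  dict of token -> raw model score (toy values, assumed)
    allowed: set of tokens the grammar permits at this position
    """
    # Dropping a token here is equivalent to setting its logit to -inf.
    legal = {tok: score for tok, score in logits.items() if tok in allowed}
    if not legal:
        raise ValueError("grammar allows no token from this vocabulary")

    # Softmax over the legal tokens only (subtract the max for stability).
    top = max(legal.values())
    weights = {tok: math.exp(score - top) for tok, score in legal.items()}
    total = sum(weights.values())
    probs = {tok: w / total for tok, w in weights.items()}

    tokens = list(probs)
    choice = rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]
    return choice, probs

# Toy scores: the model's top choice is "==", which this DSL does not define.
logits = {">": 2.0, "<": 1.0, ">=": 0.5, "==": 2.2, "=>": 0.1}
allowed = {">", "<", ">="}  # from: cmp = ">" | "<" | ">=" ;

rng = random.Random(0)
samples = [masked_sample(logits, allowed, rng)[0] for _ in range(1000)]
print(sorted(set(samples)))  # only tokens from `allowed` can appear
```

Because the illegal tokens never enter the softmax, their probability is exactly zero rather than merely small, which is the difference between unlikely and unreachable.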

Verify with a parser, not the model

Generation handles 'produce valid DSL'; you still want to confirm 'is this string valid DSL', and that's a parser's job, not a model's. dslai generates a deterministic validator from the same grammar, so checking is exact and reproducible: it returns valid, or invalid at a specific position with what it expected there.

That separation matters: an LLM asked to grade its own output can be confidently wrong, while a parser applies the grammar mechanically and returns the same verdict for the same input every time. You get a compiler-style verdict you can act on.
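
To make the "position plus expected token" verdict concrete, here is a small hand-written validator for a cut-down version of the playground rule (`alert`, a metric, a comparator, a number). It is a sketch under stated assumptions: the `METRICS` and `UNITS` lists are invented because the grammar shown does not define them, single spaces separate the parts, the `win?` and `action` parts are omitted, and the `Verdict` shape is hypothetical rather than dslai's actual output format.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed vocabularies: the source grammar leaves `metric` and `unit` undefined.
METRICS = ("cpu", "mem", "p99")
UNITS = ("%", "gb", "ms")
COMPARATORS = (">=", ">", "<")  # longest first, so ">=" is not read as ">"
DIGITS = "0123456789"

@dataclass(frozen=True)
class Verdict:
    valid: bool
    pos: Optional[int] = None       # index of the first character that cannot be accepted
    expected: Tuple[str, ...] = ()  # what the grammar would accept at that index

def match_any(text, pos, options):
    """Return the first option found in `text` at `pos`, or None."""
    for option in options:
        if text.startswith(option, pos):
            return option
    return None

def validate(text):
    """Check `alert <metric> <cmp> <digit+><unit?>` and report the first error."""
    pos = 0
    for options in (("alert ",), METRICS, (" ",), COMPARATORS, (" ",)):
        hit = match_any(text, pos, options)
        if hit is None:
            return Verdict(False, pos, tuple(options))
        pos += len(hit)

    # number = digit+ unit? ;
    start = pos
    while pos < len(text) and text[pos] in DIGITS:
        pos += 1
    if pos == start:
        return Verdict(False, pos, ("digit",))
    unit = match_any(text, pos, UNITS)
    if unit is not None:
        pos += len(unit)

    if pos != len(text):
        # Trailing input: report what could legally have come next.
        expected = ("end of input",) if unit else ("digit", *UNITS, "end of input")
        return Verdict(False, pos, expected)
    return Verdict(True)

print(validate("alert cpu > 90%"))  # valid
print(validate("alert cpu > 90x"))  # invalid at pos 14; "%" is among the expected tokens
```

The function has no randomness and no model call, so the same string always yields the same verdict, which is what makes it usable as a gate in front of a downstream parser.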

how it works

  1. bring your grammar

    Paste your DSL's EBNF/GBNF-style grammar into dslai; no training set is needed.

  2. compile the constraint

    dslai turns the grammar into a decoding mask that keeps output inside the language.

  3. generate

    Produce snippets that are syntactically valid by construction.

  4. validate

    Run any input through the deterministic parser to confirm it, and see exactly where it breaks if not.
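
The four steps above can be tied together in a toy end-to-end sketch. Everything here is an assumption made for illustration: the grammar is reduced to a fixed sequence of token slots, `toy_score` stands in for a real model, and `compile_constraint`, `generate`, and `validate` are hypothetical names, not dslai functions. A real grammar needs a parser state rather than a slot index, but the flow is the same.

```python
import random

# Step 1: the grammar as data. Each slot lists the tokens allowed at that position.
GRAMMAR = [
    ["alert"],
    ["cpu", "mem", "p99"],
    [">", "<", ">="],
    ["90%", "8gb", "250ms"],
    ["page", "notify", "scale"],
]
EOS = "<EOS>"

def compile_constraint(grammar):
    """Step 2: build a function mapping the tokens so far to the legal next tokens."""
    def allowed(prefix):
        if len(prefix) == len(grammar):
            return {EOS}  # the rule is complete, so only stopping is legal
        return set(grammar[len(prefix)])
    return allowed

def toy_score(prefix, token):
    """Stand-in for a model: any positive weight per candidate token."""
    return 1.0 + len(token)

def generate(allowed, score, rng):
    """Step 3: sample token by token, choosing only among allowed tokens."""
    out = []
    while True:
        legal = sorted(allowed(out))
        weights = [score(out, tok) for tok in legal]
        tok = rng.choices(legal, weights=weights, k=1)[0]
        if tok == EOS:
            return out
        out.append(tok)

def validate(grammar, tokens):
    """Step 4: deterministic check; returns (True, None) or (False, position)."""
    for i, slot in enumerate(grammar):
        if i >= len(tokens) or tokens[i] not in slot:
            return False, i
    if len(tokens) > len(grammar):
        return False, len(grammar)
    return True, None

allowed = compile_constraint(GRAMMAR)
rng = random.Random(7)
lines = [generate(allowed, toy_score, rng) for _ in range(200)]
print(" ".join(lines[0]), validate(GRAMMAR, lines[0]))
print(validate(GRAMMAR, ["alert", "cpu", "==", "90%", "page"]))  # fails at token 2
```

Both the generator and the validator read the same `GRAMMAR` object, so the two cannot drift apart: whatever the constraint can emit, the validator accepts.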

frequently asked

Will this work for my custom language specifically?
If you can express it as a grammar — which you can if it has a parser or a spec — dslai can constrain generation to it and validate against it. The playground lets you paste your grammar and try it immediately.
Does constraining hurt the model's quality?
It only removes syntactically illegal options; the model still chooses freely among valid ones. For semantic quality on top, you can add retrieval over your docs or, for hard cases, fine-tune a small model.
What if my grammar is ambiguous?
Constrained generation still only produces strings in the language. dslai's validator handles ambiguity by exploring the possible parses and accepts the input if any one of them consumes it.
Is there an API I can call from my pipeline?
A hosted API and a CI validation check are on the roadmap as paid tiers. The free playground today runs the same engine in your browser so you can prove the approach first.
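
To illustrate the answer on ambiguous grammars, the sketch below tracks every position a rule could end at, instead of committing to the first alternative that matches. It assumes that "consumes it" means a parse reaches the end of the input, and it only handles grammars without left recursion; the toy grammar and the `ends` and `accepts` helpers are hypothetical, not dslai's validator.

```python
def ends(grammar, symbol, text, start):
    """Return every index where `symbol` can finish if it starts at `start`."""
    if symbol not in grammar:  # terminal: a literal string
        return {start + len(symbol)} if text.startswith(symbol, start) else set()
    results = set()
    for alternative in grammar[symbol]:  # try every alternative, not just the first
        positions = {start}
        for part in alternative:
            positions = {e for p in positions for e in ends(grammar, part, text, p)}
        results |= positions
    return results

def accepts(grammar, text, start_symbol="S"):
    """Accept if at least one parse of `start_symbol` consumes the whole text."""
    return len(text) in ends(grammar, start_symbol, text, 0)

# Ambiguous on purpose: "aaa" parses as a + aa and as aa + a.
GRAMMAR = {
    "S": [["A", "A"]],
    "A": [["a"], ["a", "a"]],
}

for sample in ["a", "aa", "aaa", "aaaa", "aaaaa"]:
    print(sample, accepts(GRAMMAR, sample))
```

A parser that always took the first alternative for `A` would read "aaa" as a + a and then reject the leftover character; keeping the full set of end positions is what lets the second parse succeed.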

Last updated June 7, 2026
