What fine-tuning can and can't give you
Fine-tuning shifts a model's probabilities toward your examples. That can improve semantic quality — idiomatic structure, sensible defaults, the conventions your team uses — especially on a small model you want to self-host cheaply. What it can't give you is a syntactic guarantee: a fine-tuned model still occasionally emits output that breaks the grammar, because training changes likelihoods, not rules.
So if your problem is 'the output is sometimes invalid', fine-tuning is the wrong fix. If your problem is 'the valid output isn't idiomatic enough', fine-tuning is a candidate — on top of a constraint, not instead of one.
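The difference between shifting likelihoods and enforcing rules can be seen in a few lines. This is a minimal sketch with made-up numbers: the token names, the logits, and the idea of modelling fine-tuning as a simple logit shift are all illustrative assumptions, not measurements from any real model.

```python
import math

def softmax(logits):
    """Turn a {token: logit} dict into a {token: probability} dict."""
    m = max(logits.values())
    exps = {t: math.exp(v - m) for t, v in logits.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}

def mask(logits, allowed):
    """Hard constraint: tokens outside `allowed` get a logit of -inf."""
    return {t: (v if t in allowed else float("-inf")) for t, v in logits.items()}

# Hypothetical next-token logits. Assume "}" would break the grammar here.
base_logits = {"SELECT": 2.0, "FROM": 1.0, "}": 0.5}

# Fine-tuning, modelled (as an assumption) as a shift away from the bad token.
tuned_logits = {"SELECT": 5.0, "FROM": 3.0, "}": -2.0}

base_p = softmax(base_logits)
tuned_p = softmax(tuned_logits)
masked_p = softmax(mask(base_logits, {"SELECT", "FROM"}))

print(f"base:   P(invalid) = {base_p['}']:.4f}")
print(f"tuned:  P(invalid) = {tuned_p['}']:.4f}")   # smaller, but still above zero
print(f"masked: P(invalid) = {masked_p['}']:.4f}")  # exactly zero
```

In this toy setup the tuned distribution makes the invalid token much rarer but never impossible, while the mask removes it outright even on the untuned logits.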
The cheaper wedge to try first
Before any training, constrain decoding to your grammar so output is valid by construction, and add retrieval over your docs and examples for context. For most DSLs that combination clears the bar, adds no training cost, and needs no labelled data. dslai's playground lets you confirm it against your own grammar in the browser.
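As a rough illustration of "valid by construction", the sketch below decodes greedily against a toy grammar written as a finite-state machine. Everything in it is hypothetical: the grammar, the token names, and `toy_scores`, which stands in for a real model and deliberately prefers an invalid token. It does not show how dslai implements constraints, and it covers only the constraint, not retrieval.

```python
# Toy grammar as a finite-state machine: state -> {allowed token: next state}.
GRAMMAR = {
    "start": {"get": "noun", "set": "noun"},
    "noun": {"user": "end", "order": "end"},
    "end": {";": "done"},
}

def toy_scores(prefix):
    """Stand-in for a model: fixed scores whose top choice ("oops") is invalid."""
    return {"oops": 3.0, "get": 2.0, "set": 1.0, "user": 2.5, "order": 1.5, ";": 0.5}

def constrained_decode(score_fn, grammar, start="start", final="done"):
    """Greedy decoding that only ever considers tokens the grammar allows."""
    state, output = start, []
    while state != final:
        allowed = grammar[state]
        scores = score_fn(output)
        # Choose the best-scoring token among those legal in this state.
        token = max(allowed, key=lambda t: scores.get(t, float("-inf")))
        output.append(token)
        state = allowed[token]
    return output

result = constrained_decode(toy_scores, GRAMMAR)
print(" ".join(result))
```

An unconstrained greedy decoder would pick `oops` at every step here; the constrained loop never sees it as an option, so no amount of model preference can produce it.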
If after that you still need more semantic polish on a small self-hosted model, fine-tune, and serve it with multi-LoRA (a shared base model with an adapter swapped in per request) so that a per-customer adapter doesn't mean a per-customer warm GPU.
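The "shared base, adapter per request" idea can be sketched with a single tiny linear layer. This assumes the standard LoRA formulation, where an adapter is a low-rank pair (B, A) added to a frozen weight W so the effective weight is W + BA; the usual scaling factor is omitted for brevity. The customer names and all numbers are hypothetical, and a real serving stack would batch requests on a GPU rather than loop in Python.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

# Shared base weight (2x2), loaded once and never modified.
BASE_W = [[1.0, 0.0], [0.0, 1.0]]

# Hypothetical per-customer adapters: rank-1 pairs with B of shape 2x1, A of shape 1x2.
ADAPTERS = {
    "customer_a": {"B": [[1.0], [0.0]], "A": [[0.0, 1.0]]},
    "customer_b": {"B": [[0.0], [2.0]], "A": [[1.0, 0.0]]},
}

def forward(x, adapter_id=None):
    """Compute W x, plus B (A x) when the request names an adapter."""
    y = matvec(BASE_W, x)
    if adapter_id is not None:
        adapter = ADAPTERS[adapter_id]
        y = add(y, matvec(adapter["B"], matvec(adapter["A"], x)))
    return y

x = [1.0, 2.0]
print(forward(x))                # base model only
print(forward(x, "customer_a"))  # same base weights, customer A's adapter
print(forward(x, "customer_b"))  # same base weights, customer B's adapter
```

The base weights are held once and every request pays only for its own small adapter, which is why adding a customer does not require another copy of the model.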
Reach for fine-tuning when…
| Situation | Better tool |
|---|---|
| Output is sometimes syntactically invalid | Constrained decoding |
| You have only a grammar, no dataset | Constrained decoding + retrieval |
| Valid output isn't idiomatic on a small model | Fine-tuning (on top of the constraint) |
| You need it cheap and self-hosted | Fine-tuning + multi-LoRA serving |
Frequently asked questions
- How much data do I need to fine-tune on a DSL?
  - Usually hundreds to thousands of examples for the model to learn the patterns well, which is why most teams should constrain first and only fine-tune once they have, or can synthesise, a real dataset.
- Why not just fine-tune a frontier model?
  - Fine-tuning makes most sense for small, cheap, self-hosted open models where you control serving costs. Constraining works on any model and gives the syntactic guarantee fine-tuning can't.
- What is multi-LoRA serving and why does it matter?
  - It serves many lightweight fine-tuned adapters on one shared base model, swapping the adapter in per request, so you don't pay for a dedicated warm GPU per customer. It's what makes per-DSL fine-tuning economical.
- Can dslai fine-tune for me today?
  - Fine-tuning is a planned premium tier. The free product today is the constrained-generation and validation wedge, which is what most DSLs actually need.
Last updated June 7, 2026