Fine-tuning an AI: what it really costs
Spoiler: training is the cheapest part. The real budget is elsewhere. Honest ballpark figures, and when it's worth it.
"How much does fine-tuning cost?" — the trick question
It's the first thing people ask us. And it's the wrong one, because it assumes the cost is the training. But training, today, is often the cheapest part. The real budget hides elsewhere. Let's untangle it with numbers.
Before you read on, a useful reminder: in most cases, you don't need to fine-tune at all. We explain why in this article. What follows applies to the cases where it's genuinely warranted.
Training: the cheap part
The big shift of recent years is a technique called LoRA (and its QLoRA variant). Instead of retraining the whole model, you adjust only a small fraction of its parameters. The result: typically 4 to 10 times cheaper than full fine-tuning, while retaining 80 to 95% of the quality on most tasks.
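To make "a small fraction of its parameters" concrete, here is a minimal sketch of the arithmetic behind a LoRA adapter. LoRA freezes a weight matrix of shape d_out x d_in and trains two thin matrices of rank r in its place, so the trainable count is r x (d_out + d_in). The 4096 x 4096 layer size and the rank of 8 are illustrative assumptions, not figures from this article.

```python
def lora_trainable_fraction(d_out: int, d_in: int, rank: int) -> float:
    """Share of one weight matrix's parameters that a LoRA adapter trains."""
    full = d_out * d_in                # parameters in the frozen matrix
    adapter = rank * (d_out + d_in)    # B is d_out x rank, A is rank x d_in
    return adapter / full


# Assumed example: a 4096 x 4096 projection layer with a rank-8 adapter.
fraction = lora_trainable_fraction(4096, 4096, 8)
print(f"{fraction:.2%}")  # 0.39%
```

Under these assumptions the adapter trains well under 1% of the layer, which is where most of the savings in compute and memory come from.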
In practice, renting a GPU runs about $1.50 to $8 an hour depending on the card (A10G, A100, H100). And fine-tuning a 7-billion-parameter model with LoRA takes 1 to 2 hours on a single A100. Do the math: the raw compute bill is in the tens of dollars at most, not the thousands.
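The math is simple enough to sketch. The snippet below multiplies the hourly rates quoted above by the 1 to 2 hour run time; it is a rough bound that deliberately mixes the cheapest and priciest cards, and it ignores storage, failed runs, and repeat experiments.

```python
def compute_cost(rate_per_hour: float, hours: float) -> float:
    """Raw GPU rental cost for one training run, in dollars."""
    return rate_per_hour * hours


# Bounds taken from the figures in the text: $1.50 to $8 per hour, 1 to 2 hours.
low = compute_cost(1.5, 1)
high = compute_cost(8.0, 2)
print(f"${low:.2f} to ${high:.2f}")  # $1.50 to $16.00
```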
The gap is stark on hardware: a full fine-tune of a 7B model needs 100 to 120 GB of GPU memory, roughly $50,000 of H100 cards for a single run. The same model fine-tunes with LoRA on a single GPU, in hours. That's the whole point of LoRA: it democratizes the operation.
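The 100 to 120 GB figure is consistent with a common rule of thumb for full fine-tuning with the Adam optimizer in mixed precision: about 16 bytes of GPU memory per parameter. The sketch below assumes that setup (it may not be how the figure above was derived) and leaves out activations, which add more on top.

```python
# Assumed per-parameter memory for mixed-precision training with Adam.
BYTES_PER_PARAM = {
    "fp16 weights": 2,
    "fp16 gradients": 2,
    "fp32 master weights": 4,
    "Adam first moment (fp32)": 4,
    "Adam second moment (fp32)": 4,
}


def full_finetune_memory_gb(n_params: float) -> float:
    """Approximate GPU memory (GB) for weights, gradients, and optimizer state."""
    return n_params * sum(BYTES_PER_PARAM.values()) / 1e9


print(full_finetune_memory_gb(7e9))  # 112.0
```

With LoRA, gradients and optimizer state are only needed for the small adapter matrices, so most of that per-parameter overhead disappears for the frozen base model.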