Fine-Tuning

Fine-tune and deploy, end to end

Turn your data into a custom adapter without managing any infrastructure. Upload, train, evaluate, and deploy — billed by the minute for the GPU time you actually use.

How it works

From dataset to deployed adapter

A guided pipeline with a review step you control before every training run.

1. Upload: Add a JSONL dataset of examples.
2. Validate: Automatic format and length checks.
3. Synthesize: Optional teacher-generated examples.
4. Train: LoRA on your chosen base model.
5. Evaluate: Perplexity and exact-match metrics.
6. Deploy: Serve the adapter behind an endpoint.
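
To give a feel for the Validate step, here is a minimal sketch of a format and length check you could run locally before uploading. It is not the service's validator: the `prompt` and `completion` field names, the character-based limit, and the `validate_jsonl` helper are all assumptions made for illustration.

```python
import json

# Hypothetical schema and limit, chosen for illustration only.
REQUIRED_FIELDS = ("prompt", "completion")
MAX_CHARS = 8000


def validate_jsonl(text, max_chars=MAX_CHARS):
    """Return a list of (line_number, message) problems found in JSONL text."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((lineno, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((lineno, "record is not a JSON object"))
            continue
        # Format check: every required field must be present and a string.
        missing = [f for f in REQUIRED_FIELDS if not isinstance(record.get(f), str)]
        if missing:
            problems.append((lineno, f"missing or non-string fields: {missing}"))
            continue
        # Length check: counts characters here; a real check may count tokens.
        length = sum(len(record[f]) for f in REQUIRED_FIELDS)
        if length > max_chars:
            problems.append((lineno, f"example too long: {length} > {max_chars} chars"))
    return problems
```

An empty result means every line parsed and passed both checks; otherwise each entry points at the offending line number.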

What you get

Custom models, no ops

Everything needed to specialize an open model and serve it in production.

LoRA fine-tuning

Configure rank and alpha, and train adapters on base models like Qwen 2.5 7B, Qwen 3.5 35B-A3B, Llama 3.1 8B, and Gemma 4 31B.
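
For intuition about what rank and alpha control, the sketch below uses the standard LoRA formulation, in which a frozen weight matrix W is augmented as W + (alpha / rank) * B * A. Whether this service applies exactly that scaling is an assumption, and `lora_summary` is a hypothetical helper, not part of any API.

```python
def lora_summary(d_in, d_out, rank, alpha):
    """Summarize a LoRA adapter for a single d_out x d_in weight matrix.

    Assumes the standard formulation: B is d_out x rank, A is rank x d_in,
    and the low-rank update is scaled by alpha / rank.
    """
    if rank <= 0:
        raise ValueError("rank must be positive")
    full_params = d_in * d_out
    adapter_params = rank * (d_in + d_out)
    return {
        "scaling": alpha / rank,
        "adapter_params": adapter_params,
        "fraction_of_full": adapter_params / full_params,
    }


# Example: a 4096 x 4096 projection with rank 16 and alpha 32.
summary = lora_summary(d_in=4096, d_out=4096, rank=16, alpha=32)
```

In this example the adapter trains 131,072 parameters for that matrix, under 1% of the 16,777,216 in the full weight, with the update scaled by 2.0. A higher rank adds capacity and cost; alpha adjusts how strongly the update is applied.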

Bring your own data — or synthesize it

Upload a JSONL dataset, or generate training examples from an open teacher model (DeepSeek-R1, Qwen3, GLM-4.5, Kimi K2) and review them before training.

Pay per minute

Billed only for actual GPU training time. Failed validation and failed data generation aren't charged.

Multi-LoRA serving

Deploy many adapters on a single inference instance without restarts.

BYOK with KMS encryption

Bring your own teacher API key — it's sealed with KMS envelope encryption and never stored in plaintext.

Evaluation metrics

Get perplexity and exact-match metrics for every run.
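
As a reference for reading these numbers, here is a minimal sketch of how the two metrics are commonly defined. The exact computation used for your runs may differ; in particular, stripping whitespace before comparing strings is an assumption, and both function names are illustrative.

```python
import math


def perplexity(token_logprobs):
    """exp of the mean negative natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def exact_match(predictions, references):
    """Fraction of predictions equal to their reference after stripping whitespace."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("need at least one example")
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)
```

Lower perplexity means the model assigns higher probability to the held-out text; exact match is stricter and only rewards outputs that reproduce the reference string.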

Train your first adapter

Upload a dataset and deploy a fine-tuned model — pay only for the training time you use.
