Fine-Tuning

Fine-tune and deploy, end to end

Turn your data into a custom adapter without managing any infrastructure. Upload, train, evaluate, and deploy — billed by the minute for the GPU time you actually use.

How it works

From dataset to deployed adapter

A guided pipeline with a review step you control before every training run.

  1. Upload

    Add a JSONL dataset of examples.

  2. Validate

    Automatic format and length checks.

  3. Synthesize

    Optional teacher-generated examples.

  4. Train

    LoRA on your chosen base model.

  5. Evaluate

    Perplexity and exact-match metrics.

  6. Deploy

    Serve the adapter behind an endpoint.
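
To check a dataset locally before uploading, the format and length checks in the Validate step could be approximated with a sketch like the one below. The field names `prompt` and `completion` and the 4,096-character limit are assumptions for illustration only; the actual schema and limits are not specified here.

```python
import json

# Hypothetical schema and limit, chosen for illustration only.
REQUIRED_FIELDS = ("prompt", "completion")
MAX_CHARS = 4096


def validate_jsonl(text, required=REQUIRED_FIELDS, max_chars=MAX_CHARS):
    """Return a list of (line_number, message) problems found in JSONL text."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((lineno, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((lineno, "expected a JSON object"))
            continue
        for field in required:
            value = record.get(field)
            if not isinstance(value, str) or not value:
                problems.append((lineno, f"missing or empty field '{field}'"))
            elif len(value) > max_chars:
                problems.append(
                    (lineno, f"field '{field}' exceeds {max_chars} characters")
                )
    return problems
```

An empty result means every non-blank line parsed as a JSON object with the expected fields; otherwise each entry points to the offending line.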

What you get

Custom models, no ops

Everything needed to specialize an open model and serve it in production.

LoRA fine-tuning

Configure rank and alpha, and train adapters on base models like Qwen 2.5 7B, Qwen 3.5 35B-A3B, Llama 3.1 8B, and Gemma 4 31B.
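
For a sense of what rank and alpha control: in the standard LoRA formulation, each adapted weight matrix of shape d_out x d_in stays frozen and gains two small trainable matrices, adding rank * (d_in + d_out) parameters, with the low-rank update scaled by alpha / rank. The sketch below works through that arithmetic. The 4096 x 4096 layer size and the rank and alpha values are illustrative and are not settings of any model listed above.

```python
def lora_param_count(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns A (rank x d_in) and B (d_out x rank) and leaves the
    original matrix frozen.
    """
    return rank * (d_in + d_out)


def lora_scale(alpha, rank):
    """Factor applied to the low-rank update B @ A before it is added to W."""
    return alpha / rank


# Illustrative numbers only: one 4096 x 4096 projection, rank 16, alpha 32.
full = 4096 * 4096
added = lora_param_count(4096, 4096, rank=16)
print(f"frozen weights: {full:,}")
print(f"trainable LoRA: {added:,} ({added / full:.2%} of the layer)")
print(f"update scale:   {lora_scale(alpha=32, rank=16)}")
```

A higher rank gives the adapter more capacity at the cost of more trainable parameters; alpha adjusts how strongly the learned update is applied relative to the frozen weights.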

Bring your own data — or synthesize it

Upload a JSONL dataset, or generate training examples from an open teacher model (DeepSeek-R1, Qwen3, GLM-4.5, Kimi K2) and review them before training.

Pay per minute

Billed for actual GPU training time. Validation or generation steps that fail are not charged.
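
As a back-of-the-envelope sketch of per-minute billing: the rate used below is a made-up placeholder, and rounding a partial minute up to a whole minute is an assumption, since the rounding rule is not stated here.

```python
import math


def training_cost_cents(gpu_seconds, rate_cents_per_minute):
    """Cost in cents for one training run.

    Assumption: any partial minute is rounded up to a whole minute.
    """
    if gpu_seconds < 0:
        raise ValueError("gpu_seconds must be non-negative")
    billed_minutes = math.ceil(gpu_seconds / 60)
    return billed_minutes * rate_cents_per_minute


# Hypothetical rate of 5 cents per GPU-minute; a 125-second run bills 3 minutes.
print(training_cost_cents(125, 5))  # 15
```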

Multi-LoRA serving

Deploy many adapters on a single inference instance without restarts.
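
Conceptually, multi-LoRA serving keeps one copy of the base model in memory and treats adapters as small, swappable add-ons selected per request. The toy class below illustrates that idea only; `MultiLoraServer` and its methods are hypothetical names, not this product's API, and the adapter weights are treated as opaque objects.

```python
class MultiLoraServer:
    """Toy model of one inference instance hosting many adapters."""

    def __init__(self, base_model):
        self.base_model = base_model  # loaded once, shared by all adapters
        self.adapters = {}            # adapter name -> adapter weights

    def load_adapter(self, name, weights):
        # Registering an adapter only updates the dict; the base model
        # stays loaded, so no restart is involved.
        self.adapters[name] = weights

    def unload_adapter(self, name):
        self.adapters.pop(name, None)

    def generate(self, adapter_name, prompt):
        if adapter_name not in self.adapters:
            raise KeyError(f"adapter '{adapter_name}' is not loaded")
        # A real server would apply the adapter's low-rank update during
        # the forward pass; this stub just reports the routing decision.
        return f"[{self.base_model} + {adapter_name}] {prompt}"


server = MultiLoraServer("base-model")
server.load_adapter("support-bot", weights=object())
server.load_adapter("sql-helper", weights=object())
print(server.generate("sql-helper", "List all users"))
```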

BYOK with KMS encryption

Bring your own teacher API key — it's sealed with KMS envelope encryption and never stored in plaintext.

Evaluation metrics

Get perplexity and exact-match metrics for every run.
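
For readers unfamiliar with the two metrics, here is a minimal sketch of how they are commonly defined: perplexity is the exponential of the mean negative log-likelihood per token, and exact match is the fraction of predictions identical to their references. Stripping surrounding whitespace before comparing is an assumption made here; the normalization the platform applies is not specified.

```python
import math


def perplexity(token_logprobs):
    """exp of the mean negative log-likelihood (natural log) per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def exact_match(predictions, references):
    """Fraction of predictions equal to their reference after strip()."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references differ in length")
    if not predictions:
        raise ValueError("need at least one example")
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(predictions)
```

Lower perplexity means the model assigns higher probability to the held-out text; exact match is stricter and suits tasks with a single correct answer, such as classification labels or structured outputs.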

Train your first adapter

Upload a dataset and deploy a fine-tuned model — pay only for the training time you use.