Open source · TraceML for PyTorch

Find what's slowing
your training run,
while it's happening.

One context manager. Works in 3 minutes.

Dataloader stalling your GPU? You will see it flagged live, not after the run finishes.
DDP straggler slowing your ranks? TraceML shows which rank, how much gap, and whether it is the dataloader or compute.
Step time drifting? Visible before the run finishes so you can stop it, not wait for it to crash.
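The drift check can be illustrated with a minimal sketch in plain Python. This is not TraceML's actual logic, and the function name and thresholds here are ours: compare the median of recent step times against a baseline window and flag when it creeps past a tolerance.

```python
from statistics import median

def step_time_drift(step_times, baseline_n=50, recent_n=50, threshold=1.15):
    """Hypothetical helper, for illustration only: flag drift when the
    recent median step time exceeds the baseline median by more than
    `threshold` (here, 15%)."""
    if len(step_times) < baseline_n + recent_n:
        return False  # not enough data yet
    baseline = median(step_times[:baseline_n])
    recent = median(step_times[-recent_n:])
    return recent > threshold * baseline

# Steady run: no drift flagged.
steady = [23.0] * 100
print(step_time_drift(steady))    # False

# Drifting run: step time creeps up over the last 50 steps.
drifting = [23.0] * 50 + [23.0 + i * 0.2 for i in range(50)]
print(step_time_drift(drifting))  # True
```

Catching this while the run is live is the point: a drift flag at step 1,000 is actionable, the same number in a post-mortem log is not.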
Live view (example):

DATALOADER STALL · step 1,240
Step time — last 100 steps: median 23.1 ms, worst 25.9 ms
  DL fetch    13.4 ms
  Forward      4.4 ms
  Backward     3.2 ms
  Optimizer    1.8 ms
DDP — 4 ranks: straggler 1.00× ✓
Memory: GPU 14.2 / 96 GB, peak 17.1 GB ↑

Simple setup

No agents. No infrastructure. Pip install and one context manager. That's all.

1. pip install traceml-ai
No system dependencies. Works where PyTorch runs.
2. Wrap your training step
One context manager around your existing loop. No other changes to your script.
3. Run your script
Live terminal view opens alongside your logs. Compact summary at run end.
train.py (only change needed)
from traceml.decorators import trace_step

for batch in dataloader:
    with trace_step(model):
        outputs = model(batch["x"])
        loss = criterion(outputs, batch["y"])
        loss.backward()
        optimizer.step()
        optimizer.zero_grad(set_to_none=True)
$ traceml run train.py
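Conceptually, a step-timing context manager is simple. A minimal sketch of the idea, not TraceML's implementation (trace_step additionally splits each step into dataloader, forward, backward, and optimizer phases):

```python
import time
from contextlib import contextmanager

@contextmanager
def timed_step(records):
    """Illustrative only: record the wall-clock duration of the
    enclosed training step, whatever happens inside it."""
    start = time.perf_counter()
    try:
        yield
    finally:
        records.append(time.perf_counter() - start)

step_times = []
for _ in range(3):
    with timed_step(step_times):
        time.sleep(0.01)  # stand-in for forward/backward/optimizer

print(len(step_times))  # 3 recorded step durations
```

Because the timer sits in a `finally`, the duration is recorded even if the step raises, which is what makes the wrap-your-loop pattern safe to leave in production scripts.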
PyTorch training loops: use trace_step(model) around your step. Single GPU and single-node DDP.
HF Trainer: replace Trainer with TraceMLTrainer. One line change.
PyTorch Lightning: add TraceMLCallback() to your trainer callbacks.

Try it or talk to us

TraceML is free and open source. If you are running regular training jobs and want a second set of eyes on what is actually slow, we are easy to reach.

We work with a small number of teams directly, looking at real run data. If that sounds useful, just email us.