distill — SnapFlow 1-step

tether train distill runs SnapFlow 1-step distillation: it starts from a 10-step flow-matching teacher (pi0, pi0.5, SmolVLA) and trains a 1-step student that retains, or in our reproduction exceeds, the teacher’s task success rate.

Student beats teacher on libero_object, N=50: 64% vs 56%.
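
To put the 64% vs 56% figures in context, here is a rough sanity check on how much uncertainty N=50 episodes carries. It is not part of Tether; it assumes the percentages correspond to 32 and 28 successful episodes out of 50 and uses a standard Wilson score interval (the helper name `wilson_interval` is ours).

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p = successes / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half, center + half


# Assumption: 64% and 56% of N=50 episodes mean 32 and 28 successes.
student_lo, student_hi = wilson_interval(32, 50)
teacher_lo, teacher_hi = wilson_interval(28, 50)

# If the two intervals overlap, the gap is not clearly separated from noise.
intervals_overlap = student_lo <= teacher_hi and teacher_lo <= student_hi

print(f"student: [{student_lo:.3f}, {student_hi:.3f}]")
print(f"teacher: [{teacher_lo:.3f}, {teacher_hi:.3f}]")
print(f"overlap: {intervals_overlap}")
```

The intervals overlap at this sample size, which is why the result is best read as "matches or beats the teacher" rather than as a decisive win; a larger N would be needed to separate the two.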

As far as we’re aware, this is the first public open-source reproduction of SnapFlow. The student exports as a separate ONNX model and needs 1 forward pass per inference instead of the teacher’s 10, roughly a 10× reduction in inference cost.

| Stage | Modal cost | Time |
| --- | --- | --- |
| Distill (A100, 10K steps) | ~$8-12 | 60-90 min |
| Eval against LIBERO N=50 | ~$3-5 | 30 min |

    # Distill a customer-specific student from a pi0.5 teacher
    tether train distill ./teacher-export/ \
      --output ./student-export/ \
      --steps 10000 \
      --runtime modal

    # Validate the student's parity against the teacher
    tether validate export ./student-export/ \
      --reference ./teacher-export/ \
      --threshold 1e-3

    # Eval on LIBERO
    tether eval ./student-export/ --suite libero --num-episodes 50

SnapFlow is self-distillation, not teacher-supervised. The trick: the teacher and student share a backbone — the student is a single Euler step initialized from the teacher’s velocity field, then trained to match the teacher’s full unrolled 10-step trajectory.
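
The idea in the paragraph above can be sketched on a toy 1-D problem. This is an illustration under stated assumptions, not Tether's training code: the "teacher" is a hand-written velocity field, the "student" is a single Euler step with one learnable scale `w`, and the loss matches only the endpoint of the teacher's 10-step rollout. All function names here are hypothetical.

```python
from typing import Callable, List

Velocity = Callable[[float, float], float]


def euler_rollout(v: Velocity, x0: float, steps: int) -> float:
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with uniform Euler steps."""
    x, dt = x0, 1.0 / steps
    for k in range(steps):
        x += dt * v(x, k * dt)
    return x


def teacher_velocity(x: float, t: float) -> float:
    """Toy stand-in for the frozen teacher: pulls the sample toward 2.0."""
    return 2.0 - x


def student_step(w: float, x0: float) -> float:
    """One Euler step across the whole [0, 1] interval, scaled by w."""
    return x0 + w * teacher_velocity(x0, 0.0)


def train_student(samples: List[float], lr: float = 0.05, iters: int = 200) -> float:
    w = 1.0  # w = 1 reproduces the teacher's velocity field exactly
    for _ in range(iters):
        grad = 0.0
        for x0 in samples:
            # The teacher is frozen: its rollout is a constant target.
            target = euler_rollout(teacher_velocity, x0, steps=10)
            err = student_step(w, x0) - target
            grad += 2.0 * err * teacher_velocity(x0, 0.0)  # d(err^2)/dw
        w -= lr * grad / len(samples)
    return w


noise_samples = [-1.0, 0.0, 1.0]
teacher_end = euler_rollout(teacher_velocity, 0.0, steps=10)
initial_gap = student_step(1.0, 0.0) - teacher_end   # before training
w_trained = train_student(noise_samples)
trained_gap = student_step(w_trained, 0.0) - teacher_end  # after training

print(f"gap before: {initial_gap:.4f}, gap after: {trained_gap:.2e}")
```

Before training, the single step overshoots the 10-step result because one large Euler step is not the same as ten small ones. Training on the trajectory-level L2 loss closes that gap, which a per-step velocity-matching loss would not do here, since the student already matches the teacher's velocity at initialization.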

Key implementation notes:

  • Teacher in eval mode, frozen weights — gradients flow through the student only
  • Loss: L2 on the integrated trajectory, not per-step velocity
  • Mix ratio 50/50 — half the batches are teacher rollouts, half are noise samples (regularizes against narrow distribution)
  • State-out distillation: the student takes state as an explicit input, not as part of the language conditioning. This unlocks prefix KV cache hits in production.

| Flag | Default | Notes |
| --- | --- | --- |
| <teacher_export> | required | Directory with teacher ONNX + config |
| --output | ./student-export/ | Where the student lands |
| --steps | 10000 | Distill iterations |
| --mix-ratio | 0.5 | Fraction of teacher-rollout batches |
| --runtime | modal | modal or local |
| --lr | 1e-4 | AdamW learning rate |
| --batch-size | 32 | Per-step batch |
| --validate-after | false | Run tether validate export against teacher post-distill |
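
To make the mix ratio concrete, the sketch below shows one way a 50/50 split between teacher-rollout batches and noise batches could be scheduled. The source does not say whether Tether samples the batch type randomly or alternates deterministically; random sampling is assumed here. `DistillConfig` and `batch_sources` are hypothetical names that only mirror the documented flag defaults.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class DistillConfig:
    """Hypothetical container mirroring the documented flag defaults."""
    steps: int = 10000        # --steps
    mix_ratio: float = 0.5    # --mix-ratio
    lr: float = 1e-4          # --lr
    batch_size: int = 32      # --batch-size


def batch_sources(cfg: DistillConfig, seed: int = 0) -> List[str]:
    """Pick a data source per training step; mix_ratio is the teacher-rollout share."""
    rng = random.Random(seed)
    return [
        "teacher_rollout" if rng.random() < cfg.mix_ratio else "noise"
        for _ in range(cfg.steps)
    ]


cfg = DistillConfig()
schedule = batch_sources(cfg)
teacher_fraction = schedule.count("teacher_rollout") / len(schedule)
print(f"teacher-rollout share over {cfg.steps} steps: {teacher_fraction:.3f}")
```

With random sampling the realized share only approximates `mix_ratio`; a strict alternation would hit it exactly but ties batch type to step parity.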

tether serve --pro --collect-data --distill-schedule nightly runs the entire loop continuously: collect production traffic, distill a customer-specific student every N hours, gate via a 9-gate methodology, hot-swap the new student. See self-distilling serve for the full Pro surface.

    modal run scripts/modal_distill_pi05_libero.py

This runs a 10K-step distill against lerobot/pi05_libero_finetuned_v044 on Modal A100, then evaluates the student on libero_object N=50. Expected: student matches or beats teacher within statistical noise.

Reproduce the run with the Modal script above; results land alongside eval_output/. The student-beats-teacher result on pi05_libero_finetuned_v044 was published in the v0.7 changelog.

The open-source path covers offline distillation: you supply a teacher, you get a student. The Pro tier adds:

  • 4-stage continuous loop (collect / distill / eval / swap)
  • 9-gate eval methodology (3 SAFETY non-overridable, 6 PERFORMANCE overridable with audit)
  • Atomic warm-swap via the policy-versioning router
  • 24-hour post-swap monitoring with auto-rollback
  • Customer-specific HF Hub artifact storage
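
The gate rule in the list above (3 SAFETY gates that cannot be overridden, 6 PERFORMANCE gates that can be overridden with an audit record) can be sketched as a small decision function. The Pro implementation is not public, so this is only one reading of that rule; the `GateResult` type, the gate names, and the audit-note field are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GateResult:
    name: str                              # hypothetical gate name
    kind: str                              # "SAFETY" or "PERFORMANCE"
    passed: bool
    override_reason: Optional[str] = None  # audit note recorded with an override


def may_swap(results: List[GateResult]) -> bool:
    """Allow the swap only if every failure is an audited PERFORMANCE override."""
    for r in results:
        if r.passed:
            continue
        if r.kind == "SAFETY":
            return False  # safety failures can never be overridden
        if not r.override_reason:
            return False  # performance failure with no audit record
    return True


results = (
    [GateResult(f"safety_{i}", "SAFETY", True) for i in range(3)]
    + [GateResult(f"perf_{i}", "PERFORMANCE", True) for i in range(5)]
    + [GateResult("perf_5", "PERFORMANCE", False, override_reason="accepted by operator")]
)
swap_ok = may_swap(results)

# An override note on a failed SAFETY gate is ignored.
blocked = may_swap([GateResult("safety_0", "SAFETY", False, override_reason="ignored")])

print(f"swap allowed: {swap_ok}, safety override allowed: {not blocked}")
```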

See self-distilling serve — same wedge, Pro license.

The wedge is built on SnapFlow (released 2026-04). Three concurrent or follow-up methods are worth knowing about — they target the same single-step inference goal via different math:

| Paper | Approach | Implication for Tether |
| --- | --- | --- |
| Mean-Flow VLA | Single-step action via a learned mean-flow vector field, with no consistency constraints | Different math, similar single-step outcome. Threatens SnapFlow’s “first public 1-NFE pi0.5” headline. |
| FASTER (Horizon-Aware Schedule for flow VLAs) | Per-action-index NFE: 1 step on the leading action, more on the tail. 10× compression on pi0.5 / X-VLA. | Reframes the problem the A2C2 wedge tries to solve. Phase 2 architectural pivot tracks this. |
| AsyncVLA | Per-token non-uniform denoising + confidence-rater self-correction | Built on the per-step expert export contract that v0.8.0 ships. Tether’s decomposed export unblocks AsyncVLA-style methods on existing pi0/pi05 weights without retraining. |

Tether’s distill wedge will track these methods — when one of them produces a stronger student than SnapFlow on the same benchmarks, we’ll add it as a backend (tether train distill --method mean-flow, etc.). The shipping commitment is to whichever method produces the best 1-NFE result at the time, not loyalty to SnapFlow specifically.