Deploy a model in one command

reflex go is the headline command. One invocation takes you from “I have a Hugging Face (HF) model ID” to “my robot is calling /act on a server.”

The basic shape

reflex go --model <hf_id>

That is all a smoke test needs. The server starts on port 8000 and returns raw, unscaled actions.
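Before pointing a robot at the server, it can help to confirm that something is listening on the port. The sketch below is a generic TCP readiness check in Python and is not part of Reflex. The function name wait_for_port, the host, and the timeout values are illustrative assumptions. It only shows that the port accepts connections, not that /act returns valid actions.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 30.0,
                  interval_s: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

A typical call would be wait_for_port("127.0.0.1", 8000), run after starting reflex go in another terminal.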

Add per-robot normalization

reflex go --model smolvla-base --embodiment franka

The --embodiment flag selects an action-range mapping and enables ActionGuard for joint-limit clamping. Built-in embodiments are franka, so100, and ur5. For any other robot, pass --custom-embodiment-config <path>, where the path points at a JSON or YAML file.
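To make "joint-limit clamping" concrete, here is a minimal Python sketch of the general idea: each commanded joint value is forced into its allowed range, and the number of adjusted joints is counted. This is not ActionGuard's implementation. The function name clamp_to_limits and the three-joint limits are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def clamp_to_limits(
    action: Sequence[float],
    lower: Sequence[float],
    upper: Sequence[float],
) -> Tuple[List[float], int]:
    """Clamp each joint command into [lower, upper]; return (clamped, n_clamped)."""
    if not (len(action) == len(lower) == len(upper)):
        raise ValueError("action and limit vectors must have the same length")
    clamped = [min(max(a, lo), hi) for a, lo, hi in zip(action, lower, upper)]
    # Count the joints whose command had to be changed.
    n_clamped = sum(1 for a, c in zip(action, clamped) if a != c)
    return clamped, n_clamped


# Hypothetical 3-joint limits in radians, for illustration only.
LOWER = [-1.0, -2.0, -3.0]
UPPER = [1.0, 2.0, 3.0]
safe, count = clamp_to_limits([0.5, 3.5, -4.0], LOWER, UPPER)
print(safe, count)  # prints [0.5, 2.0, -3.0] 2
```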

Force a specific hardware target

By default, reflex go reads the output of reflex doctor and picks a target. To override that choice, pass --device-class:

reflex go --model pi05-base --embodiment franka --device-class a10g

Valid --device-class values: orin-nano, orin, orin-64, thor, a10g, a100, h100, desktop (RTX), cpu.

Composable runtime wedges

Every wedge is a flag on reflex serve, and reflex go passes the same flags through:

# Flags are collected in a bash array because a comment after a trailing "\" breaks line continuation.
serve_args=(
  --embodiment franka                   # per-robot action ranges + ActionGuard
  --safety-config ./robot_limits.json   # URDF-derived joint limits + EU AI Act audit log
  --adaptive-steps                      # stop denoise loop early on velocity convergence
  --deadline-ms 33                      # return last-known-good action if over budget
  --cloud-fallback http://cloud:8000    # edge-first with cloud backup
  --inject-latency-ms 0                 # synthetic delay (B.4 A2C2 gate methodology)
  --record /tmp/traces                  # JSONL request/response capture for replay
  --max-consecutive-crashes 5           # circuit breaker (503 + Retry-After: 60 on trip)
)
reflex serve ./p0 "${serve_args[@]}"
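The circuit-breaker behavior in the last flag can be pictured with a small Python sketch: count consecutive failures, and once the count reaches the limit, reject requests with 503 and a Retry-After header. This illustrates the general pattern under stated assumptions and is not Reflex's code. The class name CrashCircuitBreaker and its methods are hypothetical, and the sketch leaves out how a tripped breaker recovers.

```python
from typing import Dict, Optional, Tuple


class CrashCircuitBreaker:
    """Trip after N consecutive failures; a success before tripping resets the count."""

    def __init__(self, max_consecutive_crashes: int = 5, retry_after_s: int = 60) -> None:
        self.max_consecutive_crashes = max_consecutive_crashes
        self.retry_after_s = retry_after_s
        self.consecutive_crashes = 0

    @property
    def tripped(self) -> bool:
        return self.consecutive_crashes >= self.max_consecutive_crashes

    def record(self, ok: bool) -> None:
        """Update the counter after one inference attempt."""
        self.consecutive_crashes = 0 if ok else self.consecutive_crashes + 1

    def gate(self) -> Optional[Tuple[int, Dict[str, str]]]:
        """Return (status, headers) to reject with, or None if requests may proceed."""
        if self.tripped:
            return 503, {"Retry-After": str(self.retry_after_s)}
        return None
```

A server loop would call gate() before each request and record() after each inference attempt.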

Every response surfaces telemetry from each enabled wedge (guard_clamped, guard_violations, injected_latency_ms, inference_mode, adaptive_enabled, etc.).
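On the client side, these fields can be pulled out of each response for logging or alerting. The Python sketch below assumes the telemetry fields appear as top-level keys of a JSON response body. That layout, the "actions" key, and the value types in the example are assumptions for illustration, so check a real response before relying on them.

```python
import json
from typing import Any, Dict

# Field names come from the telemetry list in this section; which ones
# appear depends on the wedges that are enabled.
TELEMETRY_KEYS = (
    "guard_clamped",
    "guard_violations",
    "injected_latency_ms",
    "inference_mode",
    "adaptive_enabled",
)


def extract_telemetry(response_body: str) -> Dict[str, Any]:
    """Return the known telemetry fields present in a JSON response body."""
    payload = json.loads(response_body)
    return {key: payload[key] for key in TELEMETRY_KEYS if key in payload}


# Hypothetical response body; the real schema and value types may differ.
example = '{"actions": [[0.1, 0.2]], "guard_clamped": true, "injected_latency_ms": 0}'
print(extract_telemetry(example))
```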

Pre-flight validation

Before shipping to production, validate the dataset and the export:

# Will it train?
reflex validate dataset /path/to/lerobot_data --embodiment franka --strict

# Does it serve cleanly with ONNX-vs-PyTorch parity?
reflex validate export ./p0 --model lerobot/pi0_base --threshold 1e-4

Sample passing output:

Per-fixture results
fixture_idx  max_abs_diff  mean_abs_diff  passed
0            3.21e-06      8.40e-07       PASS
1            2.98e-06      7.92e-07       PASS
...
Summary
max_abs_diff_across_all  3.21e-06
passed                   PASS
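The two diff columns are standard element-wise error metrics. As a rough Python sketch of how such a parity check can be computed for one fixture, the function below compares a reference output with a candidate output. It assumes a fixture passes when max_abs_diff is at or below the threshold. The name parity_report and the flat-list inputs are illustrative, not Reflex internals.

```python
from typing import Dict, Sequence, Union


def parity_report(
    reference: Sequence[float],
    candidate: Sequence[float],
    threshold: float = 1e-4,
) -> Dict[str, Union[float, bool]]:
    """Compare two equally sized outputs and report max/mean absolute difference."""
    if len(reference) != len(candidate) or len(reference) == 0:
        raise ValueError("outputs must be non-empty and the same length")
    diffs = [abs(r - c) for r, c in zip(reference, candidate)]
    max_abs = max(diffs)
    mean_abs = sum(diffs) / len(diffs)
    # Assumption: pass/fail is decided by the worst-case element.
    return {"max_abs_diff": max_abs, "mean_abs_diff": mean_abs, "passed": max_abs <= threshold}
```

In the sample output above, every fixture has a max_abs_diff around 3e-06, well under the 1e-4 threshold passed on the command line.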

Exit codes: 0 for pass, 1 for fail (any fixture above the threshold), 2 for error (missing ONNX, bad config). Pass --output-json for CI consumption, or run reflex validate --init-ci to scaffold a GitHub Actions workflow.
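A CI script can branch on these exit codes without parsing any output. The Python sketch below wraps an arbitrary command and maps its return code to the three documented outcomes. The helper name run_and_classify is hypothetical, and the demo uses a stand-in Python command so that it runs without Reflex installed.

```python
import subprocess
import sys
from typing import List

# Meanings as documented for reflex validate export.
EXIT_MEANING = {0: "pass", 1: "fail", 2: "error"}


def run_and_classify(cmd: List[str]) -> str:
    """Run cmd and translate its exit code into pass / fail / error."""
    completed = subprocess.run(cmd, check=False)
    return EXIT_MEANING.get(
        completed.returncode, f"unexpected exit code {completed.returncode}"
    )


# Stand-in command that exits with code 2; in CI this list would hold the
# reflex validate export invocation instead.
print(run_and_classify([sys.executable, "-c", "raise SystemExit(2)"]))  # prints error
```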