# reflex doctor

`reflex doctor --model ./my-export/` runs 10 falsifiable checks. Each maps to a known LeRobot GitHub issue or a systemic VLA deploy failure mode. Every check has at least one pass test and one fail test in `tests/test_doctor_diagnostics.py`, per the falsifiability gate.
## The 10 checks

| ID | Name | What it tests |
|---|---|---|
| `check_model_load` | Model load | Export dir exists + contains ONNX + fits in available RAM (×1.4 load overhead, 20% headroom) |
| `check_onnx_provider` | ONNX provider | onnxruntime importable + CPU EP present (always required) + GPU EP noted |
| `check_vlm_tokenization` | VLM tokenization | Tokenizer config loads + 5 probe prompts produce in-range token IDs |
| `check_image_dims` | Image dim mismatch | `embodiment.cameras[*].resolution` appears in ONNX image input shape |
| `check_action_denorm` | Action denormalization | `embodiment.normalization.mean_action` / `std_action` length == `action_dim`, no NaN/Inf, std > 0 |
| `check_gripper` | Gripper config | `gripper.component_idx` < `action_dim`, `close_threshold` ∈ [0, 1], `inverted` flag sanity |
| `check_state_proprio` | State/proprio dtype | ONNX state input is float32 (not float64 — silent truncation drops fps to ~0.3) |
| `check_gpu_memory` | GPU memory | nvidia-smi reports ≥ 90% headroom over estimated model footprint (×1.6 file size for KV + activations) |
| `check_rtc_chunks` | RTC chunk boundary | `chunk_size` ≥ `frequency_hz` × `rtc_execution_horizon` (one horizon's worth of actions) |
| `check_hardware_compat` | Hardware compat | CUDA driver ≥ 12.x + ORT GPU EP present when CUDA detected |
Each check links back to a load-bearing LeRobot GitHub issue. The full table with issue links is in `docs/doctor_check_list.md` in the source repo.
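As a worked illustration of the `check_model_load` RAM heuristic in the table above, a minimal sketch (the function name `estimated_ram_ok` is invented for illustration; only the ×1.4 overhead and 20% headroom figures come from the check description):

```python
def estimated_ram_ok(onnx_bytes: int, available_ram_bytes: int) -> bool:
    """Mirror the check_model_load heuristic described above: a loaded
    ONNX model is assumed to need ~1.4x its file size, and the check
    wants 20% headroom on top of that estimate before it passes."""
    estimated_footprint = onnx_bytes * 1.4  # load-time overhead
    required = estimated_footprint * 1.2    # 20% headroom
    return available_ram_bytes >= required
```

Under these multipliers, a 4 GB export needs roughly 6.7 GB of free RAM to pass.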
## CheckResult contract

Every check returns a `CheckResult` (see `src/reflex/diagnostics/__init__.py`):
| Field | Type | Notes |
|---|---|---|
| `check_id` | str | Stable ID (e.g. `check_model_load`); used by `--skip` |
| `name` | str | Human-readable name |
| `status` | enum | `pass` / `fail` / `warn` / `skip` |
| `expected` | str | What the check wanted to see |
| `actual` | str | What it actually saw |
| `remediation` | str | Required when `status="fail"`; empty otherwise |
| `duration_ms` | float | Wall-clock time for the check |
| `github_issue` | str or None | URL to the load-bearing LeRobot issue |
Falsifiability gate: `CheckResult.__post_init__` raises `ValueError` if `status="fail"` and `remediation` is empty. Enforced at construction time, so a check with no fix-it suggestion can never ship.
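The contract and gate above can be sketched as a dataclass. This is a minimal sketch, not the real definition in `src/reflex/diagnostics/__init__.py`; the `Status` enum values and field defaults are assumptions:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(str, Enum):
    PASS = "pass"
    FAIL = "fail"
    WARN = "warn"
    SKIP = "skip"


@dataclass
class CheckResult:
    check_id: str
    name: str
    status: Status
    expected: str
    actual: str
    remediation: str = ""
    duration_ms: float = 0.0
    github_issue: Optional[str] = None

    def __post_init__(self) -> None:
        # Falsifiability gate: a failing check must carry a fix-it
        # suggestion, enforced at construction time.
        if self.status == Status.FAIL and not self.remediation.strip():
            raise ValueError(
                f"{self.check_id}: status='fail' requires a non-empty remediation"
            )
```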
## Status semantics

- `pass` — verified the expected condition. No action.
- `fail` — verified a known-broken condition. Doctor exits 1. Caller should follow `remediation`.
- `warn` — non-blocking concern (e.g. CPU-only on a system that should have GPU). Doctor exits 0 but the warning is surfaced.
- `skip` — couldn't run because a precondition wasn't met (e.g. embodiment is `custom`, so embodiment-dependent checks have nothing to compare against).
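The exit-code rule above fits in a few lines. A sketch (`summarize` is a hypothetical helper, not the CLI's actual internals):

```python
from collections import Counter
from typing import List, Tuple


def summarize(statuses: List[str]) -> Tuple[str, int]:
    """Fold per-check statuses into a summary line plus process exit code.
    Only 'fail' flips the exit code to 1; 'warn' and 'skip' are surfaced
    in the line but never block, per the semantics above."""
    counts = Counter(statuses)
    line = " / ".join(
        f"{counts[s]} {s}" for s in ("pass", "warn", "fail", "skip") if counts[s]
    )
    return line, 1 if counts["fail"] else 0
```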
## Example output

### Text format (default)

```
Reflex Doctor v0.7.0
Checking: ./my-export/

✓ Model load (12 ms)
✓ ONNX provider (8 ms) — TensorRT, CUDA, CPU EPs available
✓ VLM tokenization (43 ms)
✓ Image dim mismatch (3 ms) — 224×224 matches export
✓ Action denormalization (2 ms)
✓ Gripper config (2 ms)
✗ State/proprio dtype (1 ms)
    Expected: state input is float32
    Actual:   state input is float64
    Fix: Cast state to np.float32 before sending to /act. Float64 silently
    truncates to float32, dropping fps to ~0.3 in production.
    See https://github.com/huggingface/lerobot/issues/2458
✓ GPU memory (15 ms) — 18 GB available, 4 GB needed
⚠ RTC chunk boundary (1 ms)
    Expected: chunk_size ≥ frequency_hz × rtc_execution_horizon (15 ≥ 30 × 0.5)
    Actual: chunk_size = 15, frequency_hz = 30, rtc_execution_horizon = 0.5
    (15 = 15 — passes by a hair; recommend chunk_size = 25 for headroom)
✓ Hardware compat (84 ms)

Summary: 8 pass / 1 warn / 1 fail
Exit: 1 (fail)
```

### JSON format (CI-friendly)

```sh
reflex doctor --model ./my-export/ --format json | jq .
```

```json
{
  "schema_version": 1,
  "reflex_version": "0.7.0",
  "model_path": "./my-export/",
  "embodiment": "franka",
  "checks": [
    {
      "check_id": "check_model_load",
      "name": "Model load",
      "status": "pass",
      "expected": "...",
      "actual": "...",
      "remediation": "",
      "duration_ms": 12.4,
      "github_issue": "https://github.com/huggingface/lerobot/issues/386"
    }
    /* ... */
  ],
  "summary": {"pass": 8, "warn": 1, "fail": 1, "skip": 0}
}
```

Schema v1 is locked — additive fields don't bump the version; breaking changes do.
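In CI, the JSON output can be gated without scraping the text report. A sketch, assuming the doctor's JSON arrives on stdin; `gate` and the script name are invented for illustration, while the field names come from the schema above:

```python
import json
import sys  # used by the piped invocation shown in the comment below


def gate(report: dict) -> int:
    """CI exit status: nonzero on any failed check.
    Warns are echoed so they stay visible, but never block."""
    for check in report["checks"]:
        if check["status"] == "fail":
            print(f"FAIL {check['check_id']}: {check['remediation']}")
        elif check["status"] == "warn":
            print(f"warn {check['check_id']}: {check['actual']}")
    return 1 if report["summary"]["fail"] else 0


# Usage (hypothetical gate.py):
#   reflex doctor --model ./my-export/ --format json | python gate.py
# with a body of:
#   sys.exit(gate(json.load(sys.stdin)))
```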
## Adding check #11

- Create `src/reflex/diagnostics/check_<name>.py` with a `_run(model_path, embodiment_name, **kwargs) -> CheckResult` function
- At the bottom: `register(Check(check_id=..., name=..., severity=..., github_issue=..., run_fn=_run))`
- Import the new module in `_ensure_registry_loaded()` in `src/reflex/diagnostics/__init__.py`
- Add at least 1 pass test + 1 fail test to `tests/test_doctor_diagnostics.py`
- Update the canonical doc table

The registry is auto-loaded; no other wiring needed.
## Skipping checks

```sh
# Skip specific checks (CSV, by check_id)
reflex doctor --model ./my-export/ --skip check_gpu_memory,check_hardware_compat

# Run only environment checks (no model needed)
reflex doctor
```

Skipped checks return `status=skip` with a reason. Don't silently drop them — operators want to see what wasn't verified.