Supported models
| Model | HF ID | Params | Export status |
|---|---|---|---|
| SmolVLA | lerobot/smolvla_base | 450M | ONNX validated (max_diff=3.3e-06) |
| pi0 | lerobot/pi0_base | 3.5B | ONNX validated (max_diff=6.0e-08) |
| pi0.5 | lerobot/pi05_base | 3.62B | ONNX validated (max_diff=2.38e-07) |
| GR00T N1.6 | nvidia/GR00T-N1.6-3B | 3.29B | ONNX validated (max_diff=8.34e-07, live VLM conditioning) |
| OpenVLA | openvla/openvla-7b | 7.5B | optimum-cli export onnx + reflex.postprocess.openvla.decode_actions |
`reflex models list` browses the curated registry; `reflex models info <id>` shows benchmarks; `reflex models pull <id>` downloads.
OpenVLA is a vanilla Llama-2-7B VLM — there’s no custom action expert to reconstruct, so we defer to the standard HuggingFace export path and ship only the bin-to-continuous postprocess helper.
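To illustrate what a bin-to-continuous postprocess does, here is a minimal sketch. The function name and defaults (256 uniform bins over a normalized range) are illustrative assumptions; the real `reflex.postprocess.openvla.decode_actions` helper may differ in bin count, per-dimension ranges, and token-id offset handling.

```python
import numpy as np

def decode_bins_to_continuous(bin_ids, low=-1.0, high=1.0, n_bins=256):
    """Map discrete action-bin indices back to continuous values.

    Sketch only: OpenVLA-style models discretize each action dimension
    into uniform bins, and decoding returns each bin's center value.
    """
    bin_ids = np.asarray(bin_ids)
    width = (high - low) / n_bins
    return low + (bin_ids + 0.5) * width  # center of each bin

# Bin 0 decodes near `low`, bin 255 near `high`, bin 127 near the midpoint.
actions = decode_bins_to_continuous([0, 127, 255])
```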
Verified parity ledger
Four ONNX artifacts in production, measured against PyTorch on shared seeded inputs:
| Artifact | Reference | first-action max_abs | verdict |
|---|---|---|---|
| SmolVLA ONNX, num_steps=10 (production default) | sample_actions(num_steps=10) | 5.96e-07 | machine precision |
| pi0 ONNX, num_steps=10 (production default) | sample_actions(num_steps=10) | 2.09e-07 | machine precision |
| pi0.5 ONNX, num_steps=10 (production default) | sample_actions(num_steps=10) | 2.38e-07 | machine precision |
| GR00T N1.6 ONNX, single-step DiT (DDPM, loop external) | GR00TFullStack.forward | 8.34e-07 | machine precision |
| GR00T N1.6 end-to-end 4-step denoise loop | Python loop over PyTorch ref | 4.77e-07 | machine precision |
| GR00T N1.6 Eagle VLM ONNX (SigLIP + Qwen3 + mlp1, 1.87B) | EagleExportStack PyTorch | 4.25e-04 | within FP32 tolerance |
| GR00T N1.6 DiT with real VLM KV (5-input expert_stack_with_vlm.onnx) | GR00TFullStack(state, vlm_kv) | 1.78e-05 | within FP32 tolerance |
| GR00T N1.6 end-to-end two-ONNX chain (Eagle → DiT) | same chain in PyTorch | 1.90e-05 | parity + image-driven sensitivity verified |
| SmolVLA ONNX, num_steps=1 | sample_actions(num_steps=1) | 1.55e-06 | machine precision |
| pi0 ONNX, num_steps=1 | sample_actions(num_steps=1) | 1.43e-06 | machine precision |
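The ledger entries follow one comparison shape: run both backends on identical seeded inputs and take the elementwise max absolute difference over the first action chunk. A minimal sketch of that check, with dummy arrays standing in for the real PyTorch `sample_actions` output and the onnxruntime session output:

```python
import numpy as np

# Stand-ins for the two backends' outputs on the same seeded inputs:
# 'torch_out' plays the PyTorch reference, 'onnx_out' the exported graph.
rng = np.random.default_rng(0)
torch_out = rng.standard_normal((1, 50, 7)).astype(np.float32)
onnx_out = torch_out + np.float32(1e-7)  # simulate tiny numeric drift

# first-action max_abs, as reported in the ledger
max_abs = float(np.max(np.abs(torch_out[:, 0] - onnx_out[:, 0])))
assert max_abs < 1e-5, f"parity regression: max_abs={max_abs:.2e}"
```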
About the production defaults
Flow-matching VLAs (SmolVLA, pi0, pi0.5) canonically integrate the velocity field with 10 Euler steps; the ONNX bakes in the unrolled loop. GR00T is DDPM-style diffusion with 4 canonical steps; its ONNX exports one velocity step, and `reflex serve` wraps it in the loop. All four match canonical PyTorch to machine precision.
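As a sketch of what the flow-matching side unrolls, here is a fixed-step Euler integrator in plain Python. The velocity function is a toy stand-in with a known analytic solution, not the exported network:

```python
import numpy as np

def euler_integrate(velocity_fn, x0, num_steps=10):
    """Integrate a velocity field from t=0 to t=1 with fixed Euler steps,
    the loop that the flow-matching ONNX graphs bake in unrolled."""
    x, dt = x0.copy(), 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * velocity_fn(x, t)
    return x

# Toy field v(x, t) = -x: each step multiplies by (1 - dt), so 10 steps
# give x0 * 0.9**10 ~ 0.3487, approximating the exact x0 * e**-1 ~ 0.3679.
x1 = euler_integrate(lambda x, t: -x, np.ones(4), num_steps=10)
```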
Memory fit on disk (monolithic ONNX, FP32)
| Model | Disk size |
|---|---|
| SmolVLA | 1.6 GB |
| pi0 | 12.5 GB |
| pi0.5 | 13.0 GB |
| GR00T | 4.4 GB |
SmolVLA fits comfortably on an Orin Nano 8 GB. pi0 realistically needs an Orin 16 GB+ or a desktop NVIDIA GPU: the 12.5 GB monolithic ONNX cannot load on the 8 GB Orin Nano even in FP16 (~6 GB of weights, plus activations and OS overhead). An FP16 engine rebuild and Orin Nano fit work are tracked for a future release.
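The fit claims above can be sanity-checked with raw-weight arithmetic: FP32 stores 4 bytes per parameter, FP16 stores 2. This is a weights-only estimate; the actual on-disk ONNX sizes in the table differ somewhat from it.

```python
# Parameter counts from the supported-models table; GB here means 1e9 bytes.
params = {"SmolVLA": 0.45e9, "pi0": 3.5e9, "pi0.5": 3.62e9, "GR00T": 3.29e9}

for name, n in params.items():
    fp32_gb = n * 4 / 1e9  # 4 bytes/param
    fp16_gb = n * 2 / 1e9  # 2 bytes/param
    print(f"{name}: ~{fp32_gb:.1f} GB FP32, ~{fp16_gb:.1f} GB FP16 (weights only)")
```

For pi0 this gives roughly 7 GB of FP16 weights, which is why the 8 GB Orin Nano cannot hold the model plus activations and the OS.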