Supported models

| Model | HF ID | Params | Export status |
| --- | --- | --- | --- |
| SmolVLA | lerobot/smolvla_base | 450M | ONNX validated (max_diff=3.3e-06) |
| pi0 | lerobot/pi0_base | 3.5B | ONNX validated (max_diff=6.0e-08) |
| pi0.5 | lerobot/pi05_base | 3.62B | ONNX validated (max_diff=2.38e-07) |
| GR00T N1.6 | nvidia/GR00T-N1.6-3B | 3.29B | ONNX validated (max_diff=8.34e-07, live VLM conditioning) |
| OpenVLA | openvla/openvla-7b | 7.5B | `optimum-cli export onnx` + `reflex.postprocess.openvla.decode_actions` |

`reflex models list` browses the curated registry; `reflex models info <id>` shows benchmarks; `reflex models pull <id>` downloads.

OpenVLA is a vanilla Llama-2-7B VLM — there’s no custom action expert to reconstruct, so we defer to the standard HuggingFace export path and ship only the bin-to-continuous postprocess helper.
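The bin-to-continuous step can be sketched as follows. This is an illustrative stand-in, not the actual `reflex.postprocess.openvla.decode_actions` signature: OpenVLA discretizes each action dimension into uniform bins and the decoder maps predicted bin indices back to bin centers (the `[-1, 1]` bounds here are assumptions; the real helper also applies per-dimension de-normalization statistics).

```python
import numpy as np

def decode_bins(bin_indices, n_bins=256, low=-1.0, high=1.0):
    """Map discrete action-bin indices back to continuous values at bin
    centers. Hypothetical sketch of the inverse of OpenVLA's uniform
    action discretization; bounds and bin count are illustrative."""
    edges = np.linspace(low, high, n_bins + 1)
    centers = (edges[:-1] + edges[1:]) / 2
    return centers[np.asarray(bin_indices)]

# With 2 bins over [-1, 1], indices 0 and 1 decode to the bin centers:
decode_bins([0, 1], n_bins=2)  # → array([-0.5,  0.5])
```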

ONNX artifacts for all four production models, measured against PyTorch on shared seeded inputs:

| Artifact | Reference | First-action max_abs | Verdict |
| --- | --- | --- | --- |
| SmolVLA ONNX, num_steps=10 (production default) | sample_actions(num_steps=10) | 5.96e-07 | machine precision |
| pi0 ONNX, num_steps=10 (production default) | sample_actions(num_steps=10) | 2.09e-07 | machine precision |
| pi0.5 ONNX, num_steps=10 (production default) | sample_actions(num_steps=10) | 2.38e-07 | machine precision |
| GR00T N1.6 ONNX, single-step DiT (DDPM, loop external) | GR00TFullStack.forward | 8.34e-07 | machine precision |
| GR00T N1.6 end-to-end 4-step denoise loop | Python loop over PyTorch ref | 4.77e-07 | machine precision |
| GR00T N1.6 Eagle VLM ONNX (SigLIP + Qwen3 + mlp1, 1.87B) | EagleExportStack PyTorch | 4.25e-04 | machine precision |
| GR00T N1.6 DiT with real VLM KV (5-input expert_stack_with_vlm.onnx) | GR00TFullStack(state, vlm_kv) | 1.78e-05 | machine precision |
| GR00T N1.6 end-to-end two-ONNX chain (Eagle → DiT) | same chain in PyTorch | 1.90e-05 | parity + image-driven sensitivity verified |
| SmolVLA ONNX, num_steps=1 | sample_actions(num_steps=1) | 1.55e-06 | machine precision |
| pi0 ONNX, num_steps=1 | sample_actions(num_steps=1) | 1.43e-06 | machine precision |

Flow-matching VLAs (SmolVLA, pi0, pi0.5) canonically integrate the velocity field with 10 Euler steps — the ONNX bakes in the unrolled loop. GR00T is DDPM-style diffusion with 4 canonical steps — the ONNX exports one velocity step, and reflex serve wraps it in the loop. All four match canonical PyTorch to machine precision.
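The unrolled integration baked into the flow-matching exports follows the standard fixed-step Euler scheme. A minimal NumPy sketch, where `velocity_fn` stands in for the exported velocity network and the toy linear field is purely illustrative:

```python
import numpy as np

def euler_integrate(velocity_fn, x0, num_steps=10):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed Euler steps --
    the same loop that the flow-matching ONNX graphs bake in unrolled."""
    x, dt = x0.copy(), 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        x = x + dt * velocity_fn(x, t)
    return x

# Toy velocity field v(x, t) = target - x drives x toward target;
# 10 steps from zero closes all but a 0.9**10 fraction of the gap.
target = np.array([0.5, -0.2, 0.1])
out = euler_integrate(lambda x, t: target - x, np.zeros(3), num_steps=10)
```

GR00T's DDPM-style 4-step loop has the same shape, except each step calls the single-step DiT ONNX and `reflex serve` drives the loop externally instead of unrolling it into the graph.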

Memory fit on disk (monolithic ONNX, FP32)

| Model | Disk size |
| --- | --- |
| SmolVLA | 1.6 GB |
| pi0 | 12.5 GB |
| pi0.5 | 13.0 GB |
| GR00T | 4.4 GB |

SmolVLA fits comfortably on Orin Nano 8 GB. pi0 realistically needs Orin 16 GB+ or a desktop NVIDIA GPU — the 12.5 GB monolithic ONNX cannot load on the 8 GB Orin Nano even in FP16 (~6 GB weights plus activations + OS). FP16 engine rebuild + Orin Nano fit work is tracked for a future release.
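The back-of-envelope behind that fit claim can be sketched directly from the table. The 2 GB OS + activation reserve below is an illustrative assumption, not a measurement:

```python
# FP32 disk sizes from the table above (GB).
fp32_disk_gb = {"SmolVLA": 1.6, "pi0": 12.5, "pi0.5": 13.0, "GR00T": 4.4}

def fits_orin_nano(model, ram_gb=8.0, reserve_gb=2.0):
    """Back-of-envelope fit check: halve the FP32 disk size for an FP16
    weight estimate, then compare against device RAM minus an assumed
    OS + activation reserve (reserve_gb is illustrative, not measured)."""
    fp16_weights_gb = fp32_disk_gb[model] / 2
    return fp16_weights_gb <= ram_gb - reserve_gb

fits_orin_nano("SmolVLA")  # True: 0.8 GB of FP16 weights clears the budget
fits_orin_nano("pi0")      # False: ~6.25 GB of weights leaves no headroom
```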