MCP server
Reflex exposes a Model Context Protocol server so MCP-compatible agents — Claude Desktop, Cursor, custom — can call a VLA policy as a tool. Additive to the HTTP API; both share the same inference engine.
Install
Section titled “Install”pip install 'reflex-vla[mcp]'Pulls fastmcp ≥ 3.0 alongside core dependencies.
Claude Desktop / Cursor integration
Section titled “Claude Desktop / Cursor integration”Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or the equivalent on Linux/Windows:
{ "mcpServers": { "reflex": { "command": "reflex", "args": [ "serve", "/path/to/your/exported/model/", "--mcp", "--mcp-transport", "stdio", "--embodiment", "franka" ] } }}Claude Desktop spawns Reflex as a subprocess; stdio means no ports, no firewall, no auth. Restart the app. reflex now appears in the tool picker.
/reflex act instruction: "pick up the red block" image_b64: "<base64 PNG from robot camera>" state: [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6] episode_id: "ep-2026-04-24-001"HTTP transport (for networked agents)
Section titled “HTTP transport (for networked agents)”reflex serve ./my-export/ \ --mcp \ --mcp-transport http \ --mcp-port 8001 \ --port 8000 \ --embodiment frankaBoth MCP (streamable-http on 127.0.0.1:8001) AND the REST API (0.0.0.0:8000) run concurrently. Same ReflexServer backs both.
For production HTTP deployment, front MCP with a reverse proxy that handles TLS + auth.
Available tools
Section titled “Available tools”| Tool | Inputs | Output |
|---|---|---|
act | instruction, image_b64, state, episode_id? | {actions, policy_version, inference_ms} or {error: ...} |
health | — | {state, model_version, uptime_seconds, cuda_graphs_active} |
models_list | — | [{model_id, hf_repo, family, action_dim, size_mb, supported_embodiments, supported_devices, license, description}] |
validate_dataset | dataset_path | Validation report with pass/warn/block decision |
Available resources
Section titled “Available resources”| URI | Content |
|---|---|
metrics://prometheus | Current Prometheus metrics in text exposition format (same as /metrics) |
Safety
Section titled “Safety”The act tool returns action chunks but does NOT actuate them. Callers are responsible for sending actions to the robot’s controller. Reflex’s act is pure inference.
Safety features that DO run inside act:
- ActionGuard from the loaded embodiment config (joint-limit clamping, velocity caps, torque caps)
- Per-request circuit breaker (
--max-consecutive-crashes) - Optional audit log (
--record <dir>)
ROS2 tools (ros2-mcp-bridge)
Section titled “ROS2 tools (ros2-mcp-bridge)”When running reflex ros2-serve alongside MCP, four ROS2 tools + one resource become available. They introspect and drive a real ROS2-connected robot through the MCP surface.
| Tool | Purpose | Cooldown | Requires confirm=True |
|---|---|---|---|
get_joint_state() | Latest /joint_states positions | 100 ms | — |
get_camera_frame() | Latest /camera/image_raw as base64 JPEG | 100 ms | — |
execute_task(instruction, confirm, max_steps?) | Runs one Reflex inference cycle + publishes the action chunk | 5 s | yes |
emergency_stop(confirm) | Publishes to /reflex/e_stop | 5 s | yes |
robot://status | URDF info + latest state + current task (resource) | — | — |
The confirm=True tripwire
Section titled “The confirm=True tripwire”execute_task and emergency_stop reject any call where confirm is not literally the boolean True — not truthy, not a string:
await execute_task(instruction="pick up the red block", confirm=True)# → runs inference + actuates
await execute_task(instruction="pick up the red block")# → {"error": {"kind": "ConfirmationRequired", ...}}
await execute_task(instruction="pick up", confirm="yes")# → {"error": {"kind": "ConfirmationRequired", ...}} # not identity-TrueIntentional friction: an agent forgetting to pass confirm=True can’t actuate by accident; a prompt-injection payload that tries to slip in confirm="yes" fails the identity check.
Rate limiting
Section titled “Rate limiting”Server-side cooldowns are authoritative. Limiter is per-tool (e-stop cooldown doesn’t block joint-state reads). Uses time.monotonic() so wall-clock skew doesn’t matter. Rejected calls return {"error": {"kind": "RateLimited", "message": "... wait Xs", ...}}. Well-behaved agents retry after the hint.
Troubleshooting
Section titled “Troubleshooting”“fastmcp not installed” — pip install reflex-vla[mcp]
Claude Desktop doesn’t list Reflex as a tool — verify the claude_desktop_config.json path. On macOS, quit + relaunch Claude Desktop fully (cmd-Q, not just close the window).
stdio mode blocks the terminal — by design. stdio owns stdin/stdout for MCP’s bidirectional framing. For interactive dev, use --mcp-transport http on a separate port.
MCP + FastAPI on same port — not supported. The two transports use different protocols. --port is FastAPI; --mcp-port is MCP HTTP; they must differ.