tether chat

tether chat is a chat agent that runs the same tether commands you would type yourself. It is useful for non-experts, for one-off operations, and for “I forgot the flag for X” moments.

Start a session

tether chat

you › what version am I running and what hardware can I deploy to?

  → show_version({})    → tether --version          → "tether 0.12.0"
  → list_targets({})    → tether inspect targets    → [orin-nano, orin, orin-64, thor, desktop]

You're running tether 0.12.0. Supported targets:
  - orin-nano — Jetson Orin Nano: 8 GB, fp16
  - orin — Jetson AGX Orin 32GB: 32 GB, fp16
  - orin-64 — Jetson AGX Orin 64GB: 64 GB, fp16
  - thor — Jetson Thor: 128 GB, fp8
  - desktop — Desktop GPU (RTX 4090 / A100 / H100 / RTX 5090): 24 GB, fp16

Want me to show which models support each target, or run tether doctor?

The chat understands the full CLI. Ask it to deploy a model, run benchmarks, validate exports, run tether doctor, fetch model info — anything you would otherwise type.

How it works

The agent is powered by GPT-5 Mini through a hosted proxy at chat.fastcrest.com. The 17 tools available to it map 1:1 onto tether CLI subcommands, and each tool runs the actual subprocess on your machine. The model has no direct access to your data or your filesystem: it can only ask tether questions and read what tether prints back.
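As a rough illustration of that loop, the sketch below runs one command as a local subprocess and hands back only the text it printed. `run_tool` is a hypothetical helper written for this page, not the code that ships in tether, and the demo uses the Python interpreter as a stand-in so it runs without tether installed.

```python
import subprocess
import sys


def run_tool(argv, timeout=60):
    """Run one command locally and return what it printed.

    Hypothetical helper: only the captured text is returned, which mirrors
    the "read what tether prints back" behaviour described above.
    """
    result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return {
        "returncode": result.returncode,
        "stdout": result.stdout.strip(),
        "stderr": result.stderr.strip(),
    }


if __name__ == "__main__":
    # Stand-in for ["tether", "--version"] so the sketch runs anywhere.
    demo = run_tool([sys.executable, "-c", "print('tether 0.12.0')"])
    print(demo["stdout"])
```

With tether on your PATH, the same helper could be called as `run_tool(["tether", "--version"])`.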

Free tier: 100 calls per day per machine. No signup, no API key.

Available tools

The chat agent has access to 17 tools. Tool names map onto canonical tether subcommands via src/tether/chat/executor.py:

#	Tool	Underlying command
1	`show_version`	`tether --version`
2	`show_status`	`tether status`
3	`show_config`	`tether config show`
4	`list_targets`	`tether inspect targets`
5	`list_models`	`tether models list`
6	`model_info`	`tether models info <id>`
7	`pull_model`	`tether models pull <id>`
8	`export_model`	`tether export <hf_id>`
9	`serve_model`	`tether serve <export_dir>`
10	`deploy_one_command`	`tether go --model <hf_id>`
11	`benchmark`	`tether bench <export_dir>`
12	`evaluate`	`tether eval <export_dir>`
13	`doctor`	`tether doctor`
14	`distill`	`tether train distill`
15	`finetune`	`tether train finetune`
16	`list_traces`	`tether traces query`
17	`replay_trace`	`tether replay <trace>`
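The table above can be read as a lookup from tool name to an argv prefix. The sketch below shows one possible shape for such a mapping; it is not the contents of src/tether/chat/executor.py. The commands are copied from a few table rows, while the argument keys (`id`, `hf_id`, `trace`) are assumptions taken from the placeholders rather than the real tool schemas.

```python
# Each entry: tool name -> (argv prefix, name of the single required argument or None).
# Commands come from the table; argument keys are assumed from the <placeholders>.
TOOL_COMMANDS = {
    "show_version": (["tether", "--version"], None),
    "list_targets": (["tether", "inspect", "targets"], None),
    "model_info": (["tether", "models", "info"], "id"),
    "export_model": (["tether", "export"], "hf_id"),
    "deploy_one_command": (["tether", "go", "--model"], "hf_id"),
    "replay_trace": (["tether", "replay"], "trace"),
}


def build_argv(tool, args):
    """Translate a tool call into the argv list for the matching CLI command."""
    if tool not in TOOL_COMMANDS:
        raise KeyError(f"unknown tool: {tool}")
    prefix, arg_name = TOOL_COMMANDS[tool]
    argv = list(prefix)  # copy so the table itself is never mutated
    if arg_name is not None:
        if arg_name not in args:
            raise ValueError(f"{tool} requires '{arg_name}'")
        argv.append(str(args[arg_name]))
    return argv


if __name__ == "__main__":
    print(build_argv("list_targets", {}))
    print(build_argv("deploy_one_command", {"hf_id": "org/model"}))
```

Building an argv list instead of a shell string means a model-supplied argument is passed as a single value and is never interpreted by a shell.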

The routing layer is covered by tests/test_chat_regression.py (structural pins on the tool-name → CLI-command mapping) and tests/test_chat_tools_executable.py (smoke tests).

Privacy

Conversations stream through chat.fastcrest.com, a Cloudflare Worker that does no logging beyond what’s needed for rate-limit accounting (anonymous IP hash + day bucket, rolling 24h). The Worker forwards to OpenAI; OpenAI’s data-retention policies apply to the prompts and responses.
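To make "anonymous IP hash + day bucket" concrete, here is a small Python illustration of that kind of accounting. It is not the Worker's code: the salt, the truncated SHA-256 hash, and the in-memory counter are all assumptions, and it uses a fixed UTC calendar day as a simplification of the rolling 24h window. The limit of 100 is the free-tier figure quoted earlier.

```python
import hashlib
from collections import Counter
from datetime import datetime, timezone

DAILY_LIMIT = 100  # free-tier figure quoted earlier on this page


def bucket_key(ip, now, salt="example-salt"):
    """Return an anonymous per-day key: salted hash of the IP plus the date."""
    ip_hash = hashlib.sha256((salt + ip).encode("utf-8")).hexdigest()[:16]
    return f"{ip_hash}:{now.strftime('%Y-%m-%d')}"


class RateLimiter:
    """Counts calls per (hashed IP, day) bucket; the raw IP is never stored."""

    def __init__(self, limit=DAILY_LIMIT):
        self.limit = limit
        self.counts = Counter()

    def allow(self, ip, now=None):
        now = now or datetime.now(timezone.utc)
        key = bucket_key(ip, now)
        if self.counts[key] >= self.limit:
            return False
        self.counts[key] += 1
        return True
```

Only the hash and the day are kept as state, so the counter can enforce a daily limit without recording who made which request.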

If you don’t want your prompts to leave your machine, use the bring-your-own-key path with FASTCREST_PROXY_URL pointing at an OpenAI-compatible endpoint you control.
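One way to try this from a script is sketched below. It assumes FASTCREST_PROXY_URL is read from the environment, which the text above does not state explicitly, and `http://localhost:8080/v1` is only a placeholder for an endpoint you run yourself. How the API key itself is supplied is not covered here.

```python
import os


def chat_env(proxy_url, base_env=None):
    """Return a copy of the environment with FASTCREST_PROXY_URL set.

    Assumption: tether chat picks this variable up from the environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["FASTCREST_PROXY_URL"] = proxy_url
    return env


if __name__ == "__main__":
    # Placeholder URL; replace it with your own OpenAI-compatible endpoint.
    env = chat_env("http://localhost:8080/v1")
    print(env["FASTCREST_PROXY_URL"])
```

The returned mapping can be passed as the `env` argument of `subprocess.run(["tether", "chat"], env=env)`, which leaves the parent process environment unchanged.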

When NOT to use

If you’re running CI or non-interactive scripts, call the underlying tether commands directly. The chat is for ambiguous, exploratory work.
If you need deterministic exit codes, use the CLI as well: the chat returns natural-language answers, not exit codes.
If you’re on an air-gapped robot, the chat will not work, because it needs network access to reach the proxy.
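For the scripted cases above, a direct call could look like the following sketch. `run_steps` is a hypothetical helper, and it assumes tether follows the usual convention of returning a non-zero exit code on failure, which this page does not confirm. The commands in the example list are taken from the tool table.

```python
import subprocess
import sys


def run_steps(steps):
    """Run commands in order; stop at the first non-zero exit code and return it."""
    for argv in steps:
        code = subprocess.run(argv).returncode
        if code != 0:
            return code
    return 0


if __name__ == "__main__":
    # Example CI sequence using commands from the table above.
    sys.exit(run_steps([["tether", "--version"], ["tether", "doctor"]]))
```

Because the script exits with the first failing command's code, the CI job fails for the same reason the command did.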

For everything else, tether chat saves typing.