run-policy

Runs a trained policy autonomously on the robot using locally attached ZED cameras and LeRobot’s async inference (lerobot.async_inference). By default a PolicyServer is launched in a child process on localhost; pass --server_host to use a remote inference-server on a more powerful machine instead. Either way, the parent process drives a thin RobotClient subclass that streams observations (joint positions and camera frames) to the server and consumes the returned action chunks.

Type s on stdin to save the rollout and end the episode, r to discard and re-record, or q to discard and quit. --episode_time_s is a safety cap: if no key has been pressed by the time it expires, the command falls back to the same [Enter]=save / r / q prompt.

For the end-to-end workflow — including offloading inference to a GPU machine — see the Run Policy guide.

This command is configured via draccus. The robot subsystem is exposed as the nested robot_config (cameras, per-joint gains); set its fields with dotted flags such as --robot_config.cameras, or pass a whole-config file with --config_path. See Command configuration.

Flag	Description
`--policy_path PATH_OR_REPO`	Local checkpoint path or HuggingFace repo ID (required)
`--policy_type {act,smolvla,diffusion,tdmpc,vqbet,pi0,pi05,groot}`	Policy architecture; must match the checkpoint at `--policy_path` (required)
`--task TEXT`	Natural language task description (required)
`--episode_time_s INT`	Safety cap on episode duration in seconds (default: 120). Episodes normally end on operator keypress
`--fps INT`	Control loop frame rate (default: 60). Must match the fps the policy was trained on
`--repo_id <user>/<dataset>`	Optional dataset repo ID to save rollouts
`--root PATH`	Local dataset root (default: `$HF_LEROBOT_HOME`)
`--push_to_hub true`	Push rollout dataset to HuggingFace Hub when done
`--vcodec STR`	Video codec recorded for saved rollout datasets (default: per-platform — `h264` on Jetson/aarch64, `auto` elsewhere). Only used with `--repo_id`.
`--device STR`	PyTorch device for policy inference (default: `cuda`)
`--server_host HOST`	Address of a remote `inference-server`. Default `null`: a `PolicyServer` child process is spawned on localhost. When set, `--policy_path` must be reachable from the server (e.g. a HF Hub repo ID)
`--server_port INT`	Policy server port — for the localhost child process or the remote server (default: 8765)
`--actions_per_chunk INT`	Number of actions returned per inference call (default: 50); capped by the policy’s max action horizon
`--chunk_size_threshold FLOAT`	Trigger a fresh observation when the action queue drops to this fraction of a chunk (default: 0.9)
`--aggregate_fn {temporal_ensemble,weighted_average,latest_only,average,conservative}`	Action chunk aggregation strategy (default: `temporal_ensemble`, ACT Algorithm 2; gripper indices take the newest chunk). The other choices are upstream scalar blends
`--temporal_ensemble_coeff K`	Decay coefficient for `temporal_ensemble` (default: 0.01, ACT paper). `wᵢ = exp(-K·i)`, `i=0` oldest chunk; `K>0` smoother, `K=0` uniform, `K<0` more reactive
`--robot_config.cameras DICT`	The camera slots (`overhead` / `left_arm` / `right_arm`) to use, with each ZED camera’s serial number, as one inline YAML/JSON value: `--robot_config.cameras "{overhead: {serial: 41234567}, left_arm: {serial: 41234568}, right_arm: {serial: 41234569}}"`. At least one slot must be assigned (assign the cameras the policy was trained on); unassigned slots are dropped.
↳ `stereo` (auto-detected)	Stereo ZED X cameras are detected from their serial — on the CLI and the control panel — so you never flag it by hand. Auto-detection applies the head/wrist eye convention: a stereo overhead feeds the policy the `overhead_left` / `overhead_right` observation keys, while a stereo wrist (`left_arm` / `right_arm`) feeds a single left eye under the plain slot name, exactly like a mono camera. Setting `stereo: true` by hand skips auto-detection and uses the `eyes` field instead (default `both` → `X_left` / `X_right`; `left` / `right` → a single eye under the plain name `X`). Either way, configure the cameras to match the dataset the policy was trained on.
`--robot_config.video_backend {auto,gst,sdk}`	Camera capture backend. Defaults to `sdk` here: inference streams no headset video, so the GPU-resident gst encode branch would be wasted. Pass `gst` (or `auto`) to opt into the zed-gstreamer pipeline anyway.
`--robot_config.axol_config.<side>.gripper.torque_limit FLOAT`	Max torque (Nm) for a gripper in POSITION_FORCE mode (default: 0.5); `<side>` is `left`/`right`
`--robot_config.axol_config.left_stiffness S`	Compliance↔stiffness blend for the left arm in `[0, 1]`. Scalar or 7-element list (one per arm joint, in `Joint` enum order). `0` = fully compliant; `1` = pre-tuning industrial gains; `0.5` (default) is the geometric mean. Should match the value used at data collection time. Use `right_stiffness` for the right arm. See `AxolConfig.left_stiffness`.
`--rerun_ip IP`	IP of a Rerun viewer on your local machine for live visualization
`--rerun_port INT`	Rerun viewer port (default: 9876); only used when `--rerun_ip` is set
`--log_level {DEBUG,INFO,WARNING,ERROR}`	Default: `INFO`
`--config_path PATH`	Load a whole-config JSON/YAML file; CLI overrides layer on top.
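
As a rough illustration of how `--actions_per_chunk` and `--chunk_size_threshold` interact, the sketch below assumes the client compares the fraction of a chunk still queued against the threshold. The helper name `should_send_observation` is hypothetical and is not part of LeRobot's API, and the exact comparison used upstream may differ.

```python
def should_send_observation(queue_len: int,
                            actions_per_chunk: int = 50,
                            chunk_size_threshold: float = 0.9) -> bool:
    """Return True once the queued actions fall to the threshold fraction of a chunk.

    Hypothetical helper: assumes the trigger is
    queue_len / actions_per_chunk <= chunk_size_threshold.
    """
    if actions_per_chunk <= 0:
        raise ValueError("actions_per_chunk must be positive")
    return queue_len / actions_per_chunk <= chunk_size_threshold


# With the defaults (50 actions per chunk, threshold 0.9), a new observation
# is requested once 45 or fewer actions remain in the queue.
for remaining in (50, 46, 45, 10, 0):
    print(remaining, should_send_observation(remaining))
```

Under this assumption, a higher threshold requests fresh observations more often, while a lower one lets the queue drain further before a new chunk is requested.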

axol run-policy --policy_path myorg/pick-place-policy --policy_type act --task "Pick the red cube" \
    --robot_config.cameras "{overhead: {serial: 41234567}, left_arm: {serial: 41234568}, right_arm: {serial: 41234569}}"

axol run-policy --policy_path ./checkpoints/epoch_100 --policy_type smolvla --task "Stack blocks" \
    --device cpu \
    --robot_config.cameras "{overhead: {serial: 41234567}, left_arm: {serial: 41234568}, right_arm: {serial: 41234569}}"

axol run-policy --policy_path myorg/pick-place-policy --policy_type act --task "Pick the red cube" \
    --server_host 192.168.1.99 \
    --robot_config.cameras "{overhead: {serial: 41234567}, left_arm: {serial: 41234568}, right_arm: {serial: 41234569}}"
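
When the command is launched from a script, the inline camera value can be generated rather than hand-quoted. The sketch below is only an illustration: it relies on the flag table's statement that `--robot_config.cameras` accepts one inline YAML/JSON value, and it has not been checked against draccus's parser. The serial numbers are the placeholder values from the examples above.

```python
import json
import shlex

# Placeholder serial numbers, copied from the examples above.
cameras = {
    "overhead": {"serial": 41234567},
    "left_arm": {"serial": 41234568},
    "right_arm": {"serial": 41234569},
}

argv = [
    "axol", "run-policy",
    "--policy_path", "myorg/pick-place-policy",
    "--policy_type", "act",
    "--task", "Pick the red cube",
    # The whole camera mapping is passed as a single JSON argument.
    "--robot_config.cameras", json.dumps(cameras),
]

# shlex.join quotes each argument so the line can be pasted into a shell.
command = shlex.join(argv)
print(command)
```

The `argv` list can also be handed to `subprocess.run` directly, which avoids shell quoting altogether.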

If --repo_id is supplied, each saved episode is appended to a LeRobot-format dataset using the same resume/refuse/wipe semantics as collect-data (resume a complete dataset, refuse an incomplete one, wipe a leftover empty directory). Between episodes the arms return to the rest pose via a collision-aware IK trajectory planned in a worker subprocess, mirroring the reset path used by collect-data.

Action chunk aggregation defaults to ACT’s Algorithm 2 (temporal_ensemble): every future timestep covered by the buffered chunks is the exponentially-weighted average across those chunks, with the gripper indices snapped to the newest contributing chunk so bang-bang grasp commands aren’t smeared. The control loop and the observation sender run on separate threads, which decouples the ~60-70 ms ZED read and gRPC send from the 60 Hz action stream; on the upstream single-threaded design that stream would otherwise collapse to ~27 Hz.
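
The weighting described above can be sketched for a single timestep as follows. This is a minimal illustration, not the project's implementation: the function name `temporal_ensemble` and its `gripper_indices` parameter are hypothetical, the chunk buffering is omitted, and the weights follow the `wᵢ = exp(-K·i)` rule from the flag table with `i=0` as the oldest chunk.

```python
import math
from typing import List, Sequence


def temporal_ensemble(predictions: Sequence[Sequence[float]],
                      coeff: float = 0.01,
                      gripper_indices: Sequence[int] = ()) -> List[float]:
    """Blend the actions that overlapping chunks predict for one timestep.

    predictions is ordered oldest chunk first, so index i = 0 gets weight
    exp(-coeff * 0) = 1 and newer chunks get exp(-coeff * i).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    weights = [math.exp(-coeff * i) for i in range(len(predictions))]
    total = sum(weights)
    dim = len(predictions[0])
    blended = [
        sum(w * p[d] for w, p in zip(weights, predictions)) / total
        for d in range(dim)
    ]
    # Gripper dimensions are not averaged: copy them from the newest chunk.
    newest = predictions[-1]
    for d in gripper_indices:
        blended[d] = newest[d]
    return blended


# Three chunks predict the same timestep; the last dimension is a gripper.
preds = [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0], [2.0, 2.0, 0.0]]
print(temporal_ensemble(preds, coeff=0.0, gripper_indices=[2]))  # [1.0, 1.0, 0.0]
```

With `coeff=0.0` the arm dimensions are a plain average, while the gripper dimension takes the newest chunk's value instead of the averaged 0.67. A positive coefficient shifts weight toward older chunks and a negative one toward newer chunks.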
