Robotics Simulation Initiative

Humanoid Robot Training Pipeline with NVIDIA Isaac Sim and Isaac Lab

We established a remote GPU workflow for training and reviewing humanoid robot behavior in simulation. The result is a practical foundation for moving from early locomotion experiments toward a repeatable robot learning pipeline.

Step-by-Step Remote Cloud Setup Guide

Because Isaac Sim and Isaac Lab cannot run natively on a Mac for this workflow, the recommended setup is to use the Mac as the control machine and run simulation, training, checkpoints, metrics, and videos on a remote Ubuntu GPU instance.

1. Rent or provision a remote GPU instance

2. Connect from the Mac

Use the Mac for SSH, editing, monitoring, and downloading results. VS Code Remote SSH is the easiest way to edit files directly on the server.

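# On the Mac: connect to the GPU server
ssh ubuntu@<GPU_SERVER_IP>

# On the server: install tmux and start a persistent session
sudo apt update
sudo apt install -y tmux
tmux new -s isaac-training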

3. Verify the GPU

Before installing Isaac Sim or Isaac Lab, confirm that the cloud machine can see the NVIDIA GPU.

nvidia-smi
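
As a hedged sketch, the same check can be wrapped in a small preflight function that fails with a clear message when the driver is missing. The names have_cmd and gpu_preflight are illustrative helpers, not part of any NVIDIA tool; the query fields are standard nvidia-smi options.

```shell
# have_cmd: succeed if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# gpu_preflight (hypothetical helper): print one CSV line per visible GPU,
# or return 1 with a message if nvidia-smi is not available.
gpu_preflight() {
  if ! have_cmd nvidia-smi; then
    echo "nvidia-smi not found: NVIDIA driver missing or not on PATH" >&2
    return 1
  fi
  nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader
}
```

Calling gpu_preflight before the install steps stops the setup early on an instance that was provisioned without a working driver.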

4. Install base system tools

sudo apt update
sudo apt install -y git git-lfs cmake build-essential wget curl unzip ffmpeg

git lfs install

5. Create a clean Python environment

Use a dedicated Python environment so Isaac Sim, Isaac Lab, PyTorch, and reinforcement learning dependencies stay isolated.

conda create -n env_isaaclab python=3.11 -y
conda activate env_isaaclab
pip install --upgrade pip

6. Install Isaac Sim and PyTorch

pip install "isaacsim[all,extscache]==5.1.0" --extra-index-url https://pypi.nvidia.com
pip install -U torch==2.7.0 torchvision==0.22.0 --index-url https://download.pytorch.org/whl/cu128
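
After the install, it can help to confirm that this PyTorch build actually sees the GPU. The following is a minimal sketch; torch_cuda_check is a hypothetical helper name, and it assumes that python resolves to the interpreter of the activated env_isaaclab environment.

```shell
# torch_cuda_check (hypothetical helper): print the PyTorch and CUDA build
# versions, then exit 0 only if torch reports a usable CUDA device.
# $1 is the interpreter to use; it defaults to "python".
torch_cuda_check() {
  py=${1:-python}
  "$py" -c 'import sys, torch
print(torch.__version__, torch.version.cuda)
sys.exit(0 if torch.cuda.is_available() else 1)'
}
```

A non-zero exit status means either that torch is not importable from that interpreter or that no CUDA device is visible to it.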

7. Launch Isaac Sim once

Run Isaac Sim once to download and cache required extensions. The first launch can take several minutes and may ask you to accept the NVIDIA Omniverse license agreement.

isaacsim

# If prompted, type: Yes

8. Clone and install Isaac Lab

git clone https://github.com/isaac-sim/IsaacLab.git --branch main
cd IsaacLab
./isaaclab.sh --install

9. Verify Isaac Lab

./isaaclab.sh -p scripts/tutorials/00_sim/create_empty.py

# Stop with Ctrl+C after the simulator launches successfully.

10. Run a first headless training test

Start with a known built-in task to confirm the reinforcement learning loop works before running humanoid training.

./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task=Isaac-Ant-v0 --headless

11. Find and run the Unitree G1 task

List installed G1 environments first, then use the exact task ID returned by the command.

./isaaclab.sh -p scripts/environments/list_envs.py --keyword G1

./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task=<G1_TASK_ID> --headless
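
To avoid copying the task ID by hand, the listing can be filtered in the shell. This is a sketch under an assumption: that each ID appears in the list_envs.py output as a token of the form Isaac-...-G1-..., which matches the IDs used later in this document. The helper name g1_task_ids is hypothetical.

```shell
# g1_task_ids (hypothetical helper): read list_envs.py output on stdin and
# print the unique task IDs that contain "G1", one per line.
# Assumes IDs start with "Isaac-" and contain only letters, digits, and "-".
g1_task_ids() {
  grep -oE 'Isaac-[A-Za-z0-9-]*G1[A-Za-z0-9-]*' | sort -u
}
```

Example use: ./isaaclab.sh -p scripts/environments/list_envs.py --keyword G1 | g1_task_ids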

12. Save and review results

# Run on the Mac, not on the server: copy the training logs to a local folder
scp -r ubuntu@<GPU_SERVER_IP>:/path/to/IsaacLab/logs ./isaaclab-logs

13. Stop the instance when idle

Cloud GPU time can become expensive, so stop the instance when it is not actively installing, training, or generating review artifacts. Keep persistent storage attached so the work can resume later.

Why We Built This

Humanoid robotics requires a large amount of simulation, iteration, and validation before any policy should be considered for real hardware. Local workstations can become expensive and hard to scale, so we used a cloud GPU setup to make high-performance simulation available without committing to a dedicated desktop machine.

The first objective was intentionally narrow: prove that we can run a modern humanoid environment, start reinforcement learning, and capture visual evidence of training progress. That gives the team a working baseline before adding more complex tasks such as rough terrain, manipulation, perception, and language-conditioned behavior.

What We Set Up

Remote GPU Workstation

A cloud GPU instance acts as the simulation and training machine, giving us access to accelerated physics, rendering, and learning workloads.

Isaac Sim Runtime

Isaac Sim provides the robotics simulation environment, including physics, robot assets, scene rendering, and the underlying Omniverse runtime.

Isaac Lab Training Layer

Isaac Lab provides the robot learning framework used to run reinforcement learning environments, collect metrics, and produce trained policies.

What We Accomplished

Remote simulation is working: The cloud GPU instance can launch Isaac Sim through Isaac Lab and run robot simulation workloads.
Humanoid environment is running: We selected the Unitree G1 humanoid locomotion environment as the initial training target.
Training has started: The reinforcement learning loop is producing learning iterations, rewards, episode statistics, and task metrics.
Visual review is possible: We can record training videos and inspect behavior after or during short debug runs.

Key Design Choices

Start with a known humanoid model

We chose a built-in Unitree G1 environment rather than starting with a custom robot. This reduces early uncertainty and lets us validate the simulation, learning, and review workflow before introducing custom mechanical design decisions.

Use headless training for performance

The main training loop should run without an interactive viewer. This keeps GPU resources focused on simulation and learning instead of real-time rendering.

Use recorded videos for review

Cloud livestreaming can be sensitive to networking and firewall constraints. Recording videos gives the team a reliable way to review behavior, share progress, and compare policy quality across runs.

The important milestone is not visual polish; it is that the pipeline can repeatedly train a humanoid policy, produce metrics, and generate artifacts the team can evaluate.

Pipeline We Are Building Toward

  1. Robot Model: Begin with an existing humanoid asset, then later evaluate custom robot descriptions and hardware constraints.
  2. Simulation Tasks: Define locomotion, balance, terrain, manipulation, and recovery tasks inside simulation.
  3. Policy Training: Train policies using reinforcement learning, then compare runs using reward curves and behavior videos.
  4. Evaluation: Stress-test trained policies across scenarios, terrain variation, disturbances, and randomized conditions.
  5. Deployment Path: Prepare the bridge from simulation policies to robot software, hardware testing, and eventually real-world deployment.

Expected Next Steps

  1. Continue training the baseline Unitree G1 locomotion task until it produces a useful checkpoint.
  2. Record videos from trained checkpoints so the team can evaluate qualitative progress over time.
  3. Track metrics across runs, including reward, episode length, velocity tracking error, and termination causes.
  4. Move from flat-ground walking to rough-terrain locomotion once baseline walking improves.
  5. Add domain randomization to make policies less brittle and more relevant to real-world variation.
  6. Evaluate manipulation and whole-body tasks using additional open-source humanoid environments and Unitree task examples.
  7. Define the eventual robot software interface, likely using ROS 2 for integration with sensors, control, and hardware systems.
  8. Explore higher-level humanoid behavior models, including NVIDIA GR00T-style workflows, after the low-level simulation and training pipeline is reliable.

Why This Matters

This setup gives the team a repeatable way to learn, test, and review humanoid robot behavior before taking on the cost and risk of physical hardware. It also creates a foundation for disciplined experimentation: every change to the robot model, training task, reward function, or environment can be evaluated through metrics and videos.

The current state is an early but meaningful milestone: the training infrastructure works, a humanoid task is running, and we have a way to inspect the results. The next phase is to turn this into a managed experimentation pipeline with saved checkpoints, comparable runs, richer tasks, and a clear path toward sim-to-real validation.

NVIDIA Isaac Tutorial

End-to-End Cloud Tutorial: Running Isaac Sim and Isaac Lab from a Mac

Goal: use a Mac only as the browser/control machine while NVIDIA Brev runs Isaac Sim, Isaac Lab, training jobs, checkpoints, and livestreamed simulation on a remote NVIDIA GPU instance.

Recommended setup: NVIDIA Brev → Isaac Launchable by sreetz → browser-based VS Code → Isaac Sim viewer at /viewer → Isaac Lab training and playback commands from the cloud terminal.

1. Choose the right cloud launchable

In NVIDIA Brev Launchables, select Isaac Launchable. This option is designed to provide Isaac Sim and Isaac Lab in a browser-based workflow, with one tab for VS Code and one tab for the streamed Isaac Sim user interface.

2. Deploy the instance

  1. Click Deploy Now or Deploy Launchable.
  2. Wait until Brev reports that the instance is running and built, and that setup has completed.
  3. Open the Brev instance page.
  4. Find the Using Secure Links section.
  5. Open the shareable URL and log in with your NVIDIA Brev account.
  6. You should land inside a browser-based VS Code environment.

3. Understand the two browser tabs

# Example VS Code URL:
https://isaac-pupyzgohq.brevlab.com/?folder=/workspace

# Matching viewer URL:
https://isaac-pupyzgohq.brevlab.com/viewer

Important: keep only one /viewer tab open. The launchable is intended for a single viewer session at a time.
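
In the example above, the two tabs share a host and differ only in the path. A minimal sketch of deriving the viewer URL from the VS Code URL, assuming that pattern holds for your instance (viewer_url is a hypothetical helper name):

```shell
# viewer_url (hypothetical helper): keep the scheme and host of the given
# URL, drop any path or query string, and append /viewer.
viewer_url() {
  printf '%s\n' "$1" | sed -E 's#^(https?://[^/?]+).*$#\1/viewer#'
}
```

For the example instance, viewer_url 'https://isaac-pupyzgohq.brevlab.com/?folder=/workspace' prints https://isaac-pupyzgohq.brevlab.com/viewer.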

4. Accept the Isaac Sim license

The first time Isaac Sim runs, it may refuse to start until the NVIDIA Isaac Sim Additional Software and Materials License is accepted through an environment variable.

export ACCEPT_EULA=Y

To make this persist for future terminals in the same environment, add it to ~/.bashrc.

echo 'export ACCEPT_EULA=Y' >> ~/.bashrc
source ~/.bashrc
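
Running the echo line more than once appends a duplicate export to ~/.bashrc each time. A minimal sketch of an idempotent variant follows; append_once is a hypothetical helper name, not an existing tool.

```shell
# append_once (hypothetical helper): append LINE to FILE only if FILE does
# not already contain exactly that line.
# -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string.
append_once() {
  line=$1
  file=$2
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

Example use: append_once 'export ACCEPT_EULA=Y' ~/.bashrc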

5. Start plain Isaac Sim and verify streaming

Use this only when you want to open the Isaac Sim UI by itself. This is not required during headless training.

/isaac-sim/runheadless.sh

Wait until the terminal prints a line similar to:

[18.942s] app ready

Then open or refresh the viewer tab:

https://isaac-pupyzgohq.brevlab.com/viewer

If Isaac Sim is already running from another terminal, do not start another copy. Only one Isaac Sim or Isaac Lab streaming process should run at a time.
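
Instead of watching the terminal for the ready line, the wait can be scripted. This sketch assumes the Isaac Sim output has been redirected to a log file; the helper name wait_for_text and the log path in the usage note are hypothetical.

```shell
# wait_for_text FILE TEXT [TIMEOUT_SECONDS] (hypothetical helper):
# poll FILE once per second until it contains TEXT as a fixed string.
# Returns 0 when the text is found, 1 when the timeout expires.
wait_for_text() {
  file=$1
  text=$2
  remaining=${3:-300}
  while :; do
    if grep -qF -- "$text" "$file" 2>/dev/null; then
      return 0
    fi
    if [ "$remaining" -le 0 ]; then
      return 1
    fi
    sleep 1
    remaining=$((remaining - 1))
  done
}
```

Example use, with an assumed log location: start Isaac Sim with /isaac-sim/runheadless.sh > /tmp/isaacsim.log 2>&1 & and then run wait_for_text /tmp/isaacsim.log 'app ready' 600 before refreshing the viewer tab.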

6. Stop plain Isaac Sim before policy playback

When moving from plain Isaac Sim to an Isaac Lab playback command, stop the existing /isaac-sim/runheadless.sh process first.

# In the terminal running /isaac-sim/runheadless.sh:
Ctrl+C

For policy playback, /isaac-sim/runheadless.sh should not be running anywhere. The play.py command launches its own Isaac Sim session with livestreaming enabled.

7. Run a quick training smoke test with Isaac Ant

Before training a humanoid, run the lightweight Ant task. Isaac Ant is a simple four-legged reinforcement-learning benchmark robot with eight actuated joints. It trains quickly and proves the cloud pipeline works.

export ACCEPT_EULA=Y
python isaaclab/scripts/reinforcement_learning/skrl/train.py --task=Isaac-Ant-v0 --headless

A successful run shows environment setup messages, SKRL/PPO logging, and a progress bar similar to:

[INFO]: Completed setting up the environment...
[skrl:INFO] Environment wrapper: Isaac Lab (single-agent)
100%|████████████████████████| 36000/36000 [...]

8. Confirm checkpoints were created

After training, check that logs and model checkpoints were saved under /workspace/logs.

find /workspace -type d -name "logs" 2>/dev/null
find /workspace -type f -name "*.pt" 2>/dev/null | head

For the Ant smoke test, a successful result should look similar to:

/workspace/logs/skrl/ant/<timestamp>_ppo_torch/checkpoints/best_agent.pt
/workspace/logs/skrl/ant/<timestamp>_ppo_torch/checkpoints/agent_800.pt
/workspace/logs/skrl/ant/<timestamp>_ppo_torch/checkpoints/agent_1600.pt
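
Note that a plain alphabetical sort places agent_800.pt after agent_1600.pt, so picking the most recent numbered checkpoint needs a numeric sort. A minimal sketch, assuming the agent_<step>.pt naming shown above (latest_numbered_checkpoint is a hypothetical helper name):

```shell
# latest_numbered_checkpoint DIR (hypothetical helper): print the path of
# the agent_<step>.pt file with the highest step number in DIR.
# Prints nothing if DIR has no such files. best_agent.pt is ignored
# because its name has no step number.
latest_numbered_checkpoint() (
  dir=$1
  for f in "$dir"/agent_[0-9]*.pt; do
    [ -e "$f" ] || continue
    n=${f##*/agent_}   # strip the directory and the "agent_" prefix
    n=${n%.pt}         # strip the extension, leaving the step number
    printf '%s %s\n' "$n" "$f"
  done | sort -n | tail -n 1 | cut -d' ' -f2-
)
```

Example use: latest_numbered_checkpoint "/workspace/logs/skrl/ant/<timestamp>_ppo_torch/checkpoints", with the placeholder replaced by a real run folder.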

9. Play back the trained Ant policy

Stop any existing Isaac Sim process first, then run the Ant playback command with livestreaming.

export ACCEPT_EULA=Y
python isaaclab/scripts/reinforcement_learning/skrl/play.py --task=Isaac-Ant-v0 --livestream 2 \
  --checkpoint /workspace/logs/skrl/ant/<timestamp>_ppo_torch/checkpoints/best_agent.pt

After the terminal prints app ready or Simulation App Startup Complete, refresh the viewer tab at /viewer.

10. Find the Unitree G1 humanoid tasks

Once the Ant test works, list the available G1 environments. The tested launchable included both flat-ground and rough-terrain Unitree G1 locomotion tasks.

python isaaclab/scripts/environments/list_envs.py --keyword G1

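Relevant task IDs from the tested environment (the ones used in this tutorial):

Isaac-Velocity-Flat-G1-v0
Isaac-Velocity-Flat-G1-Play-v0
Isaac-Velocity-Rough-G1-v0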

Start with Isaac-Velocity-Flat-G1-v0. Flat-ground walking is the clean baseline before moving to rough terrain.

11. Train the Unitree G1 flat-ground locomotion task

Run the G1 training task headlessly. Headless training keeps GPU resources focused on simulation and reinforcement learning instead of rendering.

export ACCEPT_EULA=Y
python isaaclab/scripts/reinforcement_learning/skrl/train.py --task=Isaac-Velocity-Flat-G1-v0 --headless

This trains a policy for Unitree G1 humanoid flat-ground velocity tracking: the simulated humanoid learns to follow movement commands on flat terrain.

12. Keep training alive after closing the browser

Closing the browser is okay only if the training process keeps running on the cloud machine. Use tmux so the job survives browser disconnects.

tmux new -s g1-training

export ACCEPT_EULA=Y
python isaaclab/scripts/reinforcement_learning/skrl/train.py --task=Isaac-Velocity-Flat-G1-v0 --headless

Detach from tmux before closing the browser:

Ctrl+B, then D

Later, reconnect to the training session:

tmux attach -t g1-training

Do not stop the Brev instance while training. Closing the browser is fine; stopping the instance stops the job.
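
The session can also be created already detached, with the training command running inside it, so there is no need to detach by hand. This is a sketch rather than the tested workflow: the helper names and the g1-training.log file name are assumptions, while new-session -d -s and has-session -t are standard tmux commands.

```shell
# session_exists NAME: succeed if a tmux session with that name exists.
session_exists() {
  tmux has-session -t "$1" 2>/dev/null
}

# start_training_session NAME (hypothetical helper): launch G1 training in
# a new detached tmux session and mirror its output to g1-training.log.
# Refuses to start if a session with the same name is already running.
start_training_session() {
  name=$1
  if session_exists "$name"; then
    echo "tmux session '$name' already exists; use: tmux attach -t $name" >&2
    return 1
  fi
  tmux new-session -d -s "$name" \
    "export ACCEPT_EULA=Y; python isaaclab/scripts/reinforcement_learning/skrl/train.py --task=Isaac-Velocity-Flat-G1-v0 --headless 2>&1 | tee g1-training.log"
}
```

Example use: start_training_session g1-training, then tmux attach -t g1-training to watch progress. The existence check avoids starting a second training job that would compete for the same GPU.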

13. Find the G1 checkpoint

After G1 training completes, locate the best checkpoint.

find /workspace/logs -type f -name "best_agent.pt" | grep -i g1

If the folder names do not contain g1, inspect the latest SKRL log folders:

find /workspace/logs/skrl -maxdepth 3 -type d | sort | tail -50
find /workspace/logs -type f -name "best_agent.pt" | sort | tail -10

14. Play back the trained G1 policy

Stop training and any other Isaac Sim process first, then run the G1 playback task with livestreaming and the trained checkpoint.

export ACCEPT_EULA=Y
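# Timestamped run folders sort chronologically, so after sorting the last match is the newest run
G1_CKPT=$(find /workspace/logs -type f -name "best_agent.pt" | grep -i g1 | sort | tail -n 1)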

python isaaclab/scripts/reinforcement_learning/skrl/play.py \
  --task=Isaac-Velocity-Flat-G1-Play-v0 \
  --livestream 2 \
  --checkpoint "$G1_CKPT"

Wait for the terminal to report that the app is ready, then refresh the viewer:

https://isaac-pupyzgohq.brevlab.com/viewer
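
If the find command matches nothing, G1_CKPT is empty and the playback command is given an empty checkpoint path. A small guard, sketched here with the hypothetical helper name require_checkpoint, catches that case before Isaac Sim starts.

```shell
# require_checkpoint PATH (hypothetical helper): succeed only if PATH is
# non-empty and names an existing file with a size greater than zero.
require_checkpoint() {
  if [ -z "$1" ]; then
    echo "no checkpoint path given (did find return nothing?)" >&2
    return 1
  fi
  if [ ! -s "$1" ]; then
    echo "checkpoint missing or empty: $1" >&2
    return 1
  fi
}
```

Example use: require_checkpoint "$G1_CKPT" || exit 1, placed between the G1_CKPT assignment and the play.py command.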

15. View G1 without training

To see the G1 task without actively training, run the play task. This launches the simulation in viewer mode. Use a checkpoint if you want to see a trained policy; omit the checkpoint only for basic environment visualization.

export ACCEPT_EULA=Y
python isaaclab/scripts/reinforcement_learning/skrl/play.py --task=Isaac-Velocity-Flat-G1-Play-v0 --livestream 2

16. Try different policies, libraries, and experiments

In this workflow, “trying different policies” can mean changing the RL backend, the task, or the hyperparameters.

First, inspect which RL training backends are installed:

ls isaaclab/scripts/reinforcement_learning

Then compare runs. Example experiment matrix:

# Same task, different seeds
python isaaclab/scripts/reinforcement_learning/skrl/train.py --task=Isaac-Velocity-Flat-G1-v0 --headless --seed 1
python isaaclab/scripts/reinforcement_learning/skrl/train.py --task=Isaac-Velocity-Flat-G1-v0 --headless --seed 2
python isaaclab/scripts/reinforcement_learning/skrl/train.py --task=Isaac-Velocity-Flat-G1-v0 --headless --seed 3

# Different backend, if installed
python isaaclab/scripts/reinforcement_learning/rsl_rl/train.py --task=Isaac-Velocity-Flat-G1-v0 --headless

# Harder terrain task
python isaaclab/scripts/reinforcement_learning/skrl/train.py --task=Isaac-Velocity-Rough-G1-v0 --headless
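
The seed runs above can be written as a loop. In this sketch the first argument is the program that runs the script: python for a real run, or echo for a dry run that only prints the commands. The helper name train_seeds is hypothetical; the script path, task ID, and flags are the ones used above.

```shell
# train_seeds RUNNER SEED... (hypothetical helper): run the G1 flat-ground
# training once per seed, one after another, stopping at the first failure.
train_seeds() {
  runner=$1
  shift
  for seed in "$@"; do
    "$runner" isaaclab/scripts/reinforcement_learning/skrl/train.py \
      --task=Isaac-Velocity-Flat-G1-v0 --headless --seed "$seed" || return 1
  done
}
```

Example use: train_seeds echo 1 2 3 to preview the three commands, then train_seeds python 1 2 3 inside tmux. The runs are sequential so they do not compete for the same GPU.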

17. Useful inspection commands

# Show available environments
python isaaclab/scripts/environments/list_envs.py

# Search for humanoid-related environments
python isaaclab/scripts/environments/list_envs.py --keyword G1
python isaaclab/scripts/environments/list_envs.py --keyword H1
python isaaclab/scripts/environments/list_envs.py --keyword Humanoid
python isaaclab/scripts/environments/list_envs.py --keyword Unitree

# Find checkpoints
find /workspace/logs -type f -name "*.pt" | sort | tail -20

# Find best checkpoints
find /workspace/logs -type f -name "best_agent.pt" | sort

# Check GPU
nvidia-smi

18. Troubleshooting notes from the working session

19. Cost control checklist

20. Proven pipeline status

This workflow successfully demonstrated the full cloud robotics loop: launch Isaac Sim in the browser, train Isaac Ant, save checkpoints, play back a trained policy, list G1 environments, train Unitree G1 flat-ground locomotion, and view the G1 policy through the streamed Isaac Sim renderer.

Milestone achieved: Mac → NVIDIA Brev → Isaac Launchable → Isaac Lab training → checkpoint artifacts → Isaac Sim livestream playback.