Sim-to-Real

Sim-to-real is the set of techniques used to make a policy trained in simulation work on real hardware. It includes domain randomisation, system identification, residual learning, world-model-based fine-tuning, and the broader engineering practice of closing the reality gap between a simulator's contact, sensor, and actuator models and the physical robot.

From an engineering standpoint, sim-to-real is the discipline that decides whether all the upstream investment in simulators, datasets, and training compute actually produces a deployable system. The reality gap is rarely a single problem: it is a stack of small mismatches in dynamics, latency, sensor noise, and actuator behaviour, many of which only become visible under closed-loop control. Successful pipelines treat the gap as something to measure, not just bridge, for example by replaying logged real-robot commands in the simulator and comparing the resulting trajectories.
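One way to make the gap measurable is to replay the same commands in simulation and on hardware, then compare the two logs. The sketch below is a minimal illustration under assumptions that are not from any particular pipeline: the signals are plain Python lists sampled at the same rate, the function names (rmse, estimate_delay) are hypothetical, and the "real" signal is a synthetic stand-in built by delaying and scaling the simulated one.

```python
import math


def rmse(sim, real):
    """Root-mean-square error between two equal-length signals."""
    if len(sim) != len(real):
        raise ValueError("signals must have equal length")
    return math.sqrt(sum((s - r) ** 2 for s, r in zip(sim, real)) / len(sim))


def estimate_delay(reference, delayed, max_shift):
    """Return (shift, error): the shift in steps, from 0 to max_shift,
    that minimises the mean squared error between reference[i] and
    delayed[i + shift]."""
    best_shift, best_err = 0, float("inf")
    for shift in range(max_shift + 1):
        n = len(reference) - shift
        err = sum((reference[i] - delayed[i + shift]) ** 2 for i in range(n)) / n
        if err < best_err:
            best_shift, best_err = shift, err
    return best_shift, best_err


# Synthetic stand-in for logged data (assumption: no hardware log is available here).
sim_signal = [math.sin(0.2 * t) for t in range(200)]
# "Real" signal: same motion, 3 steps late and with 10% lower amplitude.
real_signal = [0.0] * 3 + [0.9 * s for s in sim_signal[:-3]]

delay, _ = estimate_delay(sim_signal, real_signal, max_shift=10)
raw_gap = rmse(sim_signal, real_signal)
# Gap that remains once the latency mismatch is compensated.
aligned_gap = rmse(sim_signal[: len(sim_signal) - delay], real_signal[delay:])
print(f"estimated delay: {delay} steps, raw RMSE: {raw_gap:.3f}, aligned RMSE: {aligned_gap:.3f}")
```

Separating the latency term from the residual error in this way shows which mismatch dominates, which in turn suggests whether to fix the simulator's delay model or its dynamics first.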

When choosing between techniques, match the gap source to the method: dynamics gaps respond well to domain randomisation and system identification; perception gaps respond to photoreal rendering and randomised sensor noise; long-tail behavioural gaps often need real-world fine-tuning or world-model adaptation. Always pair sim-to-real claims with at least one real-robot evaluation under controlled perturbations.
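As a rough illustration of randomising dynamics and sensor noise, the sketch below draws a fresh set of simulator parameters at the start of every training episode. Everything named here is an assumption made for illustration: the parameter set (mass scale, friction, action delay, observation noise), the ranges, and the helper names are hypothetical and would need to be chosen from measurements of the actual robot.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class EpisodeParams:
    mass_scale: float        # multiplier on nominal link masses
    friction: float          # contact friction coefficient
    action_delay_steps: int  # actuator latency in control steps
    obs_noise_std: float     # std of additive Gaussian sensor noise


# Hypothetical ranges; in practice these come from system identification
# or from the measured spread across hardware units.
RANGES = {
    "mass_scale": (0.8, 1.2),
    "friction": (0.5, 1.25),
    "action_delay_steps": (0, 3),
    "obs_noise_std": (0.0, 0.02),
}


def sample_episode_params(rng):
    """Draw one parameter set uniformly from RANGES (one draw per episode)."""
    return EpisodeParams(
        mass_scale=rng.uniform(*RANGES["mass_scale"]),
        friction=rng.uniform(*RANGES["friction"]),
        action_delay_steps=rng.randint(*RANGES["action_delay_steps"]),
        obs_noise_std=rng.uniform(*RANGES["obs_noise_std"]),
    )


def noisy_observation(obs, params, rng):
    """Add zero-mean Gaussian noise to every observation channel."""
    return [x + rng.gauss(0.0, params.obs_noise_std) for x in obs]


rng = random.Random(0)  # seeded so training runs are reproducible
params = sample_episode_params(rng)
obs = noisy_observation([0.1, -0.3, 0.7], params, rng)
print(params)
print(obs)
```

Uniform sampling is the simplest choice; narrowing the ranges with measured data, or widening them on a schedule, are common variations.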

Start here

Domain Randomization (Tobin et al.) is the foundational reference for the field. Read it first, even when using more modern methods, because much of the later work refines or responds to the framing it introduced.

Domain Randomization (Tobin et al.) — Foundational paper on randomising simulated rendering (textures, lighting, camera pose) so that a vision model trained only in simulation transfers zero-shot to real images.
Learning Dexterous In-Hand Manipulation (OpenAI) — Sim-to-real dexterous manipulation via extensive domain randomisation of physics and visuals.
Sim-to-Real via Sim-to-Sim (James et al.) — Bridging the reality gap for grasping by translating randomised simulated images and real images into a common canonical simulated view (RCAN).
Eureka (NVIDIA) — LLM-driven reward design for dexterous skills, demonstrated in simulation; sim-to-real transfer is addressed by the follow-up DrEureka.
DextrAH-G — Sim-to-real dexterous arm-hand grasping pipeline using GPU-parallel RL.
Real-World Humanoid Locomotion (Radosavovic et al.) — Sim-to-real RL for humanoid walking, with a transformer policy trained in simulation and deployed zero-shot on hardware.
DeXtreme (NVIDIA) — Sim-to-real dexterous in-hand manipulation on the Allegro hand using massively parallel GPU simulation.
Automatic Domain Randomization (OpenAI, Solving Rubik's Cube with a Robot Hand) — Curriculum-style strategy that widens randomisation ranges automatically as the policy improves, reducing manual tuning.
SimOpt — Simulation parameter optimization framework for reducing real-world mismatch.
BayesSim — Likelihood-free Bayesian inference of a posterior over simulator parameters, used for adaptive domain randomisation and data-efficient sim-to-real adaptation.
Residual Reinforcement Learning for Robot Control — Combines model-based controllers with learned residuals for stable transfer.
Learning High-Speed Flight in the Wild (Loquercio et al.) — Sim-to-real pipeline for high-speed quadrotor flight through cluttered real-world environments.
Learning Robust Perceptive Locomotion (Miki et al.) — Science Robotics result combining proprioceptive and exteroceptive perception through privileged teacher-student training for robust real-world transfer.
RMA: Rapid Motor Adaptation for Legged Robots (Kumar et al.) — Trains with privileged simulation signals, then distils them into an adaptation module that estimates them from onboard history for robust real-world control.
SimGAN — Sim-to-real visual adaptation method for narrowing sensor-domain gaps.
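Several entries above (SimOpt, BayesSim) tune simulator parameters against real data. The sketch below is a much simpler stand-in for that idea, not the method of either paper: a one-parameter grid search that picks the damping coefficient whose simulated rollout best matches a logged velocity trace. The toy model dv/dt = -damping * v, the function names, and the "real" log (generated here from the same model with damping 0.7) are all assumptions made for illustration.

```python
def simulate_velocity(v0, damping, dt, steps):
    """Explicit-Euler rollout of dv/dt = -damping * v (toy coasting model)."""
    v, out = v0, []
    for _ in range(steps):
        out.append(v)
        v += -damping * v * dt
    return out


def trajectory_error(damping, real_log, dt):
    """Mean squared error between a simulated rollout and the logged trace."""
    sim = simulate_velocity(real_log[0], damping, dt, len(real_log))
    return sum((s - r) ** 2 for s, r in zip(sim, real_log)) / len(real_log)


def identify_damping(real_log, dt, candidates):
    """Return the candidate damping value with the lowest trajectory error."""
    return min(candidates, key=lambda d: trajectory_error(d, real_log, dt))


DT = 0.01
# Stand-in for a hardware log (assumption): generated with damping 0.7.
real_log = simulate_velocity(v0=1.0, damping=0.7, dt=DT, steps=300)
candidates = [i / 100 for i in range(10, 151)]  # 0.10, 0.11, ..., 1.50
best = identify_damping(real_log, DT, candidates)
print(f"identified damping: {best:.2f}")
```

With noisy real logs and many parameters, the grid search would normally be replaced by a gradient-free optimiser or by posterior inference, but the structure stays the same: simulate, compare against the log, adjust.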