TARLoco
🐾 Teacher-Aligned Representations via Contrastive Learning for Quadrupedal Locomotion
Amr Mousa¹, Neil Karavis², Michele Caprio¹, Wei Pan¹, Richard Allmendinger¹
¹ University of Manchester, UK | ² BAE Systems, UK
Conference: IROS 2025 (Accepted)
📢 News
- 2025‑08‑18 — Project website goes live with videos and results.
- 2025‑06‑15 — Accepted at IROS 2025.
📌 Abstract
Quadrupedal locomotion via reinforcement learning (RL) is commonly addressed using the teacher–student paradigm, where a privileged teacher guides a proprioceptive student policy. However, key challenges such as representation misalignment between privileged teacher and proprioceptive‑only student, covariate shift due to behavioral cloning, and lack of deployable adaptation lead to poor generalization in real‑world scenarios. We propose Teacher‑Aligned Representations via Contrastive Learning (TAR), a framework that leverages privileged information with self‑supervised contrastive learning to bridge this gap. By distilling from a privileged teacher in simulation and constructing structured latent spaces through contrastive objectives, our student policy surpasses the fully privileged “Teacher” and exhibits robust generalization to out‑of‑distribution (OOD) scenarios. Results showed 2× faster training to reach peak performance compared to state‑of‑the‑art baselines, and ~40% better OOD generalization on average. Additionally, TAR transitions seamlessly into privileged‑free fine‑tuning during deployment, enabling continual adaptation in the real world.
⚙️ Training Framework Overview
The core idea of our method is to leverage contrastive learning to align latent representations between a privileged teacher and a proprioceptive student within an RL paradigm. By structuring a shared latent space, the student exploits the teacher’s privileged signals during training, which improves generalization and sim2real transfer. At deployment, the student operates with proprioception only, maintaining robust performance in diverse and dynamic environments.

Pipeline summary
- Teacher encoder consumes privileged states $S_t$ to produce structured embeddings $Z^{T}_t$.
- Student encoder consumes proprioceptive inputs $O_t$ and hidden state $h_{t-1}$ to produce $Z^{S}_t$.
- Contrastive alignment (triplet loss): the student’s next-state prediction $\tilde{Z}^{+}_{t+1}$ is pulled towards the teacher’s future code $Z_{t+1}$ and away from negatives $Z^{-}_{t+1}$ sampled from other contexts.
- Policy optimization: actor–critic is trained with policy gradients; the critic additionally leverages the contrastive signal for representation shaping.
- Velocity estimator: trained via regression and frozen post-training to stabilize deployment.
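The contrastive alignment step above can be illustrated with a generic hinge-style triplet loss. This is only a minimal sketch, not the authors' implementation: the squared Euclidean distance, the margin of 1.0, the 3-dimensional latent codes, and all function and variable names are assumptions chosen for illustration.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The distance function and margin are assumptions; the page does not
    specify which ones TAR uses.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


# Hypothetical 3-D latent codes, for illustration only.
student_pred = [0.9, 0.1, 0.0]    # student's next-state prediction (anchor)
teacher_next = [1.0, 0.0, 0.0]    # teacher's future code (positive)
other_context = [-1.0, 0.5, 0.2]  # code sampled from another context (negative)

# Prediction close to the teacher code and far from the negative: zero loss.
loss_aligned = triplet_loss(student_pred, teacher_next, other_context)

# Swapping positive and negative shows the penalty for a misaligned prediction.
loss_swapped = triplet_loss(student_pred, other_context, teacher_next)

print(loss_aligned, loss_swapped)
```

In this toy case the loss is zero once the prediction sits closer to the teacher's code than to the negative by at least the margin, which is the "pull towards, push away" behavior described in the pipeline summary.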
Design goals
- Robust latent structure that transfers to diverse terrains and dynamics.
- Student policy that remains privileged-free at test time without performance collapse.
🎬 Evaluation
In-Distribution Testing
- Ours: error = 0.29
- HIM: error = 0.32 (-10.3%)
- SLR: error = 0.42 (-44.8%)
Out-of-Distribution Testing
- Ours: error = 0.39
- HIM: error = 0.47 (-21%)
- SLR: error = 0.63 (-64.53%)
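The percentages next to the baseline errors appear to be gaps relative to "Ours". The sketch below recomputes them under that assumption, using only the rounded error values listed above; the formula and all names are illustrative, not taken from the paper. Small differences from the reported figures are to be expected if the originals were computed from unrounded errors.

```python
def relative_gap_percent(baseline_error, reference_error):
    """Percent by which a baseline's error exceeds the reference error."""
    return 100.0 * (baseline_error - reference_error) / reference_error


# Rounded error values as listed on this page.
errors = {
    "in_distribution": {"Ours": 0.29, "HIM": 0.32, "SLR": 0.42},
    "out_of_distribution": {"Ours": 0.39, "HIM": 0.47, "SLR": 0.63},
}

# Gap of each baseline relative to "Ours", rounded to one decimal place.
gaps = {
    split: {
        name: round(relative_gap_percent(err, values["Ours"]), 1)
        for name, err in values.items()
        if name != "Ours"
    }
    for split, values in errors.items()
}

for split, split_gaps in gaps.items():
    for name, gap in split_gaps.items():
        print(f"{split:>20} | {name}: {gap}% higher error than Ours")
```

With the rounded inputs, the in-distribution gaps come out at 10.3% and 44.8%, matching the page, while the out-of-distribution gaps come out at 20.5% and 61.5%, slightly different from the reported 21% and 64.53%.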
🐾 Real-World Deployment
The videos below showcase TARLoco in action on the Unitree Go2 robot, completely BLIND 🧑🏻🦯.
Dense Vegetation
Different Terrains
High-Step Descent
External Pushes
Soft Mattress
10kg Payload
Joint Degradation
* Simulating actuator degradation by reducing the joint torque by 90%. Inspired by ADAPT—but without custom policy training, we just let the robot figure it out 😎!
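One simple way to emulate this kind of degradation is to scale the commanded torque of the affected joints before it reaches the actuators. The sketch below is a hypothetical illustration under that assumption; the function name, the joint indexing, and the example torque values are not from the project's code, and the page does not say which joints were degraded.

```python
def degrade_torques(torques, joint_indices, reduction=0.9):
    """Return a copy of `torques` with the selected joints scaled down.

    `reduction=0.9` removes 90% of the commanded torque, so the selected
    joints keep only 10% of their original command.
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    scale = 1.0 - reduction
    selected = set(joint_indices)
    return [t * scale if i in selected else t for i, t in enumerate(torques)]


# Hypothetical torque commands (N·m) for three joints of one leg.
commanded = [10.0, -8.0, 12.0]

# Degrade only the third joint (index 2); the others pass through unchanged.
degraded = degrade_torques(commanded, joint_indices=[2])

print(degraded)
```

The policy itself is left untouched in this setup: only the actuation path changes, which is consistent with the note that no custom policy training was used.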
📚 Citation
If you find this work useful, please consider citing our paper:
@misc{mousa2025tar,
title={TAR: Teacher-Aligned Representations via Contrastive Learning for Quadrupedal Locomotion},
author={Amr Mousa and Neil Karavis and Michele Caprio and Wei Pan and Richard Allmendinger},
year={2025},
eprint={2503.20839},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2503.20839},
}
🙏🏻 Acknowledgments and Community
Special thanks to Bruno Adorno, Amy Johnson, Darren Cunningham, and Lesley Pater from the University of Manchester for providing hardware for testing and for their unwavering support.
This work builds upon IsaacLab, RSL-RL and the broader research community. The original licenses apply; new contributions are under CC BY-NC-SA 4.0.
For technical questions and implementation support:
- GitHub Issues: Report bugs and request features
- Discussions: Ask questions and share experiences
- Email: Direct contact for collaboration opportunities