This repository contains the code for the paper DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems, published at ICLR 2023 with perfect review scores (8, 8, 8, 10) and a notable-top-25% rating. See here for videos.
The work was performed by Pierre Schumacher, Daniel F.B. Haeufle, Dieter Büchler, Syn Schmitt and Georg Martius.
If you just want to see the code for DEP, take a look at deprl/dep_controller.py, deprl/custom_agents.py and deprl/env_wrapper/wrappers.py.
We now provide code for our newest preprint, Natural and Robust Walking using Reinforcement Learning without Demonstrations in High-Dimensional Musculoskeletal Models. With this work, we take a step towards natural movement generation with RL.
This update provides code for adaptive energy costs in muscle-driven systems and adds support for the SCONE and Hyfydy simulators in the form of the recently released sconegym environment suite.
The new features also include pre-trained baselines from the preprint, enabling rendering from SCONE and much more. See the docs for more information.
We encourage anyone encountering bugs or wanting help to open a GitHub issue or contact us directly. Let us help you make it work.
Muscle-actuated organisms are capable of learning an unparalleled diversity of dexterous movements despite their vast number of muscles. Reinforcement learning (RL) on large musculoskeletal models, however, has not been able to show similar performance. We conjecture that ineffective exploration in large overactuated action spaces is a key problem. This is supported by our finding that common exploration noise strategies are inadequate in synthetic examples of overactuated systems. We identify differential extrinsic plasticity (DEP), a method from the domain of self-organization, as being able to induce state-space covering exploration within seconds of interaction. By integrating DEP into RL, we achieve fast learning of reaching and locomotion in musculoskeletal systems, outperforming current approaches in all considered tasks in sample efficiency and robustness.
We provide a python package for easy installation:
```
pip install deprl
```
More instructions on installing from source, as well as further details, can be found in the documentation.
The ICLR publication includes experiments with human arms:
```
pip install git+https://github.com/P-Schumacher/warmup.git
```
and a bipedal ostrich. The OstrichRL environment can be installed from here.
We also collaborated with groups that develop musculoskeletal control environments, and we provide additional baselines as well as code from our latest preprints.
We include several pre-trained baselines and configuration files to train the policies from our newest preprint. They allow you to train agents in Hyfydy with RL for natural walking and robust running tasks. We worked together with Thomas Geijtenbeek @tgeijten to create a python environment interface for Hyfydy, called sconegym!
This repository also includes the definitions of all the cost terms we used, see here.
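As a rough illustration of what such a term can look like, the sketch below shows an adaptive effort penalty whose weight is ramped up over training. This is not the exact cost used in the repository; the `adaptive_energy_cost` function, its arguments, and its weighting schedule are assumptions made for this example.

```python
import numpy as np


def adaptive_energy_cost(muscle_activations, training_progress, max_weight=0.1):
    """Illustrative adaptive effort penalty (not the repository's exact term).

    The penalty weight grows with training progress (0 -> 1), so early
    exploration is not suppressed while converged policies become efficient.
    """
    weight = max_weight * float(np.clip(training_progress, 0.0, 1.0))
    return weight * float(np.mean(np.square(muscle_activations)))
```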
The configuration files to train our sconegym policies are included here.
Check out how to install sconegym from their repo; you can immediately start with a simple OpenSim model. To access the fast Hyfydy engine and the complex 3D models, you need to request a trial license from the Hyfydy website or purchase a license. Some usage examples can be found here and here.
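As a rough sketch of how a sconegym environment is used (assuming the classic gym API; the environment id below is only a placeholder, check the sconegym repository for the ids that are actually registered):

```python
import gym
import sconegym  # registers the SCONE/Hyfydy environments with gym

# NOTE: placeholder id; see the sconegym repo for available environments.
env = gym.make("sconewalk_h0918-v1")

obs = env.reset()
for _ in range(200):
    obs, reward, done, info = env.step(env.action_space.sample())
    if done:
        obs = env.reset()
```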
If you are coming here for the MyoLeg, take a look at this tutorial. It will show you how to run the pre-trained baseline. We also explain how to train the walking agent in the MyoSuite documentation.
This repository has been updated with training files that have been used for the MyoSuite baselines, as well as pretrained networks.
Simply try training something:
`python -m deprl.main experiments/myosuite_training_files/myoChaseTag.json`
or render the pretrained baselines with:
`python experiments/myosuite_training_files/render_baselines.py`
You have to find your own reward function, of course. These files also require the installation of `myosuite==2.1.5`.
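If you prefer to run the pre-trained MyoLeg policy directly from Python, a minimal sketch could look like the following. It assumes the `deprl.load_baseline` helper described in the tutorial and the `myoLegWalk-v0` environment id from MyoSuite; check the docs if your installed versions differ.

```python
import gym
import myosuite  # registers the Myo environments with gym
import deprl

# NOTE: env id and helper follow the MyoSuite/deprl tutorials (assumption).
env = gym.make("myoLegWalk-v0")
policy = deprl.load_baseline(env)

obs = env.reset()
for _ in range(1000):
    action = policy(obs)
    obs, reward, done, info = env.step(action)
    if done:
        obs = env.reset()
```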
We will release a tutorial on how to implement all cost terms from our natural walking [preprint](https://sites.google.com/view/naturalwalkingrl) in the near future.
## Experiments
The major experiments (humanreacher reaching and ostrich running) can be repeated with the config files.
Simply run from the root folder:
```
python -m deprl.main experiments/ostrich_running_dep.json
python -m deprl.main experiments/humanreacher.json
```
Plot the training curves and visualize trained policies with:
```
python -m tonic.plot --path output/folder/
python -m deprl.play --path output/folder/
```
Several ready-made play scripts are also provided:
```
bash play_files/play_dep_humanreacher.sh
bash play_files/play_dep_ostrich.sh
bash play_files/play_dep_dmcontrol_quadruped.sh
```
The warmup environment used in the arm experiments can be used like any gym environment:
```python
import gym
import warmup

env = gym.make("humanreacher-v0")

for ep in range(5):
    ep_steps = 0
    state = env.reset()
    while True:
        next_state, reward, done, info = env.step(env.action_space.sample())
        env.render()
        if done or (ep_steps >= env.max_episode_steps):
            break
        ep_steps += 1
```
Alternatively, you can install from source:
```
git clone https://github.com/martius-lab/depRL.git
cd depRL
pip install -r requirements.txt
pip install -e ./
```
Or, with poetry:
```
git clone https://github.com/martius-lab/depRL
cd depRL
poetry install
poetry shell
```
The setup has been tested with:
- Ubuntu 20.04 and Ubuntu 22.04
- CUDA 12.0
- poetry 1.4.0

If poetry gets stuck while resolving dependencies, disabling the system keyring can help:
```
export PYTHON_KEYRING_BACKEND=keyring.backends.null.Keyring
```
If a CUDA toolkit is missing on your system, it can be installed with conda:
```
conda install -c conda-forge cudatoolkit-dev
```
If you find this repository useful, please cite our works:
```bibtex
@inproceedings{schumacher2023:deprl,
  title = {DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems},
  author = {Schumacher, Pierre and Haeufle, Daniel F.B. and B{\"u}chler, Dieter and Schmitt, Syn and Martius, Georg},
  booktitle = {Proceedings of the Eleventh International Conference on Learning Representations (ICLR)},
  month = may,
  year = {2023},
  url = {https://openreview.net/forum?id=C-xa_D3oTj6}
}
```
```bibtex
@misc{schumacher2023natural,
  title = {Natural and Robust Walking using Reinforcement Learning without Demonstrations in High-Dimensional Musculoskeletal Models},
  author = {Pierre Schumacher and Thomas Geijtenbeek and Vittorio Caggiano and Vikash Kumar and Syn Schmitt and Georg Martius and Daniel F. B. Haeufle},
  year = {2023},
  eprint = {2309.02976},
  archivePrefix = {arXiv},
  primaryClass = {cs.RO}
}
```