# NIPS-2017-Learning-to-Run
|
|
Reinforcement learning environments with musculoskeletal models |
|
|
https://www.crowdai.org/challenges/nips-2017-learning-to-run |
|
|
CANDIDATE ALGORITHMS: |
|
|
Deep Deterministic Policy Gradient (DDPG): https://arxiv.org/abs/1509.02971
|
|
Recurrent Deterministic Policy Gradient (RDPG): https://arxiv.org/pdf/1512.04455.pdf
|
|
Trust Region Policy Optimization (TRPO): https://arxiv.org/abs/1502.05477
|
|
MOTIVATION: |
|
|
In this environment we manipulate the muscles rather than the velocities of the body parts; i.e., if I understand the problem correctly, we are controlling accelerations, so the process should be treated as a second-order Markov chain. An LSTM or another recurrent network might be able to capture the resulting long- and short-term dependencies.
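
As a rough illustration of the second-order point (a hypothetical sketch, not code from this repo): stacking two consecutive observations restores the Markov property that a single observation lacks, which is the non-recurrent alternative to letting an LSTM carry that memory.

```python
import numpy as np

# Hypothetical sketch: if the dynamics are second order (actions set
# accelerations), a single observation s_t is not Markov, but the pair
# (s_{t-1}, s_t) is, since consecutive observations implicitly encode
# velocity. A recurrent policy (RDPG) learns this memory internally
# instead of relying on hand-stacked frames.

def stack_observations(prev_obs, obs):
    """Concatenate two consecutive observations into one Markov state."""
    return np.concatenate([prev_obs, obs])
```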
|
|
FOLDERS AND FILES:
|
|
DDPG: A standard TensorFlow implementation of DDPG, following the pseudocode in the paper (see the sketch after this list).
|
|
RDPG: A recurrent TensorFlow version of DDPG (there are complications that have not yet been resolved).
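
As a reference point, here is a minimal NumPy sketch of the two steps of the paper's pseudocode that the DDPG implementation follows: the TD target y_i = r_i + γQ'(s_{i+1}, μ'(s_{i+1})) and the soft target update θ' ← τθ + (1 − τ)θ'. The `actor`/`critic` callables, constants, and dimensions below are illustrative placeholders, not this repo's networks.

```python
import numpy as np

GAMMA = 0.99  # discount factor (assumed value)
TAU = 0.001   # soft-update rate, as in the DDPG paper

def td_targets(rewards, next_states, done, target_actor, target_critic):
    """y_i = r_i + gamma * Q'(s_{i+1}, mu'(s_{i+1})), zeroed at terminal steps."""
    next_actions = target_actor(next_states)
    next_q = target_critic(next_states, next_actions)
    return rewards + GAMMA * (1.0 - done) * next_q

def soft_update(target_params, online_params, tau=TAU):
    """theta' <- tau * theta + (1 - tau) * theta' for every weight tensor."""
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(online_params, target_params)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    obs_dim, act_dim, batch = 41, 18, 4  # rough Learning to Run sizes (assumed)
    W_a = rng.normal(size=(obs_dim, act_dim))    # toy linear actor weights
    W_q = rng.normal(size=(obs_dim + act_dim,))  # toy linear critic weights
    actor = lambda s: np.tanh(s @ W_a)
    critic = lambda s, a: np.concatenate([s, a], axis=1) @ W_q
    r = rng.normal(size=batch)
    s2 = rng.normal(size=(batch, obs_dim))
    d = np.zeros(batch)
    print(td_targets(r, s2, d, actor, critic))
```

In the TensorFlow code the same soft update would be expressed as assign ops over the target network's variables rather than returning new arrays.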
|
|
DEPENDENCIES: |
|
|
TensorFlow
|
|
numpy |
|
|
math |